[GRASS-dev] python: get results of r.stats call into list of lists of numericals instead of list of strings

Hi,

I need some help with Python scripting. I'm running r.stats on two raster maps like this:

         rstats_results = gscript.read_command('r.stats',
                              input_=[a, b],
                              flags='n1',
                              separator='comma',
                              quiet=True)
         results = rstats_results.splitlines()

This gives me

['123,456', '456,789', '987,321']

Does anyone know an efficient way to transform this into a list of lists of numbers, such as:

[[123, 456], [456, 789], [987, 321]]

I know I can do this as follows:

mylist = []
for line in results:
  mylist.append([int(line.split(',')[0]), int(line.split(',')[1])])

But for very large raster maps this does not seem very efficient.

Is there a way of getting the output directly as a list of lists? Or a more efficient transformation?

Moritz

For a start, there's no reason to call the .split() method twice:

  mylist = []
  for line in results:
      # split once and unpack both fields
      a, b = line.split(',')
      mylist.append([int(a), int(b)])

A list comprehension might be more efficient, e.g.:

  mylist = [[int(x)
             for x in line.split(',')]
            for line in results]
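
Whether the comprehension actually wins is easy to measure rather than guess. The following is only a rough sketch using the standard timeit module; the sample list is synthetic (the three example lines repeated), and the function names with_loop and with_comprehension are made up for illustration. Timings will vary with the Python version and the data.

```python
import timeit

# Synthetic stand-in for the splitlines() output of r.stats
results = ['123,456', '456,789', '987,321'] * 10000

def with_loop(lines):
    # explicit loop, splitting each line once
    out = []
    for line in lines:
        a, b = line.split(',')
        out.append([int(a), int(b)])
    return out

def with_comprehension(lines):
    # nested list comprehension, as in the example above
    return [[int(x) for x in line.split(',')] for line in lines]

loop_time = timeit.timeit(lambda: with_loop(results), number=10)
comp_time = timeit.timeit(lambda: with_comprehension(results), number=10)
print("loop: %.3f s, comprehension: %.3f s" % (loop_time, comp_time))
```

Both functions return the same list of lists, so the only thing to compare is the time.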

Also, rather than reading the output as a string, I'd use
pipe_command() and iterate over the file:

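  rstats = gscript.pipe_command('r.stats',
                                input_=[a, b],
                                flags='n1',
                                separator='comma',
                                quiet=True)
  # parse each line as it is read instead of buffering the whole output
  mylist = [[int(x)
             for x in line.strip().split(',')]
            for line in rstats.stdout]
  rstats.wait()  # let the r.stats process finish once its output is consumed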
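
The same pattern can be tried outside a GRASS session. The sketch below assumes nothing about GRASS: it uses plain subprocess.Popen, and a small child Python process (a hypothetical stand-in for r.stats) prints the three example lines. The point is only that the parent iterates over the pipe line by line and never holds the full output as one string.

```python
import subprocess
import sys

# Hypothetical stand-in for r.stats: a child process printing comma-separated pairs
child_code = "print('123,456'); print('456,789'); print('987,321')"

proc = subprocess.Popen([sys.executable, '-c', child_code],
                        stdout=subprocess.PIPE,
                        universal_newlines=True)  # decode stdout to str

# Iterate over the pipe; each line is parsed as soon as it is read
pairs = [[int(x) for x in line.strip().split(',')]
         for line in proc.stdout]

proc.stdout.close()
returncode = proc.wait()  # reap the child and collect its exit status
print(pairs)
```

If the pipe yields bytes rather than str (which depends on how the process was opened), the lines need decoding before split(',') will work.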

I don't think that you'll do much better than that in Python. If it's
still too slow, try using numpy.loadtxt() (pre-processing the data
with sed if that function can't read it directly).
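
As a minimal sketch of the numpy route, assuming the output is plain comma-separated integers with no NULL markers or category labels: the text below is the three-line example rather than real r.stats output, and io.StringIO stands in for the file object that loadtxt would read from.

```python
import io

import numpy as np

# Example text in the same shape as the r.stats output shown earlier
text = "123,456\n456,789\n987,321\n"

# ndmin=2 keeps the result two-dimensional even if there is only one line
arr = np.loadtxt(io.StringIO(text), delimiter=',', dtype=int, ndmin=2)

# Convert to a plain list of lists only if the array itself is not enough
mylist = arr.tolist()
print(mylist)
```

For large maps it is probably better to keep working with the array and skip the tolist() step, since the conversion back to Python lists gives up most of the benefit.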

--
Glynn Clements <glynn@gclements.plus.com>