Hamish wrote:
for an example of grass.start_command() for parallelizing a bunch
of r.cost runs, see v.surf.icw(.py) in grass7 addons:
https://trac.osgeo.org/grass/browser/grass-addons/grass7/vector/v.surf.icw/v.surf.icw.py
Johannes:
thank you for that example. I think it explains very well how to
assign multiple r.cost runs to separate processes with
grass.start_command. I am just wondering how it is done when there are
multiple consecutive processes in the for loop. In your example
(v.surf.icw.py) a separate for loop is started for each step (e.g.
r.cost (line 271), r.mapcalc (298)). Is there a way to combine the
steps in a function (e.g. a combination of r.cost and r.mapcalc) and
launch that function in a way like grass.start_command in a single loop?
If possible that would probably save code lines and might be a little
clearer (at least to me). I am just asking because one of my scripts,
which is still in "serial mode", involves lots of steps inside the for
loop. Done in parallel this way it would need at least a dozen for
loops, which might be very hard to follow.
Hamish:
ok, in s.surf.icw(.sh) for GRASS 5 and v.surf.icw(.sh) for GRASS 6 I had
it as one big loop, but for the GRASS 7 python version I made it into
a series of small loops to (a) use the simpler grass_start() single command
method, and (b) get rid of the temp maps ASAP since that module makes a
lot of them and it adds a lot of disk I/O lag if they get flushed to
the hard drive before they are removed. In the icw case most of the time
was taken by r.cost compared to the renaming and preprocessing bits of
the (former) big loop.
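
A rough sketch of that "series of small loops" pattern, not the actual v.surf.icw code: each step launches all of its jobs without waiting, then waits for the whole batch before the next step starts. The helper names (fake_start, run_step, run_steps), the map names, and the coordinates are made up for illustration. fake_start is only a stand-in so the sketch runs outside a GRASS session.

```python
import subprocess
import sys


def fake_start(module, **options):
    """Stand-in for grass.start_command(): return a Popen without waiting."""
    # Assumption: outside GRASS we just launch a trivial child process.
    return subprocess.Popen([sys.executable, "-c", "pass"])


def run_step(start, module, option_sets):
    """Launch one module once per option set, then wait for the whole batch."""
    procs = [start(module, **opts) for opts in option_sets]
    return [p.wait() for p in procs]  # exit codes, in launch order


def run_steps(start, steps):
    """One small loop per step; each step finishes before the next begins."""
    return [run_step(start, module, option_sets) for module, option_sets in steps]


# Hypothetical map names and coordinates, for illustration only.
cost_jobs = [
    {"input": "friction", "output": "cost_%d" % i, "start_coordinates": (i, i)}
    for i in range(3)
]
calc_jobs = [
    {"expression": "scaled_%d = cost_%d * 2" % (i, i)} for i in range(3)
]
results = run_steps(fake_start, [("r.cost", cost_jobs), ("r.mapcalc", calc_jobs)])
```

Inside a GRASS session you would presumably pass grass.start_command (from grass.script) as the start argument instead of fake_start, since it also returns a Popen-style object that you call wait() on.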
for parallelizing an entire function in Python as you want, have a look
at grass7's i.landsat.rgb(.py), which uses mp.Process from Python's
multiprocessing module. It's a bit more work since you have to manually
ensure that the I/O pipes get closed.
https://trac.osgeo.org/grass/browser/grass/trunk/scripts/i.landsat.rgb/i.landsat.rgb.py
note the above script preserves the serial execution method intact (to
make the imagery method easier to learn), so it has about double the code
it actually needs. But I think using the extra wrapper function makes the
real guts of the imagery algorithm easier to read, understand, and maintain,
so keeping all the ugly parallelization stuff away from it is a good thing.
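
A minimal sketch of the wrapper idea, not a copy of i.landsat.rgb: the serial worker holds all the consecutive steps for one input, and a thin wrapper sends its result back through a pipe. The function names and the placeholder arithmetic are made up for illustration. It assumes Python 3.4+ and the "fork" start method (Unix only), so that no __main__ guard is needed.

```python
import multiprocessing as mp


def process_one(name):
    """Serial worker: every consecutive step for one input lives here."""
    # Placeholder arithmetic standing in for e.g. r.cost followed by r.mapcalc.
    cost = len(name) * 10
    return cost + 1


def process_one_mp(name, conn):
    """Parallel wrapper: run the serial worker and send its result back."""
    conn.send(process_one(name))
    conn.close()  # close the child's end once the result is sent


def run_parallel(names):
    """Run process_one() for every name, each in its own process."""
    # Assumption: "fork" start method, available on Unix only.
    ctx = mp.get_context("fork")
    jobs = []
    for name in names:
        # duplex=False gives (receive-only end, send-only end).
        parent_end, child_end = ctx.Pipe(duplex=False)
        proc = ctx.Process(target=process_one_mp, args=(name, child_end))
        proc.start()
        child_end.close()  # the parent does not need the sending end
        jobs.append((name, proc, parent_end))

    results = {}
    for name, proc, parent_end in jobs:
        results[name] = parent_end.recv()  # read the result, then reap the child
        proc.join()
        parent_end.close()
    return results


results = run_parallel(["site_a", "site_bb"])
```

The serial function stays usable on its own, and only the small wrapper and the launch loop know about processes and pipes.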
Hamish