[GRASS-user] multiprocessing in python

Dear all,

I am working on a module to extract phenological parameters (like TIMESAT) from a time series. It is implemented in Python/Cython and makes use of gscript and other GRASS functionality.

It works well on a 256x256 region, but since the plan is to apply it over Australia at 250 m resolution for 17 years of data, I need to split the processing into small tiles. The idea is to run these tiles in parallel, and I am having trouble implementing that.

This would be the first part of the process that runs on each tile:

import numpy as np
import grass.temporal as tgis

def tile_process(tile_index):
    '''
    Function for every worker:
    Applies any function to the sub_region corresponding to the tile_index.
    '''
    global Rows
    global Cols
    global RowBlockSize
    global ColBlockSize
    global full_region
    global dates
    global years
    global indices
    global data_serie
    global yr_limits_extra
    global yr_limits
    global dbif

    sub_name = 'block'
    TileRow, TileCol, sr = sub_region(tile_index, full_region, RowBlockSize, ColBlockSize)

    # Define a temporary region based on the parameters calculated with the
    start_row = TileRow * RowBlockSize
    start_col = TileCol * ColBlockSize
    n_rows = sr['rows']
    n_cols = sr['cols']

    strds = tgis.SpaceTimeRasterDataset(data_serie)
    strds.select(dbif=dbif)
    maps = strds.get_registered_maps_as_objects(dbif=dbif)

    # Number of time steps
    steps = len(maps)

    # Make an empty array
    #print(steps)
    EVI = np.empty([steps, n_rows, n_cols])

    # Fill the array
    for step, map in enumerate(maps):
        map.select(dbif=dbif)
        image_name = map.get_name() + '@' + data_serie.split('@')[1]
        #print("reading: {}".format(image_name))
        EVI[step, :] = raster2numpy_sub(image_name, start_row, n_rows, start_col, n_cols)
    mean = EVI.mean()
    print(mean)
    # ....
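
The tile arithmetic used above (start_row = TileRow * RowBlockSize and so on) can be sketched in isolation. This is only an illustration: sub_region() is not shown in the post, so the row-major tile numbering and the helper name tile_window are assumptions, not the module's actual code.

```python
def tile_window(tile_index, rows, cols, row_block, col_block):
    """Return (start_row, start_col, n_rows, n_cols) for one tile.

    Hypothetical helper: assumes tiles are numbered row by row
    (row-major) and that edge tiles are clipped to the region.
    """
    tiles_per_row = -(-cols // col_block)  # ceiling division
    tile_row, tile_col = divmod(tile_index, tiles_per_row)
    start_row = tile_row * row_block
    start_col = tile_col * col_block
    # Tiles on the lower/right edge may be smaller than a full block
    n_rows = min(row_block, rows - start_row)
    n_cols = min(col_block, cols - start_col)
    return start_row, start_col, n_rows, n_cols
```

For a 600x500 region cut into 256x256 blocks this gives six tiles, with the last one covering only the 88 rows and 244 columns left at the lower right corner.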



And this is how I start the multiprocessing pool:

pool.map(tile_process, xrange(RowBlockN*ColBlockN))
pool.close()
pool.join()

and it gives me:

AssertionError: can only test a child process

Of course, if I call tile_process(0), tile_process(1), etc. directly, the right result comes out.
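
For context on the message itself: in CPython's multiprocessing, this assertion is raised by Process.is_alive() when it is called from a process other than the one that started that Process object. Whether that is what happens in the module above (for instance through a helper process that was started in the parent and then inherited by the pool workers) is an assumption. The sketch below only reproduces the message in isolation; it uses Python 3 and the "fork" start method (POSIX only), and all names in it are hypothetical.

```python
import multiprocessing as mp
import time


def sleeper():
    # Stand-in for a helper process started by the parent
    time.sleep(0.5)


def probe(helper, queue):
    # Runs in a second child. 'helper' was started by the parent,
    # so this process is not allowed to query it.
    try:
        helper.is_alive()
        queue.put("ok")
    except AssertionError as exc:
        queue.put(str(exc))


def reproduce():
    ctx = mp.get_context("fork")  # assumption: POSIX system
    helper = ctx.Process(target=sleeper)
    helper.start()
    queue = ctx.Queue()
    # With fork, the worker inherits the parent's Process object as-is
    worker = ctx.Process(target=probe, args=(helper, queue))
    worker.start()
    message = queue.get(timeout=10)
    worker.join()
    helper.join()
    return message
```

Calling reproduce() returns the text of the assertion raised inside the second child, which is the same wording as in the traceback above.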

Does anyone have experience with this? Any suggestions would be welcome!
Sorry for the messy code; it is still at an early stage.
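
On the pool setup shown above: with forked workers, every object created in the parent before the pool starts (the globals, including something like dbif) is inherited by each worker. One common arrangement is to build per-worker state in a pool initializer instead. The following is a minimal sketch of that pattern only; it has not been tested with GRASS, the tile function is a stand-in, and init_worker, _worker_state and run_tiles are hypothetical names. Whether the temporal framework objects can be created per worker in this way is an assumption.

```python
import multiprocessing as mp
import os

# Hypothetical per-worker storage, filled in by init_worker()
_worker_state = {}


def init_worker(dataset_name):
    # Runs once in every worker process. Connections or other
    # resources would be opened here rather than in the parent.
    _worker_state["dataset"] = dataset_name
    _worker_state["pid"] = os.getpid()


def tile_process(tile_index):
    # Stand-in for the real per-tile work
    return tile_index, _worker_state["dataset"]


def run_tiles(n_tiles, dataset_name, processes=2):
    ctx = mp.get_context("fork")  # assumption: POSIX system
    pool = ctx.Pool(processes=processes,
                    initializer=init_worker,
                    initargs=(dataset_name,))
    try:
        results = pool.map(tile_process, range(n_tiles))
    finally:
        pool.close()
        pool.join()
    return results
```

pool.map() returns the results in the order of the tile indices, so the caller can match each result to its tile without extra bookkeeping.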

Dr. Leonardo A. Hardtke
C3 UTS, Scientific Officer
CB04.06.315.06

Email:leonardoandres.hardtke@uts.edu.au or leohardtke@gmail.com

On 06/02/18 12:09, Leonardo Hardtke wrote:

[...]

Just a wild guess: have you tried range (which returns a list) instead of xrange (which returns an xrange object)?

Moritz
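
A quick way to check the range/xrange guess: as far as the standard library goes, Pool.map() accepts any iterable and turns it into a list itself when the iterable has no length. The sketch below uses Python 3, where range is already lazy like the old xrange, and the "fork" start method (POSIX only); square and map_all_forms are hypothetical names.

```python
import multiprocessing as mp


def square(x):
    return x * x


def map_all_forms(n):
    ctx = mp.get_context("fork")  # assumption: POSIX system
    pool = ctx.Pool(processes=2)
    try:
        from_range = pool.map(square, range(n))             # lazy range object
        from_list = pool.map(square, list(range(n)))        # plain list
        from_gen = pool.map(square, (i for i in range(n)))  # generator
    finally:
        pool.close()
        pool.join()
    return from_range, from_list, from_gen
```

If all three calls return the same list, the type of the sequence passed to pool.map() is unlikely to be the cause by itself.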