[Please always keep the list in CC.]
On 06/02/18 22:57, Leonardo Hardtke wrote:
Hi, thanks Moritz.
I tried your suggestion but I get the same error. As a side note, if the process does not read any data in, it works as expected (i.e. with the for loop commented out).
Can you identify which specific call in the loop triggers the error?
Have you tried launching with
pool.map(tile_process, [1, 2]) ?
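A minimal sketch of how the failing call could be narrowed down, assuming Python 3 and the standard multiprocessing module. The tile_process body below is only a stand-in (it fails on purpose for tile 2), and traced_tile_process is a hypothetical helper name, not something from the original code. The wrapper returns the full traceback text from the worker, so the line that raises is visible in the parent:

```python
import multiprocessing
import traceback


def tile_process(tile_index):
    """Stand-in for the real worker; replace the body with the actual tile code."""
    if tile_index == 2:
        raise ValueError("simulated failure in tile {}".format(tile_index))
    return tile_index * 10


def traced_tile_process(tile_index):
    """Run tile_process and return (index, result, traceback text or None)."""
    try:
        return tile_index, tile_process(tile_index), None
    except Exception:
        # Send the formatted traceback back as a plain string so the
        # failing line inside the worker is visible in the parent.
        return tile_index, None, traceback.format_exc()


if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=2)
    try:
        # Two tiles only, as suggested above, to keep the output readable.
        results = pool.map(traced_tile_process, [1, 2])
    finally:
        pool.close()
        pool.join()
    for index, result, error in results:
        if error is None:
            print("tile {}: {}".format(index, result))
        else:
            print("tile {} failed:\n{}".format(index, error))
```

With the real tile_process in place, the printed traceback should point at the call inside the loop that raises.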
I have a similar approach working OK with plain gdal (https://gist.github.com/leohardtke/b54e79ed93546c0db840c7b5e951a6ce).
There must be something with the grass raster python module, but I can't figure it out.
Not sure if it is raster, or rather temporal dataset handling. I don't have time to look at this in detail now, so I'm putting grass-dev in CC so you might get some answers from people more knowledgeable in temporal data processing than me.
A bit more info (e.g. more details of the code, such as the definition of your pool, but also OS, versions, etc) might be helpful.
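On the error message itself: in CPython's multiprocessing, "can only test a child process" is raised when is_alive() is called on a Process object from a process that did not start it. The sketch below reproduces that message without any GRASS code, assuming Python 3 on a POSIX system (it uses the "fork" start method). Whether something similar happens here, for example a helper process started in the parent and then checked from inside a pool worker, is only an assumption and is not confirmed anywhere in this thread. The names probe and reproduce are made up for the illustration.

```python
import multiprocessing
import time


def probe(helper, conn):
    """Runs in a forked worker: test a Process object created by the parent."""
    try:
        helper.is_alive()
        conn.send("no error")
    except AssertionError as exc:
        conn.send(str(exc))
    finally:
        conn.close()


def reproduce():
    """Return the message raised when a worker tests its parent's helper."""
    # "fork" (POSIX only) lets the worker inherit the helper object as is.
    ctx = multiprocessing.get_context("fork")
    helper = ctx.Process(target=time.sleep, args=(0.5,))
    helper.start()
    # One-way pipe: the parent receives, the worker sends.
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    worker = ctx.Process(target=probe, args=(helper, child_conn))
    worker.start()
    message = parent_conn.recv()
    worker.join()
    helper.join()
    return message


if __name__ == "__main__":
    print(reproduce())  # prints: can only test a child process
```

If this is indeed the mechanism, calling the same code directly in the parent (as with tile_process(0)) would not trigger the assertion, which would match the behaviour reported.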
Moritz
Cheers
On 7 February 2018 at 00:47, Moritz Lennert <mlennert@club.worldonline.be> wrote:
On 06/02/18 12:09, Leonardo Hardtke wrote:
Dear all,
I am working on a module to extract phenological parameters
(like timesat) from a time series, implemented in python/cython
and making use of gscript and other grass stuff.
It works great on a 256x256 area and, as the plan is to apply it
over Australia at 250m over 17 years, I need to split the process
into small tiles. The idea is to run these processes in parallel
and I am having issues implementing it.
This would be the first part of the process that runs on each tile:
def tile_process(tile_index):
    '''
    Function for every worker:
    Applies any function to the sub_region corresponding to
    the tile_index.
    '''
    global Rows
    global Cols
    global RowBlockSize
    global ColBlockSize
    global full_region
    global dates
    global years
    global indices
    global data_serie
    global yr_limits_extra
    global yr_limits
    global dbif

    sub_name = 'block'
    TileRow, TileCol, sr = sub_region(tile_index, full_region,
                                      RowBlockSize, ColBlockSize)
    # Define a temporary region based on the parameters
    # calculated with the
    start_row = TileRow * RowBlockSize
    start_col = TileCol * ColBlockSize
    n_rows = sr['rows']
    n_cols = sr['cols']

    strds = tgis.SpaceTimeRasterDataset(data_serie)
    strds.select(dbif=dbif)
    maps = strds.get_registered_maps_as_objects(dbif=dbif)

    # Number of time steps
    steps = len(maps)
    # Make an empty array
    #print(steps)
    EVI = np.empty([steps, n_rows, n_cols])
    # fill the array
    for step, map in enumerate(maps):
        map.select(dbif=dbif)
        image_name = map.get_name() + '@' + data_serie.split('@')[1]
        #print("reading: {}".format(image_name))
        EVI[step, :] = raster2numpy_sub(image_name, start_row, n_rows,
                                        start_col, n_cols)
    mean = EVI.mean()
    print(mean)
...and this is how I start the multiprocessing pool:
pool.map(tile_process, xrange(RowBlockN*ColBlockN))
pool.close()
pool.join()
and it gives me:
AssertionError: can only test a child process
Of course, if I call tile_process(0), tile_process(1), etc.
directly, the right result comes out.
Does any of you have experience with this? Any suggestion would
be welcome!
Sorry for the messy code. It is still at an early stage.

Just a wild guess: have you tried with range (which returns a list)
instead of xrange (which returns an xrange object)?

Moritz
--
Dr. Leonardo A. Hardtke
C3 UTS, Scientific Officer
CB04.06.315.06
Email: leonardoandres.hardtke@uts.edu.au or leohardtke@gmail.com