[GRASS-dev] [GRASS-user] multiprocessing in python

[Please always keep the list in CC.]

On 06/02/18 22:57, Leonardo Hardtke wrote:

Hi, thanks Moritz.
I tried with your suggestion but I get the same error out...

As a side note, if the process does not read any data (i.e. with the for loop commented out), it works as expected.
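
For reference, the message "can only test a child process" comes from an assertion in Python's multiprocessing.Process.is_alive(), which fails when a Process object is queried from a process other than the one that started it. One possible explanation for the error quoted further down (an assumption, not confirmed in this thread) is that something created in the parent before the pool was forked, for instance the shared dbif, holds such a Process object and is then used inside the workers. The sketch below reproduces the same message with the standard library only; it assumes a POSIX system where the "fork" start method is available, and idle, check_helper and reproduce are illustrative names, not GRASS functions:

```python
import multiprocessing as mp
import time

# Assumption: POSIX system, "fork" start method available.
ctx = mp.get_context("fork")

# Background process started by the parent; pool workers inherit this
# Process object through fork.
helper = None


def idle():
    """Target of the background process: just stay alive for a moment."""
    time.sleep(2)


def check_helper(_):
    """Runs in a pool worker and queries the Process object it inherited."""
    try:
        return "alive: {}".format(helper.is_alive())
    except AssertionError as exc:
        return "AssertionError: {}".format(exc)


def reproduce():
    """Return (what the parent sees, what each of two workers sees)."""
    global helper
    helper = ctx.Process(target=idle)
    helper.daemon = True
    helper.start()
    try:
        parent_view = helper.is_alive()  # allowed: the parent started it
        pool = ctx.Pool(2)               # workers are forked here
        try:
            worker_view = pool.map(check_helper, range(2))
        finally:
            pool.close()
            pool.join()
    finally:
        helper.terminate()
        helper.join()
    return parent_view, worker_view
```

Calling reproduce() shows the parent querying its helper without trouble while the workers hit the assertion. That would fit the observation that the tiles only fail once the loop touches the data, but it does not prove that the same mechanism is at work in the GRASS case.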

Can you identify which specific call in the loop fails?

Have you tried launching with

pool.map(tile_process, [1, 2]) ?
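
To narrow down which call fails, one option is to wrap the worker so that it hands back the full traceback text instead of letting pool.map re-raise the exception in the parent. A minimal sketch, assuming a POSIX system with the "fork" start method; tile_process here is a hypothetical stand-in that fails on purpose for one tile, and traced and run_traced are illustrative names:

```python
import multiprocessing as mp
import traceback


def tile_process(tile_index):
    """Hypothetical stand-in for the real worker: fails for tile 1 only."""
    if tile_index == 1:
        raise ValueError("bad tile {}".format(tile_index))
    return tile_index * 10


def traced(tile_index):
    """Call tile_process and return (index, result, traceback text or None)."""
    try:
        return tile_index, tile_process(tile_index), None
    except Exception:
        # format_exc() captures the worker-side stack, including the
        # line inside tile_process that raised.
        return tile_index, None, traceback.format_exc()


def run_traced(indices, n_workers=2):
    """Map traced() over the tile indices with a small pool."""
    ctx = mp.get_context("fork")  # assumption: fork is available
    pool = ctx.Pool(n_workers)
    try:
        return pool.map(traced, list(indices))
    finally:
        pool.close()
        pool.join()
```

Printing the third element of each failed tuple from run_traced([1, 2]) shows the exact line in the worker that raised, which is the information asked for above.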

I have a similar approach working OK with plain gdal (https://gist.github.com/leohardtke/b54e79ed93546c0db840c7b5e951a6ce).

There must be something in the GRASS raster Python module, but I can't figure it out.

Not sure if it is raster, or rather temporal dataset handling. I don't have time to look at this in detail now, so I'm putting grass-dev in CC so you might get some answers from people more knowledgeable in temporal data processing than me.
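
If the temporal database handling is indeed involved, one thing worth trying is to create the per-process state inside each worker through the pool's initializer argument, rather than sharing a global dbif created in the parent. The sketch below is untested with GRASS: StandInConnection is a hypothetical placeholder for whatever the real per-process setup would be (presumably a temporal framework initialisation plus a fresh database interface, which is an assumption), and init_worker and run_tiles are illustrative names. It assumes a POSIX system with the "fork" start method:

```python
import multiprocessing as mp
import os

# Per-process connection, filled in by init_worker() inside each worker.
_worker_conn = None


class StandInConnection(object):
    """Hypothetical placeholder for a per-process database interface."""

    def __init__(self):
        # Remember which process created this object.
        self.owner_pid = os.getpid()


def init_worker():
    """Pool initializer: runs once in every worker process after the fork."""
    global _worker_conn
    _worker_conn = StandInConnection()


def tile_process(tile_index):
    """Report whether the connection in use was created by this process."""
    return tile_index, _worker_conn.owner_pid == os.getpid()


def run_tiles(n_tiles, n_workers=2):
    """Process all tiles with workers that each own their connection."""
    ctx = mp.get_context("fork")  # assumption: fork is available
    pool = ctx.Pool(n_workers, initializer=init_worker)
    try:
        return pool.map(tile_process, range(n_tiles))
    finally:
        pool.close()
        pool.join()
```

With this layout every tile reports True, meaning that no worker uses an object set up by the parent. Whether the GRASS temporal interface can be initialised inside a pool worker in this way would still need to be checked.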

A bit more info (e.g. more details of the code, such as the definition of your pool, but also OS, versions, etc) might be helpful.

Moritz

Cheers

On 7 February 2018 at 00:47, Moritz Lennert <mlennert@club.worldonline.be> wrote:

    On 06/02/18 12:09, Leonardo Hardtke wrote:

        Dear all,
        I am working on a module to extract the phenological parameters
        (like timesat) from a time series implemented in python/cython
        and making use of gscript and other grass stuff.
        It works great on a 256x256 region, and as the plan is to apply
        it over Australia at 250 m over 17 years, I need to split the
        processing into small tiles. The idea is to run these processes
        in parallel, and I am having issues implementing it.

        This would be the first part of the process that runs on each tile:

        def tile_process(tile_index):
          '''
          Function for every worker:
          Applies any function to the sub_region corresponding to
          the tile_index.
          '''
          global Rows
          global Cols
          global RowBlockSize
          global ColBlockSize
          global full_region
          global dates
          global years
          global indices
          global data_serie
          global yr_limits_extra
          global yr_limits
          global dbif

          sub_name='block'
          TileRow, TileCol, sr = sub_region(tile_index, full_region,
                                            RowBlockSize, ColBlockSize)
          # Define a temporary region based on the parameters calculated with the
          start_row = TileRow * RowBlockSize
          start_col = TileCol * ColBlockSize
          n_rows = sr['rows']
          n_cols = sr['cols']

          strds = tgis.SpaceTimeRasterDataset(data_serie)
          strds.select(dbif=dbif)
          maps = strds.get_registered_maps_as_objects(dbif=dbif)

          # Number of time steps
          steps = len(maps)
          # Make an empty array
          #print(steps)
          EVI = np.empty([steps,n_rows,n_cols])
          # fill the array
          for step, map in enumerate(maps):
            map.select(dbif=dbif)
            image_name = map.get_name() + '@' + data_serie.split('@')[1]
            #print("reading: {}".format(image_name))
            EVI[step,:] = raster2numpy_sub(image_name, start_row, n_rows,
                                           start_col, n_cols)
          mean = EVI.mean()
          print(mean)
          ....

        and this is how I start the multiprocess pool.

          pool.map(tile_process, xrange(RowBlockN*ColBlockN))
          pool.close()
          pool.join()

        and it gives me:

        AssertionError: can only test a child process

        Of course, if I call tile_process(0), tile_process(1), etc.
        directly, the right result comes out.

        Do any of you have experience with this? Any suggestion would
        be welcome!
        Sorry for the messy code. It is still at an early stage.

    Just a wild guess: have you tried with range (which returns a list)
    instead of xrange (which returns an xrange object) ?
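
For what it is worth, Pool.map takes any iterable and converts those without a length into a list before splitting the work, so the type of the argument is unlikely to matter by itself. A small sketch under Python 3, where xrange no longer exists and range is already lazy; it assumes a POSIX system with the "fork" start method, and square and map_over are illustrative names:

```python
import multiprocessing as mp


def square(i):
    """Trivial worker used only to compare the kinds of input."""
    return i * i


def map_over(iterable, n_workers=2):
    """Run square() over any iterable with a small pool."""
    ctx = mp.get_context("fork")  # assumption: fork is available
    pool = ctx.Pool(n_workers)
    try:
        return pool.map(square, iterable)
    finally:
        pool.close()
        pool.join()
```

Passing a list, a range object or a generator to map_over gives the same ordered result. This says nothing definite about Python 2.7 together with GRASS, but it suggests looking elsewhere first.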

    Moritz

--
Dr. Leonardo A. Hardtke
C3 UTS, Scientific Officer
CB04.06.315.06
Email: leonardoandres.hardtke@uts.edu.au or leohardtke@gmail.com