[GRASS-user] Parallel process with pygrass and mapcalc

Hi, I'm writing Python code that performs several r.mapcalc operations on various rasters. I'm using GRASS 7.0.1 and Python 2.7. My files are stored in /home/myuser/rasters/ and are linked to GRASS with r.external. The outputs of the operations are written outside GRASS to /home/myuser/tmp/ with r.external.out.

I wrote this code starting from the example provided in (https://grass.osgeo.org/grass70/manuals/libpython/pygrass.modules.interface.html?highlight=parallelmodulequeue#pygrass.modules.interface.module.ParallelModuleQueue), but there is no difference in run time between setting nprocs=1 and nprocs=8. How is this possible? How can I achieve an improvement?

The code needs to read data from a file containing a datetime and two coefficients, k1 and k2. It then needs to multiply and sum the input rasters as out = input1 * k1 + input2 * k2.
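As a minimal sketch of that formula, the r.mapcalc expression string for one row could be built as follows. The helper name `build_expression`, the raster names, and the coefficient values are hypothetical and serve only as illustration; nothing here calls GRASS.

```python
def build_expression(out, input1, k1, input2, k2):
    """Return an r.mapcalc expression for out = input1 * k1 + input2 * k2."""
    return "{out} = {i1} * {k1} + {i2} * {k2}".format(
        out=out, i1=input1, k1=k1, i2=input2, k2=k2)


# Hypothetical raster names and coefficients, not taken from the real input file.
expr = build_expression("out_map", "rast_a", 2, "rast_b", 3)
print(expr)
```

The resulting string is what would be passed as the `expression` parameter of r.mapcalc.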

The code that I have written is:

import pandas as pd
import os

import grass.script.setup as gsetup
import grass.script as g
import time
import copy
from grass.pygrass.modules import Module, ParallelModuleQueue


def main():
    # Set GRASS location and mapset
    gisbase = os.environ['GISBASE']  # Grass 7.0svn
    gisdbase = os.path.abspath("/home/myuser/grassData")
    location = 'mylocation'  # Grass Location.
    mapset = 'mymapset'
    gsetup.init(gisbase, gisdbase, location, mapset)

    # Read input data for the raster calculation
    df = pd.read_csv('input.csv', sep=";", index_col='Date Time', decimal=',')
    df.index = pd.to_datetime(df.index, unit='s')

    # Lookup tables used to build the input raster names
    month = {1: '17', 2: '47', 3: '75', 4: '105', 5: '135', 6: '162',
             7: '198', 8: '228', 9: '258', 10: '288', 11: '318', 12: '344'}
    hour = {4: '04', 5: '05', 6: '06', 7: '07', 8: '08', 9: '09', 10: '10',
            11: '11', 12: '12', 13: '13', 14: '14', 15: '15', 16: '16',
            17: '17', 18: '18', 19: '19', 20: '20', 21: '21', 22: '22'}
    minute = {0: '00', 15: '15', 30: '30', 45: '45'}
    directory = '/home/myuser/raster/'

    # Write the output rasters outside GRASS as GeoTIFF files
    tmp = '/home/myuser/tmp/'
    g.run_command('r.external.out', directory=tmp, format="GTiff")

    # Mapcalc
    start_time = time.time()
    mapcalc_list = []
    mapcalc = Module("r.mapcalc", overwrite=True, run_=False)
    queue = ParallelModuleQueue(nprocs=8)
    for dfix in df.index:
        if 5 <= dfix.hour < 20:
            input1 = ('input1_' + month[dfix.month] + '_'
                      + hour[dfix.hour] + minute[dfix.minute])
            input2 = ('input2_' + month[dfix.month] + '_'
                      + hour[dfix.hour] + minute[dfix.minute])
            out = ' " ' + str(dfix.date()) + '_' + str(dfix.time()) + ' " '
            # Each queued module needs its own copy of the template
            new_mapcalc = copy.deepcopy(mapcalc)
            mapcalc_list.append(new_mapcalc)
            m = new_mapcalc(expression="%s = %s*%i+%s*%i" % (
                out, input1, df.loc[dfix, 'k1'], input2, df.loc[dfix, 'k2']))
            queue.put(m)
    queue.wait()

    print("--- %s seconds ---" % (time.time() - start_time))


if __name__ == '__main__':
    main()

Best,

Lorenzo

Ciao Lorenzo,

On Wed, Mar 2, 2016 at 6:10 PM, Lorenzo Bottaccioli
<lorenzo.bottaccioli@gmail.com> wrote:

I have written this code from the example provided in
(https://grass.osgeo.org/grass70/manuals/libpython/pygrass.modules.interface.html?highlight=parallelmodulequeue#pygrass.modules.interface.module.ParallelModuleQueue)
but there is no difference in terms of time between setting nprocs=1 or
nprocs=8. How is this possible? How can I achieve improvements?

Actually, I cannot reproduce your issue. On my system I obtain:

--- 2.03981089592 seconds with 1 process ---
--- 1.10217881203 seconds with 4 process ---

Here is the code that I used:

{{{
import copy
import os
import time

import grass.script.setup as gsetup
from grass.pygrass.modules import Module, ParallelModuleQueue

def test_parallell(rinput, routput, ncalcs, nprocs=1):
    start_time = time.time()
    mapcalc_list = []
    mapcalc = Module("r.mapcalc", overwrite=True, run_=False)
    queue = ParallelModuleQueue(nprocs)
    for n in range(ncalcs):
        new_mapcalc = copy.deepcopy(mapcalc)
        mapcalc_list.append(new_mapcalc)
        m = new_mapcalc(expression="{o}_{n} = {i} * {n}".format(i=rinput,
                                                                o=routput,
                                                                n=n))
        queue.put(m)
    queue.wait()
    print("--- %s seconds with %d process ---" % (time.time() - start_time,
                                                  nprocs))

gsetup.init(os.environ['GISBASE'],
            os.path.abspath("/home/pietro/docdat/gis"),
            'nc_basic_spm_grass7', 'user1')

test_parallell('elevation', 'tmptest', 16, nprocs=1)
test_parallell('elevation', 'tmptest', 16, nprocs=4)
Module('g.remove', type='raster', pattern='tmptest' + '*', flags='f')

}}}
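
To put the two timings reported above in perspective, here is a small sketch that computes the speedup and the parallel efficiency from them. The variable names are only illustrative; the numbers are the ones printed by the two test runs.

```python
# Timings reported above, in seconds.
t_serial = 2.03981089592    # run with 1 process
t_parallel = 1.10217881203  # run with 4 processes
nprocs = 4

# Speedup: how many times faster the parallel run is.
speedup = t_serial / t_parallel
# Efficiency: speedup divided by the number of processes used.
efficiency = speedup / nprocs

print("speedup: %.2f, efficiency: %.2f" % (speedup, efficiency))
```

For these numbers the speedup is roughly 1.85 with 4 processes, so the gain is real but well below the ideal factor of 4.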

Next time, please try to use the North Carolina sample dataset,
simplify your problem, remove all the unnecessary parts, and reduce
the code to the essentials.

Hope it helps.

Happy hacking!

Pietro