Hi
On 10/21/21 11:39 PM, B H wrote:
I am trying to parallelize some scripts that currently run only serially. (Currently the scripts just create a new temporary mapset and run some commands using the --exec option of grass78.)
My understanding is that I should follow this; however, I am unable to create a new mapset using it.
https://grasswiki.osgeo.org/wiki/GRASS_and_Shell#Automated_batch_jobs:_Setting_the_GRASS_environmental_variables
Here is what I have tried without success:
1) the grass78 executable cannot be run in parallel from bash, even just to create the mapset
2) g.mapset -c
3) g.proj -c
If there is a different way to create a new mapset, please let me know (I am not sure if it's as simple as copying some files from a template location...)
This seems to work for me, using GNU parallel.
I created a list of shapefiles:
micha@RMS:tmp$ head -3 list_of_shapefiles.txt
/home/micha/GIS/Israel/cellular_antennas.shp
/home/micha/GIS/Israel/cities.shp
/home/micha/GIS/Israel/contour_20m.shp
....
to use as input. Then I prepared a script to run GRASS on each, in its own separate Location.
micha@RMS:tmp$ cat grass_process.sh
#!/bin/bash
if [ $# -eq 0 ]; then
    echo "Input geospatial vector file is required."
    echo "Syntax: grass_process.sh <input_vector>"
    exit 1
fi
input=$1
output=$(basename "$input" .shp)
# Prepare temporary name for Location (under the /tmp directory)
tmp_location=$(mktemp -u)
# Create temporary GRASS location in that directory,
# using the shapefile as georeference, and run a command
grass78 -c "$input" "${tmp_location}" --exec v.import input="$input" output="$output" --overwrite
sleep 10
The sleep at the end is just to give me time to check with ps -ef that all processes are running.
I made the script executable, of course.
Then here's the call to parallel:
micha@RMS:tmp$ parallel -j 10 ./grass_process.sh < list_of_shapefiles.txt
and after I fire it off, in a separate terminal I see:
micha@RMS:tmp$ ps -ef | grep grass
micha 45460 12434 0 10:04 pts/2 00:00:00 perl /usr/bin/parallel -j 10 ./grass_process.sh
micha 45481 45460 0 10:04 pts/2 00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/cellular_antennas.shp
micha 45483 45460 0 10:04 pts/2 00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/cities.shp
micha 45485 45460 0 10:04 pts/2 00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/contour_20m.shp
micha 45488 45460 0 10:04 pts/2 00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/contour_50m.shp
micha 45492 45460 0 10:04 pts/2 00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/il_seas.shp
micha 45496 45460 0 10:04 pts/2 00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/mideast_cities.shp
micha 45500 45460 0 10:04 pts/2 00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/reshut_nikuz.shp
micha 45504 45460 0 10:04 pts/2 00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/roads_ITM.shp
micha 45508 45460 0 10:04 pts/2 00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/roads_negev.shp
micha 45512 45460 0 10:04 pts/2 00:00:00 /bin/bash ./grass_process.sh /home/micha/GIS/Israel/roads.shp
all merrily going on together.
This will leave you with separate Locations/mapsets for each input file. I don't think you can get around that. GRASS itself is NOT parallelized, so you cannot run more than one GRASS process in the same mapset.
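If separate Locations are inconvenient, a variant that might suit you is one new mapset per job inside a single existing Location. As far as I know, grass78 -c creates a new mapset when the path points inside an existing Location. This is only a sketch: the LOCATION path, the job_ prefix and the function name are made up for illustration, and the Location has to exist beforehand.

```shell
#!/bin/bash
# Sketch only: LOCATION is a hypothetical path to an EXISTING GRASS Location.
LOCATION="${LOCATION:-$HOME/grassdata/my_location}"

# Import one vector file into its own, newly created mapset.
process_in_own_mapset() {
    local input=$1
    local name
    name=$(basename "$input" .shp)
    # Mapset name built from the file name plus the shell PID,
    # so that parallel jobs never share a mapset.
    local mapset="job_${name}_$$"
    # -c with a path inside an existing Location creates the mapset,
    # then --exec runs the command in it.
    grass78 -c "$LOCATION/$mapset" --exec \
        v.import input="$input" output="$name" --overwrite
}
```

Called once per input file from a script like grass_process.sh, each job gets its own mapset, so the one-session-per-mapset limit is not hit.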
If you need to do a more complicated procedure, then wrap all the GRASS commands into another shell script and pass that script to the --exec parameter instead of individual commands.
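A rough sketch of that wrapper approach follows. The function names, the job script contents and the INPUT/OUTPUT environment variables are my own illustration, not something GRASS defines; v.info -t is included only as an example of a second command.

```shell
#!/bin/bash
# Sketch only: function names and the INPUT/OUTPUT variables are hypothetical.

# Write the job script that will run inside the GRASS session.
make_job_script() {
    local job=$1
    cat > "$job" <<'EOF'
#!/bin/bash
# Runs inside the GRASS session started by --exec.
# INPUT and OUTPUT are passed in through the environment by the caller.
v.import input="$INPUT" output="$OUTPUT" --overwrite
v.info -t map="$OUTPUT"
EOF
    chmod +x "$job"
}

# Create a temporary Location from the input file and run the job script in it.
run_job() {
    local input=$1 job=$2
    local tmp_location
    tmp_location=$(mktemp -u)
    INPUT=$input OUTPUT=$(basename "$input" .shp) \
        grass78 -c "$input" "$tmp_location" --exec "$job"
}
```

The job script has to be executable, and it is safest to pass it with a full path, e.g. run_job /home/micha/GIS/Israel/cities.shp /tmp/grass_job.sh (the job script path here is just an example).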
HTH
#current grass scripts....
#Step 1: Create a new location with permanent mapset
grass78 -c /path/location/PERMANENT -e
#Step 2: run some commands using --exec option
grass78 --exec v.in.ogr input='$inputfile' output=$vecoutname location=path/location/PERMANENT
_______________________________________________
grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user
--
Micha Silver
Ben Gurion Univ.
Sde Boker, Remote Sensing Lab
cell: +972-523-665918