[GRASS-user] v.net parallelisation issues

I’ve encountered a bottleneck somewhere with v.net when scaling out with GNU Parallel… not sure if it’s an underlying issue with v.net or the way I’m calling the batch jobs.

I’ve got 32 CPUs and commensurate RAM. What I’m observing is the CPU utilisation of each v.net process dropping off as the number of running jobs increases.

I’ve tried launching a single batch job with single mapset, as well as multiple batch jobs each with their own mapset (and database). I’ve tried both PG and sqlite backends. Same issue.

The script at the bottom describes the approach of launching multiple batch jobs, each with its own mapset. Executing a single batch job and then launching parallel within the batch script is much cleaner code, but the results are no different.

I feel I’m so close, yet so far at such a critical stage of project delivery.

Hope someone can help

Kind regards
Mark

RESULTS

ONE JOB

TOTAL SCRIPT TIME: 70

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31313 root 20 0 28876 4080 1284 S 76.5 0.0 0:20.25 sqlite
31293 root 20 0 276m 134m 8320 S 68.5 0.2 0:20.22 v.net.distance

—————————

TWO JOBS

TOTAL SCRIPT TIME: 96

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
21391 root 20 0 28876 4080 1284 R 53.0 0.0 0:01.90 sqlite
21392 root 20 0 28876 4080 1284 R 52.6 0.0 0:01.86 sqlite
21380 root 20 0 276m 128m 8320 R 49.3 0.2 0:04.02 v.net.distance
21381 root 20 0 276m 128m 8320 S 48.3 0.2 0:03.97 v.net.distance

—————————

FOUR JOBS

TOTAL SCRIPT TIME: 187

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6953 mark 20 0 180m 100m 9520 S 63.6 0.2 1:47.39 x2goagent
23025 root 20 0 28876 4080 1284 S 21.5 0.0 0:02.03 sqlite
23026 root 20 0 28876 4080 1284 R 19.9 0.0 0:02.08 sqlite
23027 root 20 0 28876 4080 1284 S 19.5 0.0 0:01.87 sqlite
23028 root 20 0 28876 4080 1284 S 19.5 0.0 0:01.84 sqlite
23014 root 20 0 276m 128m 8320 R 18.5 0.2 0:04.06 v.net.distance
23012 root 20 0 276m 128m 8320 R 17.5 0.2 0:03.91 v.net.distance
23011 root 20 0 276m 128m 8320 S 16.9 0.2 0:04.13 v.net.distance
23015 root 20 0 276m 128m 8320 R 16.9 0.2 0:03.80 v.net.distance

—————————

EIGHT JOBS

TOTAL SCRIPT TIME: 373

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27157 root 20 0 28876 4088 1284 S 19.5 0.0 0:42.39 sqlite
27162 root 20 0 28876 4088 1284 R 16.9 0.0 0:40.60 sqlite
6953 mark 20 0 181m 101m 9520 S 16.5 0.2 2:18.86 x2goagent
27154 root 20 0 28876 4088 1284 S 16.5 0.0 0:39.38 sqlite
27153 root 20 0 28876 4088 1284 S 16.2 0.0 0:35.60 sqlite
27156 root 20 0 28876 4088 1284 R 16.2 0.0 0:38.18 sqlite
27161 root 20 0 28876 4088 1284 S 15.9 0.0 0:40.96 sqlite
27155 root 20 0 28876 4088 1284 S 15.6 0.0 0:38.41 sqlite
27104 root 20 0 284m 139m 8332 S 14.9 0.2 0:39.94 v.net.distance
27158 root 20 0 28876 4088 1284 R 14.6 0.0 0:37.49 sqlite
27095 root 20 0 284m 138m 8332 S 14.2 0.2 0:34.48 v.net.distance
27099 root 20 0 284m 138m 8332 S 14.2 0.2 0:38.27 v.net.distance
27101 root 20 0 284m 139m 8332 R 14.2 0.2 0:38.80 v.net.distance
27105 root 20 0 284m 139m 8332 R 14.2 0.2 0:37.95 v.net.distance
27093 root 20 0 284m 138m 8332 R 13.9 0.2 0:32.64 v.net.distance
27102 root 20 0 284m 140m 8332 R 13.6 0.2 0:40.90 v.net.distance
27094 root 20 0 284m 138m 8332 R 13.2 0.2 0:35.78 v.net.distance

—————————
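A quick way to read the timings above is to divide each total by the number of jobs. The sketch below does that with awk. It assumes the totals are in seconds (the script derives them from `date +%s`) and that every job does a comparable amount of work; the function name `per_job_times` is only an illustration.

```shell
#!/bin/bash
# Per-job wall time and speedup relative to the first row.
# Input lines: "<number of jobs> <total script time in seconds>".
per_job_times() {
  awk 'NR == 1 { base = $2 / $1 }
       { per_job = $2 / $1
         printf "%d job(s): %.3f s per job, speedup %.2fx\n", $1, per_job, base / per_job }'
}

# Totals reported in the RESULTS section above.
per_job_times <<'EOF'
1 70
2 96
4 187
8 373
EOF
```

With these figures the per-job time falls from 70 s to 48 s with two jobs and then stays near 47 s, so the speedup plateaus at roughly 1.5x however many jobs are launched.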

################################################
############ WORKER FUNCTION #############
################################################

# CREATE MAPSETS AND BASH SCRIPTS FOR EACH CPU

fn_worker (){

#######################
# copy mapset
#######################
cp -R /var/tmp/jtw/PERMANENT /var/tmp/jtw/batch_"$1"

#######################
# generate batch_job file
# (the text between the single quotes is written to the job file;
#  '$1' steps out of the quotes so the CPU number is filled in now)
#######################
echo -e '#!/bin/bash
dbsettings="/mnt/data/common/repos/cf_private/settings/current.sh"
source $dbsettings
cpu='$1'

jid=$(psql -d $dbname -U $username -A -t -c "SELECT min(jid) FROM jtw.nsw_tz_joblist WHERE processed = false and cpu = '$1';")
o_tz11=$(psql -d $dbname -U $username -A -t -c "SELECT o_tz11 FROM jtw.nsw_tz_joblist WHERE jid = $jid;")
o_cat=$(psql -d $dbname -U $username -A -t -c "SELECT o_tz11 FROM jtw.nsw_tz_joblist WHERE jid = $jid;")
d_cat=$(psql -d $dbname -U $username -A -t -c "SELECT d_tz11 FROM jtw.nsw_tz_joblist WHERE jid = $jid;")
layername="temp_"$jid

v.net.distance --overwrite in=nsw_road_network_final_connected@batch_'$1' out=$layername from_layer=2 to_layer=2 from_cats=$d_cat to_cats=$o_cat arc_column=fwdcost arc_backward_column=bwdcost

v.out.ogr --o input=$layername output=/var/tmp/$layername type=line

ogr2ogr -overwrite -f "PostgreSQL" PG:"host=localhost dbname=o$dbname user=$username password=$password" /var/tmp/$layername/$layername.shp -nln jtw.$layername -s_srs EPSG:3577 -t_srs EPSG:3577 -a_srs EPSG:3577 -nlt LINESTRING

psql -d $dbname -U $username -c "INSERT INTO jtw.nsw_tz_journey_paths
With s AS (SELECT a.cat, a.tcat, b.tz_code11 as o_tz11, c.tz_code11 as d_tz11, d.lid, d.wkb_geometry, e.employed_persons FROM jtw.$layername a, grass.nsw_tz_centroids_nodes b, grass.nsw_tz_centroids_nodes c, jtw.nsw_road_network_final_net_att d, jtw.nsw_tz_volumes e WHERE a.tcat = b.cat AND a.cat = c.cat AND ST_Equals(a.wkb_geometry, d.wkb_geometry) AND d.type <> '\''service_line'\'' AND b.tz_code11 = e.o_tz11 AND c.tz_code11 = e.d_tz11 AND e.mode9 = 4) SELECT NEXTVAL('\''jtw.nsw_tz_journey_paths_jid_seq'\''), o_tz11, d_tz11, lid, wkb_geometry, employed_persons FROM s; UPDATE jtw.nsw_tz_joblist SET processed = true WHERE jid = $jid;"

#end of job file' > /var/tmp/jtw/jobs/batch_$1.sh
#######################

chmod u+x /var/tmp/jtw/jobs/batch_$1.sh
}
export -f fn_worker

# remove previous mapsets before writing new files

rm -rf /var/tmp/jtw/batch*
rm -rf /var/tmp/jtw/jobs/batch*

#execute in parallel
seq 1 4 | parallel fn_worker {1}
wait
#######################

################################################
####### JOB SCHEDULER ########
################################################

#\\\\\\\\\\\\
START_T1=$(date +%s)
#\\\\\\\\\\\\\

fn_worker (){
export GRASS_BATCH_JOB=/var/tmp/jtw/jobs/batch_$1.sh
grass70 /var/tmp/jtw/batch_$1
unset GRASS_BATCH_JOB
}
export -f fn_worker

seq 1 4 | parallel fn_worker {1}
wait

#\\\\\\\\\\\\
END_T1=$(date +%s)
#\\\\\\\\\\\\
TOTAL_DIFF=$(( $END_T1 - $START_T1 ))
echo "TOTAL SCRIPT TIME: $TOTAL_DIFF"
#\\\\\\\\\\\\\

################################################

The slow rate of writing out the v.net.allpair results from
PostgreSQL was due to the sheer volume of line strings, as the number
of pairs increased (n^2). Simple math said stop. I’ve since
changed my approach and am using v.net.distance in a novel way where
the to_cat is the origin, and the from_cat is a string of
destinations - this is an equivalent way of generating multiple
v.net.paths in a single operation. Moreover, I’m feeding each
origin - destination collection into GNU Parallel as a separate job,
so it rips through the data at scale!
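
The one-origin, many-destinations call described here might be assembled as in the sketch below. It only builds and prints the command rather than running GRASS. The map name `roads_net`, the output name, and the category values are made-up placeholders; the option names are the ones used in the script earlier in this message.

```shell
#!/bin/bash
# Join destination categories (one per line on stdin) into "12,15,27".
join_cats() {
  paste -sd, -
}

# Print (not run) a v.net.distance call for one origin and many destinations.
# Map and output names are hypothetical placeholders.
build_distance_cmd() {
  local origin_cat="$1" dest_cats="$2"
  echo "v.net.distance --overwrite in=roads_net out=paths_${origin_cat}" \
       "from_layer=2 to_layer=2 from_cats=${dest_cats} to_cats=${origin_cat}" \
       "arc_column=fwdcost arc_backward_column=bwdcost"
}

dests=$(printf '%s\n' 12 15 27 | join_cats)
build_distance_cmd 3 "$dests"
```

Printing the command first makes it easy to check the category string before handing thousands of such calls to a job queue.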

Hi Mark,

Don’t know if this is of any help, but:

Have you tried the igraph package for very customized / sophisticated network analysis (http://igraph.org/redirect.html)?

It plays nicely with R, Python, and C and therefore also with GRASS.

What I did (in several cases now) is to use v.net and v.db.select / v.to.db (from within R) to collect the attributes of nodes and edges into R objects (Python arrays would work equally well, I guess) and then continue working without the geometries. After all operations are done, the results are written back to GRASS / SQLite. For parallelization I used the doMC package in R and I was quite satisfied with the performance.

Kind regards,

Stefan

P.S.: I also used that kind of approach in the r.connectivity.network addon (GRASS 6).

From: grass-user-bounces@lists.osgeo.org [mailto:grass-user-bounces@lists.osgeo.org] On Behalf Of Mark Wynter
Sent: 13 February 2015 08:40
To: grass-user@lists.osgeo.org
Subject: [GRASS-user] v.net parallelisation issues


On 13/02/15 08:39, Mark Wynter wrote:

I’ve encountered a bottleneck somewhere with v.net when
scaling out with GNU Parallel… not sure if it’s an underlying issue with
v.net or the way I’m calling the batch jobs?

I’ve got 32 CPUs and commensurate RAM. What I’m observing is v.net
CPU utilisation dropping off in accordance with number of
jobs running.

And this means that you don't get any gain in duration? Could it be that as you divide into more batches, each batch is smaller and thus needs less CPU?

Moritz

Hi Moritz

With the second approach (the code I shared in my post), I have 3500 discrete jobs, and I set the number of batches equal to the number of CPUs. Each batch job is despatched to a CPU, where it then pulls from a queue of job IDs that are processed in serial within each batch job. The thinking behind this approach was to allocate jobs across available CPUs as separate batch processes.

The other, preferred approach is to launch one batch job and have GNU Parallel draw down from the list of 3500 jobs, assigning jobs to worker functions as CPUs become available. I’ve had much success with this code pattern when parallelising PostGIS queries etc.
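
The single-queue pattern described above can be sketched as follows. The thread uses GNU Parallel; this sketch substitutes `xargs -P` so that it runs without extra packages, and `process_job` is a hypothetical placeholder for the real per-job work (v.net.distance, export, insert).

```shell
#!/bin/bash
# Hypothetical worker: stands in for the real per-job pipeline.
process_job() {
  local jid="$1"
  echo "job $jid done by PID $$"
}
export -f process_job

# Feed job IDs first..last to at most <workers> concurrent processes.
# Each free slot takes the next ID from the stream, so no job is
# pre-assigned to a particular CPU.
run_queue() {
  local workers="$1" first="$2" last="$3"
  seq "$first" "$last" | xargs -P "$workers" -I{} bash -c 'process_job "$1"' _ {}
}

run_queue 4 1 12
```

With GNU Parallel the dispatch line would be along the lines of `seq 1 3500 | parallel -j 32 process_job {}`. Either way, the queue only helps if the workers are not all waiting on the same shared resource.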

As you have suspected, I get no benefit from additional CPUs.

Unfortunately I don’t have time on my side, and parallelisation is critical. A fallback is to spin up a cluster of 16 x 2 CPU machines, pre-allocate job IDs to machines, and then write the results back to the master node, but this is not ideal and is a pathway I am reluctant to go down.

Do you know anyone who may have attempted to parallelise v.net?

I guess the most important question right now is - is it possible to do poor man’s parallelisation with v.net? Anyone?

Mark

On 13 Feb 2015, at 7:56 pm, Moritz Lennert <mlennert@club.worldonline.be> wrote:

And this means that you don't get any gain in duration? Could it be that as you divide into more batches, each batch is smaller and thus needs less CPU?

Moritz

On 13/02/15 13:40, Mark Wynter wrote:

Hi Moritz

With the second approach (the code I shared in my post), I have 3500 discrete jobs, and I set the number of batches equal to the number of CPUs. Each batch job is despatched to a CPU, where it then pulls from a queue of job IDs that are processed in serial within each batch job. The thinking behind this approach was to allocate jobs across available CPUs as separate batch processes.

The other, preferred approach is to launch one batch job and have GNU Parallel draw down from the list of 3500 jobs, assigning jobs to worker functions as CPUs become available. I’ve had much success with this code pattern when parallelising PostGIS queries etc.

As you have suspected, I get no benefit from additional CPUs.

Are you sure the problem is CPU-bound?
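
One rough way to check is to watch how much CPU time goes to waiting on I/O while the jobs run. The sketch below samples the aggregate `cpu` line of `/proc/stat` twice and reports the iowait share over the interval. It is Linux-only, the function names are illustrative, and a low figure would not by itself rule out other contention such as file or database locks.

```shell
#!/bin/bash
# Print total and iowait jiffies from the aggregate "cpu" line of /proc/stat.
# Fields 2-9 are user, nice, system, idle, iowait, irq, softirq, steal.
read_cpu_counters() {
  awk '/^cpu / { total = 0
                 for (i = 2; i <= NF && i <= 9; i++) total += $i
                 printf "%.0f %.0f\n", total, $6 }' /proc/stat
}

# Percentage of CPU time spent in iowait over an interval (default 1 s).
iowait_percent() {
  local interval="${1:-1}" t1 w1 t2 w2
  read -r t1 w1 < <(read_cpu_counters)
  sleep "$interval"
  read -r t2 w2 < <(read_cpu_counters)
  awk -v dt="$((t2 - t1))" -v dw="$((w2 - w1))" \
    'BEGIN { if (dt > 0) printf "%.1f\n", 100 * dw / dt; else print "0.0" }'
}

iowait_percent 1
```

Running this alongside one, two, four and eight jobs would show whether the time lost per process is going to disk waits rather than computation.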

Unfortunately I don’t have time on my side, and parallelisation is critical. A fallback is to spin up a cluster of 16 x 2 CPU machines, pre-allocate job IDs to machines, and then write the results back to the master node, but this is not ideal and is a pathway I am reluctant to go down.

Do you know anyone who may have attempted to parallelise v.net?

No. Personally I don't have any experience with this.
Are you speaking specifically about v.net.distance here?

I guess the most important question right now is - is it possible to do poor man’s parallelisation with v.net? Anyone?

The one who knows the insides of these modules best is Markus Metz.

Moritz