[GRASS-user] 64 bit or parallel processing

Dear Grass users,

Has anyone experimented with 64-bit machines or some kind of parallel processing for GRASS modules? Is it possible? Does GRASS have any modules prepared for such conditions? I am not a superexpert; I was wondering if anyone can give me clues on how operations can be made faster using special hardware... for example, whether I can significantly speed up spline interpolation using parallel machines or clusters (I am shooting in the dark, since I am not even sure what clusters are).

Thanks guys,

Francesco Pirotti

The inherent nature of raster data makes it very easy to implement most
raster commands in a parallel processing environment. Back in the early
1990s, researchers at Oak Ridge National Laboratory wanted to use some of
my research in spatial data uncertainty modeling. In less than two hours, an
ORNL programmer and I identified the functions that could be multi-threaded.
The programmer was experienced with the parallel processing C modules ORNL
used on its supercomputer. By the next day, the programmer had my programs
working on the supercomputer.

A quick Google search found OdinMP, http://fenixforge.com/projects/odinmp,
as a possible free, open source parallel processor using the OpenMP
future-standard.

Chuck Ehlschlaeger, Associate Professor & GIS Center Director
Department of Geography, Western Illinois University
215 Tillman Hall, 1 University Circle, Macomb, IL 61455
cre111@wiu.edu, phone: 309-298-1841, fax: 309-298-3003
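
A minimal sketch of why per-cell raster operations decompose so easily: each row can be handed to a separate worker because no row depends on any other. This is not GRASS code; the function names and the halving operation are hypothetical, and Python threads are used only to show the decomposition (in standard CPython they will not speed up pure-Python arithmetic).

```python
from concurrent.futures import ThreadPoolExecutor


def rescale_row(row, factor=0.5):
    """Apply a per-cell operation to one raster row (hypothetical example op)."""
    return [cell * factor for cell in row]


def process_raster(rows, workers=4):
    """Process each row independently; map() preserves the original row order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(rescale_row, rows))


raster = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
result = process_raster(raster)
```

The same shape applies to an OpenMP loop over rows in C; only the per-row function has to be free of shared mutable state.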

-----Original Message-----
From: grassuser-bounces@grass.itc.it [mailto:grassuser-bounces@grass.itc.it]
On Behalf Of francesco.pirotti
Sent: Tuesday, February 27, 2007 1:38 AM
To: grassuser@grass.itc.it
Subject: [GRASS-user] 64 bit or parallel processing

_______________________________________________
grassuser mailing list
grassuser@grass.itc.it
http://grass.itc.it/mailman/listinfo/grassuser

Here are some references to the people working on distributed GRASS
modules using open source parallel computing.

r.vi creates a given vegetation index from a list of 13 of them, most
of them requiring only Red and NIR. Updated to accept all types of input data.
Authors: Baburao Kamble and Yann Chemin.
Shamim Akhter extended this module to an MPI version for clusters.
- r.vi.mpi is the MPI version for cluster GRASS GIS education (no speed
up here!).
Author: Shamim Akhter

sincerely
baburao
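
For readers unfamiliar with vegetation indices, here is a minimal sketch of one index that needs only Red and NIR, the widely used NDVI = (NIR - Red) / (NIR + Red). The thread does not list the 13 indices in r.vi, so treating NDVI as one of them is an assumption, and the function names below are hypothetical rather than taken from the module.

```python
def ndvi(red, nir):
    """NDVI for one cell; returns None where red + nir is zero (undefined)."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def ndvi_row(red_row, nir_row):
    """Compute NDVI cell by cell for one pair of Red and NIR raster rows."""
    return [ndvi(r, n) for r, n in zip(red_row, nir_row)]
```

Each output cell depends only on the two input cells at the same position, which is why this kind of module is a natural candidate for splitting rows across cluster nodes.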

On 27/02/07, Charles Ehlschlaeger <c.ehlschlaeger@insightbb.com> wrote:

--
Mr. Baburao Dashrath Kamble
Masters of Engineering,
Remote Sensing and Geographic Information Systems,
School of Engineering and Technology,
Asian Institute of Technology,
Pathumthani 12120, Thailand
Phone No 66-2-524-7416
webpage: http://baburaokamble.googlepages.com/

francesco.pirotti wrote:

> Has anyone experimented with 64-bit machines or some kind of parallel
> processing for GRASS modules? Is it possible? Does GRASS
> have any modules prepared for such conditions? I am not a
> superexpert; I was wondering if anyone can give me clues on how
> operations can be made faster using special hardware... for example, whether I
> can significantly speed up spline interpolation using parallel
> machines or clusters

(I'm not a superexpert with this either, so maybe I am slightly wrong in
places)

for GRASS 5 there was a parallelized s.surf.idw (MPI), see
  http://grass.ibiblio.org/download/addons.php

It may be interesting to try and parallelize the segmentation library,
but most of GRASS's raster code needs to be rewritten.

two problems:
1) most GRASS raster modules are serial row based. maybe if the raster
format gets updated to a tiled model it would be possible to start
parallelizing it (not impossible; NULL and FP support were added in the past),
or break up lines into chunks like raster modules that have rows= or
percent= options already.

2) the raster file format is split over many directories, making it hard
to "lock" a map so another job doesn't try and edit the same map. the
future plan is to have the raster format stored like the vector format,
$MAPSET/raster/$MAPNAME/element, edited as a copy and moved into position
in full when the raster is closed.

3) (bonus problem) we need to ensure that the region (WIND) file isn't
modified by any module but g.region, and that modules read the WIND file
only once when they first start (in case it changes mid-process).
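
The "edited as a copy and moved into position in full" idea in point 2 can be sketched as a write-then-rename pattern. This is only an illustration under stated assumptions: the function name is hypothetical, a map is reduced to a single file of bytes, and nothing here reflects the actual GRASS raster layout.

```python
import os
import tempfile


def write_map_atomically(path, data):
    """Write bytes to a temp file in the same directory, then rename into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp_")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace is atomic on POSIX when source and target share a filesystem
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial copy so readers never see a half-written map
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A concurrent reader then sees either the old map or the new one, never a partially written file, which is the property the current multi-directory format makes hard to guarantee.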

> (I am shooting in the dark, since I am not even sure what clusters
> are)

two basic flavours of clusters, call them beowulf and mosix.

* Beowulf-style splits jobs up itself. code needs to be threaded (hard).

* Mosix-style doesn't split jobs, but it can farm jobs out to other
processors. You can use your existing code without thread-proofing it.
(but not GRASS, or at least not two jobs within the same mapset etc)

see
  http://www.beowulf.org
  http://openmosix.sourceforge.net
and the wikipedia pages for those.

Hamish

Hello Hamish
On Wed, 28 Feb 2007, Hamish wrote:

> (I'm not a superexpert with this either, so maybe I am slightly wrong in
> places)
>
> for GRASS 5 there was a parallelized s.surf.idw (MPI), see
> http://grass.ibiblio.org/download/addons.php

Yes, although IMHO it is not much use because it was a bad approach to improving efficiency: using a more efficient algorithm would have been better than parallelising an inefficient algorithm. See my message to Yann Chemin here:
http://grass.itc.it/pipermail/grass-dev/2007-January/028617.html
AFAIK no one has ever compared s.surf.idw.mpi with the improved s.surf.idw.

Paul

Hamish wrote:

> It may be interesting to try and parallelize the segmentation library,
> but most of GRASS's raster code needs to be rewritten.
>
> two problems:
> 1) most GRASS raster modules are serial row based. maybe if the raster
> format gets updated to a tiled model it would be possible to start
> parallelizing it (not impossible; NULL and FP support were added in the past),
> or break up lines into chunks like raster modules that have rows= or
> percent= options already.

Rows versus tiles doesn't have any impact upon the ability to
parallelise the code.

> 2) the raster file format is split over many directories, making it hard
> to "lock" a map so another job doesn't try and edit the same map. the
> future plan is to have the raster format stored like the vector format,
> $MAPSET/raster/$MAPNAME/element, edited as a copy and moved into position
> in full when the raster is closed.
>
> 3) (bonus problem) we need to ensure that the region (WIND) file isn't
> modified by any module but g.region, and that modules read the WIND file
> only once when they first start (in case it changes mid-process).

This isn't an issue for parallelising individual modules (i.e. using
multiple threads in order to utilise multiple CPU cores). It is an
issue for being able to run multiple modules concurrently.

The core problem for parallelising GRASS is that the libraries aren't
remotely thread-safe, and the way that they are written is such that
making them thread-safe would be a lot of work.

The two biggest issues are:

1. Use of global/static variables; libgis alone has ~180 of them (not
including the 181 GRASS_copyright variables).

2. Use of "scratch" buffers. Current policy deprecates the use of
alloca(), so eliminating scratch buffers would involve lots of
malloc/free calls for short-lived buffers, which is inefficient.

Consequently, it isn't feasible to make the bulk of GRASS thread-safe.
We might be able to do this for limited portions, e.g. the core raster
I/O code (get/put row operations), but you would still need to ensure
that everything else was called from the main thread.

It would be more feasible to modify individual modules; e.g. a
multi-threaded version of r.mapcalc might be feasible. But unless the
raster I/O code was made thread-safe, you would still be limited by
the rate at which map data could be read and written by a single
thread, so it would only be worthwhile for cases where the bulk of the
overhead was in the actual calculations.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn Clements wrote:

> Hamish wrote:
>
> > It may be interesting to try and parallelize the segmentation library,
> > but most of GRASS's raster code needs to be rewritten.
> >
> > two problems:
> > 1) most GRASS raster modules are serial row based. maybe if the raster
> > format gets updated to a tiled model it would be possible to start
> > parallelizing it (not impossible; NULL and FP support were added in the past),
> > or break up lines into chunks like raster modules that have rows= or
> > percent= options already.
>
> Rows versus tiles doesn't have any impact upon the ability to
> parallelise the code.

It will be easier to parallelize the code with a tile based approach.
The g3d lib implementation allows different tile sizes, so you are able
to define tile sizes (before opening a map) which will fit perfectly into
memory. Those tiles can be processed in parallel.
I guess a similar approach can be implemented using the segment library for raster maps.

I hope to get a prototype running using a tile based parallelizing approach in 5 - 6 months. It will be part of the new gpde library and will use the gpde array implementation for easy data access and because it's thread safe (AFAICT).
The gpde lib uses OpenMP (www.openmp.org) for parallel computation in GRASS, but only the linear equation solver and the linear equation system assembler are parallelized. This will hopefully change in the future.

Enabling parallel data access will be hard work in GRASS and will only work on cluster file systems. One approach would be to distribute the rows and tiles to different storage places to enable parallel data access.

Just my two cents
Soeren
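
A minimal sketch of the tile based idea, assuming a raster held in memory as a list of rows. The function names and the tile sizes are hypothetical and unrelated to the g3d or gpde APIs; the point is only that tiles of a chosen size can be enumerated up front and handed to workers independently.

```python
from concurrent.futures import ThreadPoolExecutor


def tile_windows(rows, cols, tile_rows, tile_cols):
    """Yield (row_start, row_end, col_start, col_end) windows covering the raster."""
    for r0 in range(0, rows, tile_rows):
        for c0 in range(0, cols, tile_cols):
            yield r0, min(r0 + tile_rows, rows), c0, min(c0 + tile_cols, cols)


def tile_sum(raster, window):
    """Sum the cells inside one tile window (end indices are exclusive)."""
    r0, r1, c0, c1 = window
    return sum(raster[r][c] for r in range(r0, r1) for c in range(c0, c1))


def parallel_sum(raster, tile_rows=2, tile_cols=2, workers=4):
    """Sum a raster by processing its tiles in a thread pool."""
    windows = list(tile_windows(len(raster), len(raster[0]), tile_rows, tile_cols))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda w: tile_sum(raster, w), windows))
```

Edge tiles are simply clipped to the raster bounds, so the tile size can be picked to fit memory without having to divide the map evenly.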

Glynn Clements wrote:

> It would be more feasible to modify individual modules; e.g. a
> multi-threaded version of r.mapcalc might be feasible. But unless the
> raster I/O code was made thread-safe, you would still be limited by
> the rate at which map data could be read and written by a single
> thread, so it would only be worthwhile for cases where the bulk of the
> overhead was in the actual calculations.

Right, so working on individual modules which can typically take 10+
hours to run (*.rst, watersheds, viewsheds (incl. r.sun), indices from
RS data, etc) seems like a much better focus of effort for the greatest
gain:work ratio. And as seen in the MPI add-ons this is exactly where
the previous ad-hoc effort has gone.

IIRC Helena said that the RST modules have changed substantially since
GRASS 5 and the s.surf.rst.mpi work would have to be largely redone.

Any thoughts on gains that could be made by MPI'ing the segmentation
lib? Do modules using it usually do so for memory needs rather than
processing speed?

Hamish

Most GRASS raster functions, r.mapcalc and similar, won't speed up by more
than a couple of percent from parallelization. 90% of the time spent by
these commands comes from moving the data off the hard drive and getting it
to the CPU and vice versa (based on parallel processing research I did in
the mid '90s). For "normal" GRASS raster commands, having four hard drives
on RAID 0 will increase GRASS performance dramatically, more than any other
one thing you could do to a computer (or GRASS itself). The reason ORNL
parallelized my code was that the computational requirements were over c^2,
where c is the number of grid cells on a map, for some of the analyses they
wanted to do.

As Hamish wrote in a recent email, good candidates for parallelization would
be viewshed analysis or other functions requiring many cells to be read to
get a value for each output grid cell. r.watershed won't benefit from
parallelization.
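
The I/O argument above can be put in numbers with Amdahl's law, under the assumption (not stated in the thread) that the I/O share of the runtime stays serial and only the remaining compute share scales with the number of workers. The function name is hypothetical.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only parallel_fraction of runtime scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# With 90% of the time in serial I/O, only 10% of the runtime can be parallelized
four_cores = amdahl_speedup(0.1, 4)
many_cores = amdahl_speedup(0.1, 10**9)
```

Under that assumption four workers give a bound of 1/0.925, roughly 8% faster, and no number of workers can do better than 1/0.9, roughly 11%. A module whose runtime is dominated by computation sits at the other end of the same formula.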

Chuck Ehlschlaeger, Associate Professor & GIS Center Director
Department of Geography, Western Illinois University
215 Tillman Hall, 1 University Circle, Macomb, IL 61455
cre111@wiu.edu, phone: 309-298-1841, fax: 309-298-3003

-----Original Message-----
From: grassuser-bounces@grass.itc.it [mailto:grassuser-bounces@grass.itc.it]
On Behalf Of Glynn Clements
Sent: Wednesday, February 28, 2007 3:33 PM
To: Hamish
Cc: grassuser@grass.itc.it
Subject: Re: [GRASS-user] 64 bit or parallel processing

I'm now working at UCD on a project with > 65 (!) hyperspectral images, and
a LOT of the slowdowns stem from I/O intensive processes -- with that said,
there are plenty of "standard" raster processing routines that start
becoming processor intensive (e.g. linear spectral unmixing) once you have a
decent I/O setup (we have a lot of cheap-o 2-drive RAID0s that we back up to
a permanent location; this nearly doubles our I/O speed). Not having any
idea of how GRASS's raster access backbone functions, it does seem you could
have a generic tiling routine, callable from all of the raster
commands, that would be highly parallelizable (I note that RSI ENVI does this
out of the box now: it detects the # of processors and simply starts forking
a single image across all of the processors, I assume one line/tile per
processor). Mapcalc and related processes would be a good start, and keep
in mind *spatial* analyses (e.g. texture windows) as you design it.

My suggestion to the GRASS programmers: if you decide this is useful, I'd
HIGHLY recommend trying to do this in an out-of-the-box approach, e.g.
having an install-time "mp" capability -- if it requires being a) experts in
raster processing, b) experts in GRASS and c) experts in parallel coding,
you'll end up with 3 people doing SMP :-)

--j
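
A rough sketch of the "detect the number of processors and split the image" idea, applied to a spatial operation (a 3x3 mean window). How ENVI actually does this is not described in the thread; the function names here are hypothetical, and threads over an in-memory list stand in for real workers.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def mean3x3_rows(raster, row_start, row_end):
    """3x3 mean for rows [row_start, row_end); edges use the cells that exist."""
    n_rows, n_cols = len(raster), len(raster[0])
    out = []
    for r in range(row_start, row_end):
        out_row = []
        for c in range(n_cols):
            cells = [raster[rr][cc]
                     for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                     for cc in range(max(c - 1, 0), min(c + 2, n_cols))]
            out_row.append(sum(cells) / len(cells))
        out.append(out_row)
    return out


def row_chunks(n_rows, n_chunks):
    """Split range(n_rows) into at most n_chunks contiguous (start, end) pairs."""
    size = max(1, -(-n_rows // n_chunks))  # ceiling division, at least one row
    return [(s, min(s + size, n_rows)) for s in range(0, n_rows, size)]


def parallel_mean3x3(raster, workers=None):
    """Run the window filter on one row chunk per detected processor."""
    workers = workers or os.cpu_count() or 1
    chunks = row_chunks(len(raster), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda ch: mean3x3_rows(raster, *ch), chunks))
    return [row for part in parts for row in part]
```

Each chunk reads one row above and below its own range. Here that works because every worker sees the whole raster; with separate processes or nodes, those neighbouring "halo" rows would have to be shipped along with each chunk, which is what makes window operations harder to distribute than per-cell ones.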

-----Original Message-----
From: grassuser-bounces@grass.itc.it [mailto:grassuser-bounces@grass.itc.it]
On Behalf Of Hamish
Sent: Wednesday, February 28, 2007 4:14 PM
To: Glynn Clements
Cc: grassuser@grass.itc.it
Subject: Re: [GRASS-user] 64 bit or parallel processing

Hamish wrote:

> Any thoughts on gains that could be made by MPI'ing the segmentation
> lib? Do modules using it usually do so for memory needs rather than
> processing speed?

Modules which use the segment library normally do so because they
access the raster data in an order other than top-to-bottom.

If an algorithm reads the data in a linear fashion, there's no point
in using the segment library.

Ultimately, we need to look into replacing the core raster I/O code.
However, this needs to be done incrementally; we can't afford to put
everything on hold while we re-write everything which uses
G_{get,put}_raster_row() etc (i.e. most of GRASS).

--
Glynn Clements <glynn@gclements.plus.com>

By the way, there is a forthcoming paper in Computers and Geosciences at:

http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V7D-4MY0TX9-8&_user=10&_coverDate=01%2F30%2F2007&_alid=543285067&_rdoc=1&_fmt=summary&_orig=search&_cdi=5840&_sort=d&_docanchor=&view=c&_ct=6&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=898ebe84b51937827e5e0b86a348340c

(we only see the abstract without paying):

Implementation of a parallel high-performance visualization technique in
GRASS GIS by Alexandre Sorokine - the scripts are at:

http://www.ornl.gov/sci/gist/software/grass/

The approach involves tiling a region for visualization on a big wall from
a head node to rendering nodes.

Roger

On Wed, 28 Feb 2007, Jonathan Greenberg wrote:

--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand@nhh.no