[GRASS5] Parallel GRASS modules for high performance

Hi,

Some modules in GRASS, especially the surface generation modules, take a long time to run, depending on the data and parameters. I am trying to write parallel versions of these modules with the MPI library, so that GRASS can run on high-performance parallel machines.

I have searched the mailing list archives and found someone who was trying to do the same thing, but that mail was from 1993 and I could not reach this person.

Now I would like to hear about your experiences. Has anybody tried the same thing? Can anybody tell me how to compile GRASS with this library (MPI)? Must mpcc be used to compile? Or do you have large data sets for surface generation?

Also, which modules take a long time to run besides the surface generation modules?
I mean minutes, hours, or more.

Thank you all.

Yours respectfully,

Muzaffer Ayvaz



You must have missed a lot in your search: numerous parallel versions of interpolation and other modules
have been written since 1993. There should be a link to parallel idw on the GRASS web site, and a parallel
version of the s.surf.rst module is here:
http://skagit.meas.ncsu.edu/~helena/grasswork/grasscontrib/rstmods2fixed.tar.gz

The problem with these implementations is that unless the developer is committed to keeping them up to date,
they die pretty quickly (e.g. the rst works with GRASS5 but not GRASS6).
So I have been begging everybody who tries to do parallel stuff for GRASS to do the parallelization
on top of the modules rather than within the modules, so that it is minimally dependent on changes
within the modules. For example, v.surf.rst can be run efficiently by splitting the region into smaller overlapping subregions,
sending each subregion to a different processor, and then patching the results together. The same can be
done for r.mapcalc, r.slope.aspect and many other modules (there are some exceptions, such as modules
that include flow routing). This may have its own problems, but it is definitely more general and has a much better
chance of surviving beyond one release cycle than writing a parallel version of a module.
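
The region-splitting step described above could be sketched roughly as follows. This is only an illustration of the geometry, not code from any of the parallel modules mentioned in this thread: the names `Tile` and `split_region` and the `overlap` parameter (in map units) are hypothetical. Each tile carries the padded bounds a module would be run on and the unpadded core bounds that would be kept when the results are patched together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    # Padded bounds: the subregion a module would actually be run on.
    north: float
    south: float
    east: float
    west: float
    # Core bounds: the part of the result kept when patching tiles together.
    core_north: float
    core_south: float
    core_east: float
    core_west: float


def split_region(north, south, east, west, rows, cols, overlap):
    """Split a region into rows x cols tiles padded by `overlap` map units.

    Padding is clipped so that no tile extends past the full region.
    Tiles are returned row by row, starting at the north-west corner.
    """
    tile_height = (north - south) / rows
    tile_width = (east - west) / cols
    tiles = []
    for r in range(rows):
        core_n = north - r * tile_height
        core_s = north - (r + 1) * tile_height
        for c in range(cols):
            core_w = west + c * tile_width
            core_e = west + (c + 1) * tile_width
            tiles.append(Tile(
                north=min(north, core_n + overlap),
                south=max(south, core_s - overlap),
                east=min(east, core_e + overlap),
                west=max(west, core_w - overlap),
                core_north=core_n,
                core_south=core_s,
                core_east=core_e,
                core_west=core_w,
            ))
    return tiles
```

For example, `split_region(100.0, 0.0, 200.0, 0.0, 2, 2, 10.0)` gives four tiles. How large the overlap has to be depends on the module and its parameters, which this sketch does not try to decide.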

I have plenty of large data sets (tens to hundreds of millions of points), but you need to get GRASS to read them first.

Helena

On Nov 25, 2005, at 12:33 PM, Muzaffer Ayvaz wrote:

> Hi,
>
> Some modules in GRASS, especially the surface generation modules, take a long time to run, depending on the data and parameters. I am trying to write parallel versions of these modules with the MPI library, so that GRASS can run on high-performance parallel machines.
>
> I have searched the mailing list archives and found someone who was trying to do the same thing, but that mail was from 1993 and I could not reach this person.
>
> Now I would like to hear about your experiences. Has anybody tried the same thing? Can anybody tell me how to compile GRASS with this library (MPI)? Must mpcc be used to compile? Or do you have large data sets for surface generation?
>
> Also, which modules take a long time to run besides the surface generation modules?
> I mean minutes, hours, or more.
>
> Thank you all.
>
> Yours respectfully,
>
> Muzaffer Ayvaz

Helena Mitasova
Dept. of Marine, Earth and Atm. Sciences
1125 Jordan Hall, NCSU Box 8208,
Raleigh NC 27695
http://skagit.meas.ncsu.edu/~helena/

What are generally the things that change through release cycles?

Our experience with very simple MPI parallelization is that it can be incorporated into code relatively easily. The module can be compiled and run sequentially, but with some IFDEF conditions the sequential loop can be hidden and another version of the loop compiled with MPI, if configured with it. In that case the code holds both types of infrastructure (though it is a bit longer). People can port the sequential version at any time, and if there happens to be a “parallel” user, compilation troubleshooting will most likely be limited to the MPI code.
We have not tried that IFDEF trick yet, but it is something we want to end up doing with our MPI and NinfG GRASS modules.
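
As a rough illustration of keeping one code path that runs sequentially unless MPI is available, here is a Python analogue of the compile-time IFDEF switch. It is only a sketch under assumptions: it uses the optional mpi4py package in place of a C build configured with MPI, and the function name `map_rows` is hypothetical, not part of GRASS.

```python
try:
    # Optional dependency: plays the role of a build "configured with MPI".
    from mpi4py import MPI
    _COMM = MPI.COMM_WORLD
except ImportError:
    # No MPI available: only the sequential path is used.
    _COMM = None


def map_rows(func, rows):
    """Apply func to every row, splitting the loop across MPI ranks if possible.

    Every rank returns the full, ordered list of results.
    """
    rows = list(rows)
    if _COMM is None or _COMM.Get_size() == 1:
        # Sequential loop: what a plain build would compile.
        return [func(row) for row in rows]

    rank = _COMM.Get_rank()
    size = _COMM.Get_size()
    # Each rank handles every size-th row, remembering the row index.
    local = [(i, func(rows[i])) for i in range(rank, len(rows), size)]
    # Collect the partial results from all ranks on all ranks.
    merged = [pair for part in _COMM.allgather(local) for pair in part]
    # Restore the original row order.
    return [value for _, value in sorted(merged, key=lambda pair: pair[0])]
```

Run as a normal script, this takes the sequential path; started under an MPI launcher with several ranks, the same call is divided among them. The MPI-specific lines are confined to one function, so troubleshooting stays local to that code.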

0.02 cents

On 11/26/05, Helena Mitasova <hmitaso@unity.ncsu.edu> wrote:

> You must have missed a lot in your search: numerous parallel versions of interpolation and other modules
> have been written since 1993. There should be a link to parallel idw on the GRASS web site, and a parallel
> version of the s.surf.rst module is here:
> http://skagit.meas.ncsu.edu/~helena/grasswork/grasscontrib/rstmods2fixed.tar.gz
>
> The problem with these implementations is that unless the developer is committed to keeping them up to date,
> they die pretty quickly (e.g. the rst works with GRASS5 but not GRASS6).
> So I have been begging everybody who tries to do parallel stuff for GRASS to do the parallelization
> on top of the modules rather than within the modules, so that it is minimally dependent on changes
> within the modules. For example, v.surf.rst can be run efficiently by splitting the region into smaller overlapping subregions,
> sending each subregion to a different processor, and then patching the results together. The same can be
> done for r.mapcalc, r.slope.aspect and many other modules (there are some exceptions, such as modules
> that include flow routing). This may have its own problems, but it is definitely more general and has a much better
> chance of surviving beyond one release cycle than writing a parallel version of a module.
>
> I have plenty of large data sets (tens to hundreds of millions of points), but you need to get GRASS to read them first.
>
> Helena



grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5

On Nov 26, 2005, at 3:49 AM, Yann Chemin wrote:

> What are generally the things that change through release cycles?

It can be a lot, especially for 5->6: the site format used by the interpolation modules was retired,
point data are now handled as vector data, and s.surf.rst was merged with v.surf.rst.
The latest parallelization of s.surf.rst that I mentioned (you can look at it, see the link in my previous message;
it was well done for both 2D and 3D using MPI) was done at the segmentation level:
each segment was sent to a different processor. Although the segmentation remains the same,
I am not sure how much work it would be to update it to GRASS6, and there indeed needs to be some IFDEF
condition so that you don't end up with two versions of the same module (you can then be almost
sure that the parallel version will be the one that won't live too long).
GRASS6 -> GRASS7 may get a new raster format.

> Our experience with very simple MPI parallelization is that it can be incorporated into code relatively easily. The module can be compiled and run sequentially, but with some IFDEF conditions the sequential loop can be hidden and another version of the loop compiled with MPI, if configured with it. In that case the code holds both types of infrastructure (though it is a bit longer). People can port the sequential version at any time, and if there happens to be a "parallel" user, compilation troubleshooting will most likely be limited to the MPI code.
> We have not tried that IFDEF trick yet, but it is something we want to end up doing with our MPI and NinfG GRASS modules.

This sounds like a good idea. There is a lot of interest in parallel GRASS capabilities, and eventually somebody
will find a solution that will be used beyond a single project. Good luck with your effort,

Helena
