[GRASS5] GRASS on parallel CPUs

GRASS and parallel processing
                                                                                

From the current discussion it seems that GRASS does not support
parallel CPUs without much effort. As many GRASS users might be
interested in using GRASS on clusters, I suggest starting a project
for this with its own mailing list. Then the current GRASS libs (or
part of it or specific modules) could be re-designed for parallelized
computations.
                                                                                
If there is interest I will create a page for it
and the topic-oriented mailing list "parallelGRASS"
(or whatever).
                                                                                
It would be nice to have GRASS as the only GPL GIS
*and* usable on parallel CPUs.
                                                                                
Markus Neteler

----------------------------------------
If you want to unsubscribe from GRASS Development
Team internal mailing list write to:
minordomo@geog.uni-hannover.de with
subject 'unsubscribe grass5'

On Wed, 26 Apr 2000, Markus Neteler wrote:

If there is interest I will create a page for it and the topic-oriented
mailing list "parallelGRASS" (or whatever).

Markus,

  I am certainly interested. However, I have two restrictions in the time I
could devote to it: 1) I'm already committed to enough with GRASS and my
business that I cannot take on any more now; 2) The only parallelizing
compilers I know cost a lot of money; they're too expensive for a very small
operation like ours.

Rich

Dr. Richard B. Shepard, President

                       Applied Ecosystem Services, Inc. (TM)
              Making environmentally-responsible mining happen. (SM)
                       --------------------------------
            2404 SW 22nd Street | Troutdale, OR 97060-1247 | U.S.A.
+ 1 503-667-4517 (voice) | + 1 503-667-8863 (fax) | rshepard@appl-ecosys.com


Rich,

On Wed, Apr 26, 2000 at 06:21:27AM -0700, Rich Shepard wrote:

On Wed, 26 Apr 2000, Markus Neteler wrote:

[...]

business that I cannot take on any more now; 2) The only parallelizing
compilers I know cost a lot of money; they're too expensive for a very small
operation like ours.

As far as I understand it is not a question of having a good
compiler but a question of a re-write of GRASS libraries/modules.
But I am not familiar with that.

Markus


On Wed, 26 Apr 2000, Markus Neteler wrote:

As far as I understand it is not a question of having a good compiler but
a question of a re-write of GRASS libraries/modules. But I am not familiar
with that.

Markus,

  I believe that there's more to it than that. The compiler has to know what
can be run in parallel and what cannot. The compiler also makes decisions on
which loops to unroll and how to optimize the code by inlining functions,
among other things. There are differences in the debuggers and profilers,
too. Now I've reached the end of my knowledge about modifying code to run on
multiple processors and/or clusters. :-)

  The other issue is the need for a high performance Fortran compiler along
with the C compiler. SWAT, for example, is written in Fortran and that's
probably where a lot of benefits from parallelization will be realized. For
example, we build a model of the Willamette River basin (from the Columbia
River at Portland on the north to below Eugene in the Siskiyou Mountains
about 200 miles to the south) and define 300 subbasins. With a single
processor system, each subbasin is computed sequentially. With a 6-processor
cluster, a computational "run" could be up to about 6 times faster because
each processor would run the same thread on a different subbasin, 50
subbasins apiece.
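
  As a rough check on the subbasin arithmetic, here is a minimal sketch in
Python. It assumes every subbasin costs the same to compute and that there is
no communication overhead; the function name is hypothetical and not part of
GRASS or SWAT. Under those assumptions the busiest processor sets the
wall-clock time, so the speedup is bounded by the number of processors.

```python
import math


def ideal_speedup(n_tasks: int, n_workers: int) -> float:
    """Upper bound on speedup for independent tasks of equal cost."""
    # The worker with the most tasks determines the total run time.
    tasks_on_busiest_worker = math.ceil(n_tasks / n_workers)
    return n_tasks / tasks_on_busiest_worker


# 300 subbasins on 6 processors: 50 subbasins each.
print(ideal_speedup(300, 6))  # 6.0
```

  If the subbasins differ a lot in size, the real figure will be lower than
this bound.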

  The Portland Group's cluster development kit (CDK) costs US$2,500, plus
$500 per year for minor upgrades and support. Cough! Cough! Compared with
the cost of GRASS and Linux, this is equivalent to putting a US$100,000 mass
spectrometer as a detector on the back end of a $2,000 gas
chromatograph.

  That's my $0.25's worth (inflation, you know).

Rich



Markus,

I have ported code for both the Cray and the Paragon
and in both cases modification of the code was required.
Now if a native compiler such as for Solaris or HP-UX
can do this automatically in the compile stage, then great.
In fact now that multiple CPUs are becoming cheaper
and less expensive to maintain, native compilers may
come about in several years to address this automatically.

John Huddleston

----- Original Message -----
From: Markus Neteler <neteler@geog.uni-hannover.de>
To: <grass5@geog.uni-hannover.de>
Sent: Wednesday, April 26, 2000 8:18 AM
Subject: Re: [GRASS5] GRASS on parallel CPUs

Rich,

On Wed, Apr 26, 2000 at 06:21:27AM -0700, Rich Shepard wrote:
> On Wed, 26 Apr 2000, Markus Neteler wrote:
[...]
> business that I cannot take on any more now; 2) The only parallelizing
> compilers I know cost a lot of money; they're too expensive for a very
> small operation like ours.

As far as I understand it is not a question of having a good
compiler but a question of a re-write of GRASS libraries/modules.
But I am not familiar with that.

Markus


Hi all,

True parallel processing will need a rewrite of the libs as Markus
points out. However, we have a version of Sun Workshop that
*supposedly* will optimize for parallel processors without major
re-working of the code. I'll dig into this and see. If so, we'll load
it up in the next day or so and give it a try. We've got a multi-proc
server we can test on (Solaris 7).

If anyone's interested in working with this, let me know.

Bruce

--
Bruce Byars
GRASS Dev. Team
CAGSR - Baylor Univ.

Markus Neteler wrote:

Rich,

On Wed, Apr 26, 2000 at 06:21:27AM -0700, Rich Shepard wrote:
> On Wed, 26 Apr 2000, Markus Neteler wrote:
[...]
> business that I cannot take on any more now; 2) The only parallelizing
> compilers I know cost a lot of money; they're too expensive for a very small
> operation like ours.

As far as I understand it is not a question of having a good
compiler but a question of a re-write of GRASS libraries/modules.
But I am not familiar with that.

Markus


Hi all,

I am not an expert on parallel computing either.
But I will try to explain what I learned from the web and what I know
from my computer experience.

Parallel computing and clustering is a very complex topic and as far as
I know the computer industry has changed its attitude and its models
many times.

You have to distinguish between SMP (symmetric multiprocessing, several
CPUs share one RAM in one system), closely coupled cluster computers
(several CPUs each have their own memory and are connected via a
high-speed bus system) and loosely coupled clusters (many interconnected
computers, like Beowulf clusters, which are standard industry PCs
connected via a 100 Mbit backbone/switch).

There are many other, more complicated systems which I will skip for
now.

SMP workstations/servers can run one program on several CPUs at once
if it uses threads, the software is thread-safe and it is written for a
library that enables this (the pthread library on Linux, for example).
Spreading work over CPUs is comparatively easy to achieve if the
software starts a new process for every connection, like web servers or
database servers. Most commercial software uses this, but the
performance gain may be low if the application itself is not suited for
parallel computing (like word processing). The bottleneck with this is
the bus (memory access, graphics access, disk) of the computer. Intel
PCs are very limited in bus bandwidth compared to Unix machines (Sun,
SGI, others). Software that does not use threads runs on those machines,
but not faster than on a single-CPU machine.

For parallel cluster computers you will need to completely re-think
the problem/algorithm and rewrite the application to use special
message passing libraries. The main problem with parallel computing is
the communication between the different nodes in the cluster. If the
problem is very well suited to parallel computing (e.g. the famous
rendering of parts of a scene for the Titanic movie) the performance
will increase enormously (in essence this means that parts of the image
are simply calculated independently on different computers). But if the
problem requires much communication and much data transport between the
nodes, the performance may sink below that of a single-CPU system.
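
To make the communication trade-off concrete, here is a toy cost model
in Python. It is only a sketch under an assumed formula: the compute
time divides evenly over the nodes, and every additional node adds a
fixed communication cost. The function names and the numbers are
hypothetical, not measurements.

```python
def run_time(work: float, nodes: int, comm_per_node: float) -> float:
    """Assumed model: evenly divided compute time plus communication cost."""
    return work / nodes + comm_per_node * (nodes - 1)


def speedup(work: float, nodes: int, comm_per_node: float) -> float:
    """Speedup relative to running the whole job on one node."""
    return work / run_time(work, nodes, comm_per_node)


# Independent image tiles: no communication, speedup equals the node count.
print(speedup(60.0, 6, 0.0))

# Heavy data exchange: the cluster ends up slower than a single CPU.
print(speedup(60.0, 6, 15.0))
```

In this model a job of 60 time units on 6 nodes takes 10 units without
communication, but 85 units when each extra node costs 15 units of data
transport, which is worse than the 60 units on one machine.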

The idea behind the commercial compilers is that tricky optimization of
existing code for parallel computers will give _some_ increase in
processing speed. The Portland Group claims that their Fortran compiler
gives a 30% increase in speed (as I understand it, regardless of whether
it is running on a parallel machine or not).

For commercial use this may be an economic solution to speed up existing
applications. You can calculate how much money you have to spend on
programming to get the same increase.
  
I suspect that the commercial compilers will do a comparatively bad job
of optimizing the GRASS code as the libraries were not developed with
parallel computing in mind. Most computing-intensive raster calculations
are done via a conventional loop over each row and the compiler cannot
anticipate which neighborhood calculations are done at run-time. The
same holds for the segment library. I think that in image processing and
in simulations based on GRASS raster data some remarkable speedup is
possible.
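
A small Python sketch of why neighborhood calculations are hard to
parallelize automatically. It is not GRASS code; the filter and all
names are hypothetical. Each output row depends on the rows above and
below it, so a strip of rows can only be processed on its own if it is
given one extra "halo" row on each side.

```python
def smooth_rows(rows):
    """3-row vertical mean; edge rows reuse themselves as missing neighbour."""
    out = []
    last = len(rows) - 1
    for r, row in enumerate(rows):
        above = rows[max(r - 1, 0)]
        below = rows[min(r + 1, last)]
        out.append([(a + b + c) / 3 for a, b, c in zip(above, row, below)])
    return out


def smooth_in_strips(rows, n_strips, halo=1):
    """Apply the same filter strip by strip, as separate workers would."""
    n = len(rows)
    size = max(1, -(-n // n_strips))  # ceiling division
    result = []
    for start in range(0, n, size):
        stop = min(start + size, n)
        lo = max(start - halo, 0)   # extend the strip by the halo rows
        hi = min(stop + halo, n)
        strip = smooth_rows(rows[lo:hi])
        # Keep only the rows this strip owns; drop the halo rows.
        first = start - lo
        result.extend(strip[first:first + (stop - start)])
    return result


raster = [[float(r * 10 + c) for c in range(4)] for r in range(7)]
print(smooth_in_strips(raster, 3) == smooth_rows(raster))          # True
print(smooth_in_strips(raster, 3, halo=0) == smooth_rows(raster))  # False
```

With the halo the strip-wise result matches the sequential one; without
it the rows at the strip borders come out wrong. A compiler cannot know
how large that halo must be when the neighborhood is chosen at run-time.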

The usual compilers do a very good job of optimizing code (unrolling
loops etc.) but this is different from optimization for parallel
processing. I think the difference between gcc and the commercial
compilers is due to their closer relation to the machine/processor (but
the Solaris compiler produces binaries only about 15% faster than
gcc's!).

The Beowulf clusters use a standard Red Hat distribution as a base (with
the gcc/egcs compiler etc.) and use a message passing library for
software development and tools to manage >16 PCs (nodes) and to control
the distribution of data/tasks. The software is developed on this
platform and I think that it would be very complicated even to interface
the GRASS GIS library to this setup. But you could simply split up your
job (e.g. subbasins, parts of a raster), run the computation as a batch
job on different machines and patch the results together afterwards.
This will give you a speedup of at most x times for x CPUs (e.g. 6
nodes, 6 times faster than on a single processor). You have to subtract
the time needed to split, distribute and patch together the data.
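
The split, compute and patch idea can be sketched in a few lines of
Python. This is an illustration only: the worker threads stand in for
separate machines (in CPython they will not actually speed up a
CPU-bound job), the per-cell computation is a made-up placeholder, and
none of the names come from GRASS.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(raster, n_parts):
    """Cut the raster into consecutive blocks of rows."""
    size = max(1, -(-len(raster) // n_parts))  # ceiling division
    return [raster[i:i + size] for i in range(0, len(raster), size)]


def process_part(part):
    """Placeholder per-cell computation with no neighborhood dependence."""
    return [[cell * 2 for cell in row] for row in part]


def patch(parts):
    """Put the processed blocks back together in their original order."""
    return [row for part in parts for row in part]


def run_split_job(raster, n_parts):
    parts = split_rows(raster, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        # map() returns results in submission order, so patching is trivial.
        results = list(pool.map(process_part, parts))
    return patch(results)


raster = [[r + c for c in range(3)] for r in range(10)]
print(run_split_job(raster, 6) == process_part(raster))  # True
```

This only works so simply because each cell is computed independently.
The time spent in split_rows() and patch(), plus copying the data to
the other machines, is the overhead to subtract from the ideal speedup.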

To sum up:
I think that if you have a problem that requires _very_ much computing
power and you cannot solve it with a conventional setup (800 MHz
Intel system or a really fast Unix server) you should evaluate whether a
cluster will solve it. But then you will have to invest in the machine
_and_ in programming (and/or in a professional compiler/library).

Compared to today's PC prices $2500 looks like a lot. But if you
consider that _real_ computers in the upper range still cost about
$50 000, this is not a big investment. And you should consider what
program development costs; $2500 spent for a 30% increase in
performance is IMHO a good investment.

But for GRASS I can see no real applicability: if using a commercial
compiler you would have to distribute binaries for different
architectures/setups. Only a small proportion of the GRASS users (those
with a Beowulf cluster in their garage) would benefit from this.

That's just what I think,

cu

Andreas

--
Andreas Lange, 65187 Wiesbaden, Germany, Tel. +49 611 807850
Andreas.Lange@Rhein-Main.de, A.C.Lange@GMX.net
