[GRASS5] GRASS on OpenMOSIX cluster

Hi developers,

At the moment I am trying to run GRASS on a 20-node OpenMOSIX cluster.
The task to perform is to 'i.smap' 230 aerial images with 3 bands
each (later I'll try an improved method).

The main problem is that the various jobs seem to influence
each other (especially the current region). The result is

- probably correct CELL files
- damaged cell_hd files (zero size with a few exceptions)
- damaged color files (zero size with a few exceptions)
  (after i.smap I use r.colors)

The cluster job launcher script starts GRASS like this:

$1 contains the name of the aerial image

==================================================================
#!/bin/sh

#startup of GRASS (it wants the vars exported - why??):
export GISBASE=$HOME/grass500bin/grass5
export LOCATION_NAME=minipat

MAPSET=$USER
export GISDBASE=$HOME
#use unique GISRC file:
export GISRC=$HOME/.grassrc5.$$.$1

echo "LOCATION_NAME: $LOCATION_NAME" > $GISRC
echo "MAPSET: $MAPSET" >> $GISRC
echo "DIGITIZER: none" >> $GISRC
echo "GISDBASE: $GISDBASE" >> $GISRC

PATH=$PATH:$GISBASE/bin:$GISBASE/scripts

#for convenience:
LOCATION=$GISDBASE/$LOCATION_NAME/$MAPSET

ERR="./errlog.$1"
LOG="./log.$1"

g.mapsets map="PERMANENT,OFDC_PAT1999,OFDC_PAT1999_2,grass"
g.region rast=$1.b res=20 > $LOG 2> $ERR
i.group in=$1.b,$1.g,$1.r group=$1 subgroup=$1 >> $LOG 2>> $ERR

#cp known statistics from other group:
cp -r $LOCATION/group/spinale99/subgroup/spinale99/sigset $LOCATION/group/$1/subgroup/$1 >> $LOG 2>> $ERR
i.smap group=$1 subgroup=$1 sig=trainmap2.smap out=$1.smap >> $LOG 2>> $ERR

#write colors:
echo "1 green
2 blue
3 white
4 brown
5 yellow
6 red" | r.colors $1.smap col=rules >> $LOG 2>> $ERR

mkdir -p $LOCATION/cats/

#write cats:
echo "# 6 categories

0.00 0.00 0.00 0.00
1:bosco - forest
2:mugo - mugo pine
3:neve_roc - snow, bare rocks and roads
4:ombra - shadow
5:pasc_campi - agricultural patches and pastures
6:rodoreto - shrubs
" > $LOCATION/cats/$1.smap

==================================================================

I am a bit in the dark about where to look for the problem/solution.
I remember that it was once discussed whether two or more simultaneous
GRASS sessions are possible, but I cannot find that thread.

Thanks in advance

Markus

Markus Neteler wrote:

> at time I am trying to run GRASS on a 20 node OpenMOSIX cluster.
> The task to perform is to 'i.smap' 230 aerial images with 3 bands
> each (later I'll try an improved method).
>
> The main problem is that the various jobs seem to influence
> each other (especially the current region). The result is
>
> - probably correct CELL files
> - damaged cell_hd files (zero size with a few exceptions)
> - damaged color files (zero size with a few exceptions)
>   (after i.smap I use r.colors)
>
> I am a bit in the dark where to search for the problem/solution.
> I remember that it was discussed once if two/multiple sessions of GRASS
> are possible, but I cannot find that thread.

For the most part, you can run multiple sessions provided that they
each operate upon a different mapset. The primary exception is that
using monitors is problematic (monitor names are per-uid).

Having multiple sessions using the same mapset is high-risk. It could
theoretically be done, but to be safe, you would have to analyse the
behaviour of each command which is intended to be used to ensure that
no two processes attempt to modify the same file. The most likely
cause of conflict would be the WIND file.

For your specific case, SEARCH_PATH (written by g.mapsets) would also
be an issue. I don't know enough about the imagery programs to know
whether there would be any issues there.

One change which would allow conflicts to be reduced would be to
eliminate the use of getuid/geteuid/getpwuid, and use $USER/$HOME
instead. Relevant files include:

  src/libes/gis/mapset_msc.c
  src/libes/gis/set_prior.c
  src/libes/gis/user_config.c
  src/libes/gis/whoami.c
  src/libes/vask/V_support.c
  src/general/g.help/menu.c
  src/general/init/chk_dbase.c
  src/general/init/clean_temp.c

This would also be useful for Mike's work on a native Windows version.

--
Glynn Clements <glynn.clements@virgin.net>

On Tue, Jan 14, 2003 at 01:00:33AM +0000, Glynn Clements wrote:

Markus Neteler wrote:

> at time I am trying to run GRASS on a 20 node OpenMOSIX cluster.
> The task to perform is to 'i.smap' 230 aerial images with 3 bands
> each (later I'll try an improved method).
>
> The main problem is that the various jobs seem to influence
> each other (especially the current region).

[...]

> For the most part, you can run multiple sessions provided that they
> each operate upon a different mapset. The primary exception is that
> using monitors is problematic (monitor names are per-uid).
>
> Having multiple sessions using the same mapset is high-risk. It could
> theoretically be done, but to be safe, you would have to analyse the
> behaviour of each command which is intended to be used to ensure that
> no two processes attempt to modify the same file. The most likely
> cause of conflict would be the WIND file.

This is what I was doing (and it didn't work out).

Now I have rewritten the job launcher to run every job in a different
mapset (which is easy when you deal with many maps and the same operation
for all maps).

Result: it works well. At the end of each job I switch to a common mapset
and copy over the map, then delete the temporary job mapset.
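
For readers who want to try the same approach, the filesystem side of a
per-job mapset could look roughly like the sketch below. It is not the
actual launcher: the function names and the "job_" prefix are made up for
illustration, and it assumes that a new mapset only needs its own directory
plus a WIND file copied from PERMANENT/DEFAULT_WIND. The GRASS commands
themselves (i.smap, and the g.copy into the common mapset) are left out.

```sh
#!/bin/sh
# Sketch only: per-job mapset handling for a GRASS 5 style database layout.
# Function names and the "job_" prefix are hypothetical.

# Create a private mapset for one job and print its name.
# $1 = GISDBASE, $2 = LOCATION_NAME, $3 = job id (e.g. the image name)
make_job_mapset() {
    job_mapset="job_$3"
    job_dir="$1/$2/$job_mapset"
    mkdir -p "$job_dir" || return 1
    # Assumption: the job's region file can start from the location default.
    cp "$1/$2/PERMANENT/DEFAULT_WIND" "$job_dir/WIND" || return 1
    echo "$job_mapset"
}

# Write a per-job session file pointing at the job mapset.
# $1 = GISRC path, $2 = GISDBASE, $3 = LOCATION_NAME, $4 = MAPSET
write_gisrc() {
    {
        echo "LOCATION_NAME: $3"
        echo "MAPSET: $4"
        echo "DIGITIZER: none"
        echo "GISDBASE: $2"
    } > "$1"
}

# Delete a job mapset once its results have been copied elsewhere.
# Refuses anything that does not carry the "job_" prefix.
# $1 = GISDBASE, $2 = LOCATION_NAME, $3 = MAPSET
remove_job_mapset() {
    case "$3" in
        job_?*) rm -rf "$1/$2/$3" ;;
        *) echo "refusing to remove '$3'" >&2; return 1 ;;
    esac
}
```

In a launcher this would be used along the lines of
MAPSET=`make_job_mapset $GISDBASE $LOCATION_NAME $1`, followed by
write_gisrc $GISRC $GISDBASE $LOCATION_NAME $MAPSET before the first GRASS
command, so that each job gets its own WIND and SEARCH_PATH.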

Now we only face NFS problems: when the target mapset sits remotely,
outside the cluster, on NFS, the g.copy command only copies the cell file
properly, while cellhd/, cell_misc and colr are empty (present, but 0
size). This is probably related to openMOSIX. When the target mapset is
kept on a local disk, it works perfectly.
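
A quick way to spot the zero-size support files after a copy is a small
check like the following. It is only a sketch: the function name is
hypothetical, and it assumes the support files sit under cellhd/, colr/ and
cell_misc/ in the mapset directory, named after the map.

```sh
#!/bin/sh
# Sketch only: print zero-size support files of one raster map.
# $1 = mapset directory, $2 = map name
empty_support_files() {
    for elem in cellhd colr cell_misc; do
        path="$1/$elem/$2"
        # Skip elements that do not exist for this map.
        [ -e "$path" ] || continue
        # -size 0 matches only files that are completely empty.
        find "$path" -type f -size 0
    done
}
```

Run as empty_support_files $LOCATION $1.smap after the copy; no output
means none of the files that were found is empty.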

Thanks for the mapset hint,

Markus