On Mon, Jan 13, 2003 at 10:59:08PM -0800, j harrop wrote:
> I should start by qualifying that we don't use Mosix, but I have a fairly
> good guess about what it's doing. We are running a more conventional
> Beowulf cluster, and the programs we have parallelized were done largely
> with MPI, although we also run entire serial programs on multiple nodes -
> similar to what Mosix is doing. The task Mosix takes on is, to say the
> least, daunting. Having sequential codes run with automatic load
> balancing across available, heterogeneous nodes is no small
> undertaking! When the sequential codes make certain assumptions about
> having a machine to themselves, Mosix may have problems.
The cluster running here consists of 20 identical machines, so it
should be fine (should!). There seem to be some issues with NFS
and MFS, but that's probably unrelated to GRASS.
> I assume that you run a script with your "launcher" being called once per
> image, and Mosix takes care of distributing this work across the nodes.
Right. We launch two GRASS jobs per node as there are two CPUs on each
machine.
> I think what's happening is that it's doing this with a common binary AND a
> common data system. This means that when the first launcher starts, it begins
> executing on node 1 with data shared across all nodes. Setting the region
> is no problem and it goes on to process the first image. Perhaps, once
> the region is read at the beginning of the processing routine, it is kept
> in memory and the routine is able to complete regardless of what might
> happen to the region data stored on disk.
This is unfortunately not true for GRASS.
The sequence

(
g.region something
i.smap something
r.colors something
)

run as a job in parallel causes problems in a single mapset, because
each command reads the current region (G_get_region()) when it starts,
and that region may already have been changed by another job.
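One way around this could be to give every parallel job its own mapset, so
that each job has a private WIND file. I have not tried this on the cluster
yet; a rough sketch (all paths and names below are only examples):

# one mapset per job, so each job has its own region (WIND) file
JOB=$1                                    # e.g. the image name
GISDBASE=/home/grass/grassdata            # example path
LOCATION=mylocation
MAPSET=job_$JOB

# create the mapset by hand, starting from the default region
mkdir -p $GISDBASE/$LOCATION/$MAPSET
cp $GISDBASE/$LOCATION/PERMANENT/DEFAULT_WIND $GISDBASE/$LOCATION/$MAPSET/WIND

# every job gets its own GISRC file pointing at its own mapset
export GISRC=/tmp/grassrc5.$JOB
cat > $GISRC <<EOF
GISDBASE: $GISDBASE
LOCATION_NAME: $LOCATION
MAPSET: $MAPSET
EOF

g.region something
i.smap something
r.colors something

Input maps stored in other mapsets can still be read through the mapset
search path (g.mapsets) or the map@mapset notation.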
> The next instance of the
> launcher does the same on node 2, but since the data is common, the region
> is corrupted for node 1. The remaining routines effectively have bad
> regions. That lag between setting the region and having it corrupted could
> explain why i.smap seems to generate mostly correct results.
Yes.
> I presume you have added the various exports at the beginning of the script
> to perform the equivalent of the grass5 command. (I thought about using
> grass5 with command line settings in a script, but I gather that it starts
> a new shell, so grass5 cannot be used in a script.)
You can simply set the variables in a script of your own.
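I have not checked every module, but something like the following is usually
enough (the GISBASE path is an example; combine it with a per-job GISRC file
as sketched above):

# the environment that the grass5 startup script would otherwise set up
export GISBASE=/usr/local/grass5                        # adjust to your install
export PATH=$PATH:$GISBASE/bin:$GISBASE/scripts
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$GISBASE/lib    # if shared libs are used
export GIS_LOCK=$$                                      # some scripts expect this
export GISRC=$HOME/.grassrc5                            # or the per-job file from above

# after this, GRASS commands can be called directly from the script:
g.region -p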
> I suspect you need the
> export so that Mosix knows about the variables when it creates a new
> environment on the remote nodes. I don't know exactly how Mosix decides
> how much to run on each node. If only i.smap is run on the other nodes,
> it would fail. PBS and other similar distributed systems have ways of
> being told what environment variables need to be created on the
> remote/slave nodes. Perhaps Mosix just copies the existing environment
> from outside the launcher.
>
> The way I'm looking at running multiple nodes is to share the binaries by
> NFS, but have local data. Then when you invoke a launcher, there is a
> completely independent set of region and other system files. While one
> part of the problem has become simpler, others have not. Your launcher
> script would need to make some choices about how much to assign to each
> node, and perhaps try to overlap communication and calculation by not
> sending all the images before starting the processing.
But you need some more effort to put the results together.
At times NFS causes some problems for us.
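With local data, the collection step might be as simple as pulling the
per-node mapsets back with rsync; a sketch (hostnames and paths are made up):

# collect the per-job mapsets from the nodes after the runs have finished
for i in `seq 1 20` ; do
    NODE=`printf "node%02d" $i`
    rsync -av $NODE:/local/grassdata/mylocation/job_* \
        /home/grass/grassdata/mylocation/
done
# the copied mapsets can then be added to the mapset search path with
# g.mapsets, or the result maps copied into one mapset with g.copy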
> Alternatively, you
> might use a master/slave load balancing strategy and only assign an image
> when a node indicates that it has finished the previous one. I expect that
> it would be quite difficult to get ideal load balancing by a priori
> assignments across 20 nodes. But the speedup would be significant and the
> loss in efficiency might be of more academic than practical interest.
>
> This would not be using Mosix in its best role, but if you can execute
> commands on specific nodes and have rsync or equivalent, I think you should
> be able to use this approach. I'll let you know how ours goes. We are
> under pressure currently to get ready for a Mining and Exploration
> Conference in Vancouver, but I may have this running before that. It would
> give me another interesting GRASS example for our booth at the trade show
> part.
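For the master/slave idea: a very rough, untested sketch of how it could be
done with plain shell and ssh (the image list, launcher name and hostnames
are made up):

# each node gets the next image only when it has finished the previous one
next_image() {
    # pop one line from the shared list; mkdir serves as a crude lock
    while ! mkdir /tmp/imglist.lock 2>/dev/null ; do sleep 1 ; done
    IMG=`head -1 image_list.txt`
    tail -n +2 image_list.txt > image_list.tmp && mv image_list.tmp image_list.txt
    rmdir /tmp/imglist.lock
    echo "$IMG"
}

worker() {
    NODE=$1
    while : ; do
        IMG=`next_image`
        [ -z "$IMG" ] && break
        ssh $NODE ./launcher.sh "$IMG"    # run the usual per-image job remotely
    done
}

for NODE in node01 node02 node03 ; do     # one worker per node (or per CPU)
    worker $NODE &
done
wait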
Yes, please let me know later,
Regards,
Markus Neteler