[GRASS-dev] [bug #5252] (grass) Multicore hardware and grass

Davide_Spano · November 4, 2006, 2:53pm

Hi, I'm Davide Spano, a Computer Science student at the University of Pisa. I'd like to comment on Glynn's last mail.

Version 10 sounds about right.

A lot of the problem is that:

1. The libraries can't readily be parallelised without changing the API.
2. Changing the API means re-writing modules which use it.
3. Much of GRASS' value is in the modules, so re-writing the modules
equates to re-writing most of GRASS.

That is dramatically true not only for GRASS, but also for every OS we use. The first attempt at parallelization is usually to put locks around the old APIs. This is not a good idea, because the improvement is about 1.5x at most.
This phenomenon is known as software lockout: the CPUs spend more time on synchronization than on computation.
We did not use CPUs with more than two cores for this reason...
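
One simple way to see why locking caps the gain is Amdahl's law, which bounds the speed-up when part of the run time stays serial. The sketch below is only an illustration: the serial fraction of one third is an assumption, picked because it gives 1.5x on two cores, and is not a measurement of GRASS or of any OS. The model also ignores the cost of the locks themselves, so real numbers would be worse.

```python
def amdahl_speedup(serial_fraction: float, n_cores: int) -> float:
    """Upper bound on speed-up when serial_fraction of the run time cannot run in parallel."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)


# Assumed figure: one third of the run time is serialised behind locks.
ASSUMED_SERIAL_FRACTION = 1.0 / 3.0

for cores in (2, 4, 16):
    print(cores, round(amdahl_speedup(ASSUMED_SERIAL_FRACTION, cores), 2))
```

Under that assumption the bound never reaches 3x, however many cores are added.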

There might be some specific cases which are amenable to
parallelisation. E.g. it might be possible to re-write the core raster
I/O to use threads in a producer-consumer model, so that
get-row/put-row operations essentially take no time (i.e. the module
runs entirely in the main thread, while a separate thread performs the
raster I/O). That might give a 2x speed-up on a dual-core system, but
still wouldn't scale to larger numbers of cores (i.e. you would still
only get a 2x speed-up on a 16-core system).
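
The producer-consumer idea described here could look roughly like the following sketch. It is not GRASS code: `read_rows` is a hypothetical stand-in for the library's get-row call, and `prefetching_rows` and `row_sums` are names invented for the example. A single background thread reads ahead into a bounded queue while the main thread does the computation.

```python
import queue
import threading


def read_rows(n_rows, n_cols):
    """Hypothetical stand-in for the raster get-row call; yields one row at a time."""
    for r in range(n_rows):
        yield [r * n_cols + c for c in range(n_cols)]


def prefetching_rows(row_source, depth=4):
    """Yield rows from row_source while a background thread reads ahead."""
    buf = queue.Queue(maxsize=depth)  # bounded, so the reader cannot run far ahead
    done = object()                   # sentinel marking the end of the input

    def producer():
        try:
            for row in row_source:
                buf.put(row)          # blocks while the buffer is full
        finally:
            buf.put(done)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        row = buf.get()               # blocks only if the reader has fallen behind
        if row is done:
            return
        yield row


def row_sums(n_rows, n_cols):
    """The 'module': runs in the main thread and never calls the reader directly."""
    return [sum(row) for row in prefetching_rows(read_rows(n_rows, n_cols))]


print(row_sums(3, 2))
```

There are only two threads, one reading and one computing, so at best two things overlap. That is the scaling limit mentioned above: extra cores have nothing to do.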

Because these kinds of parallelisation are not the best solutions, I suggest first reading some literature that covers the standard techniques: farm, pipelining, map, etc.
The real core of this problem is interprocess communication: for 16x or greater improvements we need to overlap IPC with internal computation, and that is really possible only if the OS offers support for it.
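
As a rough illustration of the farm/map pattern, here is a minimal sketch, assuming the rows can be processed independently of each other. `process_row` and `farm` are hypothetical names, not part of any GRASS API. A fixed pool of workers receives rows and the results come back in input order.

```python
from concurrent.futures import ThreadPoolExecutor


def process_row(row):
    """Hypothetical per-row computation; it must not depend on any other row."""
    return [2 * cell for cell in row]


def farm(rows, worker=process_row, n_workers=4):
    """Hand each row to a pool of workers and collect the results in input order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(worker, rows))


print(farm([[1, 2], [3, 4], [5, 6]]))
```

Threads are used only to keep the sketch short. In CPython they do not speed up CPU-bound pure-Python work; `ProcessPoolExecutor` has the same `map` interface, but then every row has to be sent to another process, which is exactly the IPC cost discussed above.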

Davide