I've a couple of very large DEMs; approximately 40G cells each. I am
reprojecting them to the project's location, but they take ~10 hours to
complete. One's running on my desktop (dual-core AMD II X2 CPU with 4G RAM),
the other's running on my Dell Latitude E5410 (quad-core Intel i7 with 8G
RAM). GRASS 6.5svn is compiled with OpenMP on both machines.
Is there any way to make use of the multiple cores and (comparatively)
large amounts of memory to speed up the reprojection process? I know that
I'll have the same issue with other projects so, while I can certainly
continue to run other applications and processes while these slowly grind
away, I'd like to learn if I can shorten the required times.
Rich
I think you have to use "poor man's parallelization"... That is, split
the work and issue multiple commands separately.
So I'd try splitting the large raster into small chunks and then
projecting each one separately, sending the project command to the
background. The problem is that, if the grass command changes the
region settings, things might not work. So maybe, your best bet would
be to run the projection in small chunks of the raster file but
outside of grass, using gdalwarp.
Take a look at the parallel grass jobs wiki.
http://grass.osgeo.org/wiki/Parallel_GRASS_jobs
Cheers
daniel
On Sat, 10 Mar 2012, Daniel Victoria wrote:
> Take a look at the parallel grass jobs wiki.
> http://grass.osgeo.org/wiki/Parallel_GRASS_jobs
Daniel,
Thank you. It will be quite nice when grass can be completely compiled for
parallel operations.
Rich
On 10/03/2012 20:07, Daniel Victoria wrote:
> ...The problem is that, if the grass command changes the
> region settings, things might not work.
You can start additional GRASS sessions in other mapsets, change the region there, then work in parallel on various chunks of the big map.
Hermann
Daniel Victoria wrote:
> I think you have to use "poor man's parallelization"... That is, split
> the work and issue multiple commands separately.
> So I'd try splitting the large raster into small chunks and then
> projecting each one separately, sending the project command to the
> background. The problem is that, if the grass command changes the
> region settings, things might not work.
r.proj doesn't change the region.
Processing the map in chunks requires setting a different region for
each command. That can be done by creating named regions and using the
WIND_OVERRIDE environment variable, e.g.:
g.region ... save=region1
g.region ... save=region2
...
WIND_OVERRIDE=region1 r.proj ... &
WIND_OVERRIDE=region2 r.proj ... &
...
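As a rough sketch of how the named-region approach above might be scripted: the function below splits a region into N horizontal strips and only prints the g.region and r.proj lines, so they can be checked before anything is run. The map name "dem", the source location "srcloc", the region names "chunkN" and the bounds are all made up for illustration, and the east/west bounds and resolution are left to whatever the current region already has.

```shell
#!/bin/sh
# Dry run: print (do not execute) the commands for projecting one map
# in N horizontal strips using named regions and WIND_OVERRIDE.
# Hypothetical names: dem (map), srcloc (source location), chunkN (regions).

# chunk_commands NORTH SOUTH N
chunk_commands() {
    awk -v north="$1" -v south="$2" -v n="$3" 'BEGIN {
        step = (north - south) / n
        # one saved region per strip, top to bottom
        for (i = 1; i <= n; i++) {
            top = north - (i - 1) * step
            bottom = north - i * step
            printf "g.region n=%.2f s=%.2f save=chunk%d\n", top, bottom, i
        }
        # one background r.proj per strip, each with its own region
        for (i = 1; i <= n; i++)
            printf "WIND_OVERRIDE=chunk%d r.proj input=dem location=srcloc output=dem_%d &\n", i, i
        print "wait"
    }'
}

chunk_commands 4000 0 4
```

If the printed lines look right, they could be piped to sh inside a GRASS session, and the resulting strips merged afterwards (for example with r.patch). Strip edges should fall on cell boundaries, which this sketch does not check.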
The main factor likely to limit the speedup is that the processes
won't share their caches, so there'll be some inefficiency if the
source areas for the processes overlap substantially.
If you have more than one such map to project, processing entire maps
in parallel might be a better choice (so that you get N maps projected
in 10 hours rather than 1 map in 10/N hours).
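The whole-map alternative might look something like the sketch below: one background job per map, then a wait. The map names and the source location "srcloc" are made up, and PROJ defaults to "echo r.proj" so that, as written, the script only prints the commands; setting PROJ=r.proj inside a GRASS session would run them for real.

```shell
#!/bin/sh
# Project several whole maps at once, one background job per map.
# PROJ defaults to "echo r.proj" (dry run); set PROJ=r.proj to execute.
# "srcloc" is a hypothetical source location name.
PROJ=${PROJ:-echo r.proj}

project_maps() {
    for map in "$@"; do
        # $PROJ is unquoted on purpose so "echo r.proj" splits into two words
        $PROJ input="$map" location=srcloc output="$map" &
    done
    wait    # block until every background job has finished
}

project_maps dem_a dem_b
```

Since every job uses the same current region, no WIND_OVERRIDE is needed here; it makes sense to start no more jobs than there are cores, and to keep in mind that each job needs its own share of RAM.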
--
Glynn Clements <glynn@gclements.plus.com>