[GRASS-dev] r.proj.seg disk space usage problem

I had a chance today to try a VERY large raster with r.proj.seg: a 2.7GB raster, 55000x47000 cells. The memory issue of r.proj has now been moved to a HD space issue. With 7.75GB free, it chokes at about 75% in the allocating memory stage with an "Error writing segment file", while free space slowly drops to zero.

It seems r.proj.seg, for its segmentation, creates an uncompressed copy of the whole raster on the HD (instead of in memory). I'm guessing this has something to do with random access speed in rasters?

So I'm back to projecting pieces of the raster, but now so they fit in my disk free space, and patching them together. Disappointing, but if that's the way it must be, I can live with it.

-----
William Kyngesburye <kyngchaos@kyngchaos.com>
http://www.kyngchaos.com/

"Mon Dieu! but they are all alike. Cheating, murdering, lying, fighting, and all for things that the beasts of the jungle would not deign to possess - money to purchase the effeminate pleasures of weaklings. And yet withal bound down by silly customs that make them slaves to their unhappy lot while firm in the belief that they be the lords of creation enjoying the only real pleasures of existence....

- the wisdom of Tarzan

William Kyngesburye wrote:

I had a chance today to try a VERY large raster with r.proj.seg.
2.7GB raster, 55000x47000 cells. The memory issue of r.proj has now
been moved to a HD space issue. With 7.75GB free, it chokes at about
75% in the allocating memory stage, while free space slowly drops to
zero, with an "Error writing segment file".
It seems r.proj.seg, for its segmentation, creates an uncompressed
copy of the whole raster on the HD (instead of in memory). I'm
guessing this has something to do with random access speed in rasters?

So I'm back to projecting pieces of the raster, but now so they fit
in my disk free space, and patching them together. Disappointing,
but if that's the way it must be, I can live with it.

did you pump the memory= option up to a few hundred MB smaller than your
installed RAM?

Hamish

On Mar 6, 2007, at 7:18 PM, Hamish wrote:

did you pump the memory= option up to a few hundred MB smaller than your
installed RAM?

I didn't try that. I'm not sure how that would help - that would give me 700MB, not a significant portion of the uncompressed raster. And it still needs to copy the raster to a temp file. From Glynn back on Feb 12:

Whereas the original readcell() function simply
read the map into memory (and thus needed to operate in the alternate
environment), the new readcell() copies it to a temporary file.

...

[Aside: my first attempt used a rowio-type strategy, but I discovered
that you can't switch projections while you have maps open; hence the
need to copy the map to a temporary file.]

-----
William Kyngesburye <kyngchaos@kyngchaos.com>
http://www.kyngchaos.com/

"Oh, look, I seem to have fallen down a deep, dark hole. Now what does that remind me of? Ah, yes - life."

- Marvin

William Kyngesburye wrote:

I had a chance today to try a VERY large raster with r.proj.seg.
2.7GB raster, 55000x47000 cells. The memory issue of r.proj has now
been moved to a HD space issue. With 7.75GB free, it chokes at about
75% in the allocating memory stage, while free space slowly drops to
zero, with an "Error writing segment file".

It seems r.proj.seg, for its segmentation, creates an uncompressed
copy of the whole raster on the HD (instead of in memory).

Correct.

55000 * 47000 * 4 bytes/cell = 10,340,000,000 bytes (~10GB).
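
The arithmetic above can be checked with a short sketch. The function name and the reading of "7.75GB free" as decimal gigabytes are assumptions made for illustration; only the cell counts and the 4 bytes/cell figure come from this thread.

```python
def uncompressed_size_bytes(rows, cols, bytes_per_cell=4):
    """Size of a raster stored with a fixed number of bytes per cell."""
    return rows * cols * bytes_per_cell

# Figures from the thread; which dimension is rows and which is columns
# does not matter for the product.
map_bytes = uncompressed_size_bytes(55000, 47000)

# Assumption: "7.75GB free" means 7.75 * 10^9 bytes.
free_bytes = 7.75e9

# Fraction of the temporary copy that fits before the disk fills up.
fraction_written = free_bytes / map_bytes

print(map_bytes)                   # total bytes needed
print(round(map_bytes / 1e9, 2))   # the same in decimal GB
print(round(fraction_written, 2))  # share that fits in the free space
```

Under that assumption the free space covers roughly three quarters of the uncompressed copy, which is consistent with the failure being reported at about 75%.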

I'm guessing this has something to do with random access speed in
rasters?

Correct. Re-projection involves accessing the source data in a
non-linear fashion, so the data needs to be stored in a form which
allows random access.

[We could potentially do something else for the case of
transformations with low levels of rotation and curvature, but I'm not
sure whether the case of not having enough free disk space to hold one
uncompressed map is sufficiently common to make it worthwhile.]
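
To illustrate why rotation matters here, the following toy sketch counts how many distinct source rows a single output row samples under a pure rotation. It is not r.proj.seg code; the function name and the rotation-only inverse mapping are assumptions chosen to keep the example small.

```python
import math

def source_rows_for_output_row(width, angle_deg, out_row=0):
    """Count distinct source rows sampled by one output row under a pure rotation."""
    theta = math.radians(angle_deg)
    rows = set()
    for out_col in range(width):
        # Inverse mapping: which source row does output cell
        # (out_row, out_col) fall in after rotating by theta?
        src_row = round(out_col * math.sin(theta) + out_row * math.cos(theta))
        rows.add(src_row)
    return len(rows)

no_rotation = source_rows_for_output_row(1000, 0.0)
ten_degrees = source_rows_for_output_row(1000, 10.0)
print(no_rotation, ten_degrees)
```

With no rotation one output row maps to one source row, so a row-at-a-time reader would be enough. With even a modest rotation, one output row of 1000 cells touches well over a hundred source rows, which is why the source needs to be in a form that allows random access.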

So I'm back to projecting pieces of the raster, but now so they fit
in my disk free space, and patching them together. Disappointing,
but if that's the way it must be, I can live with it.

Buy a larger hard drive. 250GB IDE = £43, 250GB SATA = £46, so that
extra 10GB equates to ~£5 worth of disk space. Which is a lot cheaper
than another 10GB of RAM (which you would need for the same task using
r.proj).
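
For the piecewise workaround mentioned above, a rough sketch of how one might size the pieces. It assumes the temporary copy scales with the piece being projected at 4 bytes per cell, and it ignores the space needed for the output map and any overlap between pieces; the function name and the safety factor are hypothetical.

```python
import math

def plan_strips(rows, cols, free_bytes, bytes_per_cell=4, safety=0.8):
    """Split a raster into horizontal strips whose uncompressed size fits on disk."""
    # Leave some headroom rather than filling the disk completely.
    budget = int(free_bytes * safety)
    rows_per_strip = budget // (cols * bytes_per_cell)
    if rows_per_strip < 1:
        raise ValueError("not enough free space for even one row")
    rows_per_strip = min(rows_per_strip, rows)
    n_strips = math.ceil(rows / rows_per_strip)
    return rows_per_strip, n_strips

# Figures from the thread, with 47000 taken as the row count (an assumption)
# and 7.75GB read as decimal gigabytes.
rows_per_strip, n_strips = plan_strips(47000, 55000, 7.75e9)
print(rows_per_strip, n_strips)
```

With these numbers two strips would be enough, so the workaround costs one extra projection run plus the patching step.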

--
Glynn Clements <glynn@gclements.plus.com>

Hamish wrote:

did you pump the memory= option up to a few hundred MB smaller than your
installed RAM?

That won't affect disk usage, unless he has enough RAM to hold the
entire uncompressed map (~10GB in this case).

You need to be able to hold the entire uncompressed map (4 bytes per
cell) either in RAM or on disk. If it won't fit entirely in RAM, the
entire map is stored on disk, and portions are copied to RAM as
required. The portions which are in RAM are still kept on disk; there
isn't any "swapping".
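
The behaviour described above can be sketched with a toy class: the whole map is written to a temporary file, and only a bounded number of row blocks is cached in RAM. This is not the GRASS segment library and does not reflect its layout or API; the class name, the row-block scheme, and the least-recently-used eviction are all assumptions made for illustration.

```python
import tempfile
from array import array
from collections import OrderedDict

class DiskBackedRaster:
    """Toy model: full map in a temp file, a few row blocks cached in RAM."""

    CELL = array("i").itemsize  # bytes per cell (4 on common platforms)

    def __init__(self, rows, cols, block_rows, max_blocks):
        self.rows, self.cols = rows, cols
        self.block_rows, self.max_blocks = block_rows, max_blocks
        self.file = tempfile.TemporaryFile()
        self.cache = OrderedDict()  # block index -> array of cells

    def write_row(self, row, values):
        """Store one row at its fixed offset in the temp file."""
        self.file.seek(row * self.cols * self.CELL)
        self.file.write(array("i", values).tobytes())

    def get(self, row, col):
        """Random access to one cell, loading its block from disk if needed."""
        block = row // self.block_rows
        if block in self.cache:
            self.cache.move_to_end(block)
        else:
            if len(self.cache) >= self.max_blocks:
                # Evict from RAM only; the copy on disk is never removed.
                self.cache.popitem(last=False)
            first_row = block * self.block_rows
            n_cells = min(self.block_rows, self.rows - first_row) * self.cols
            self.file.seek(first_row * self.cols * self.CELL)
            data = array("i")
            data.frombytes(self.file.read(n_cells * self.CELL))
            self.cache[block] = data
        offset = (row - block * self.block_rows) * self.cols + col
        return self.cache[block][offset]

raster = DiskBackedRaster(rows=8, cols=4, block_rows=2, max_blocks=2)
for i in range(8):
    raster.write_row(i, [i * 10 + j for j in range(4)])

print(raster.get(7, 3), raster.get(0, 1), raster.get(4, 2))
print(len(raster.cache))  # never exceeds max_blocks
```

In this model a larger cache (the analogue of memory=) only changes how many blocks stay in RAM. The temporary file is the same size either way, which is the point made above about disk usage.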

You could conceivably use system swap as "RAM", but that would result
in taking physical RAM from any other applications on the system. At
best, the rest of the system is likely to become rather unresponsive
in that kind of situation; at worst, r.proj.seg and the rest of the
system will end up fighting it out for physical RAM, resulting in
"thrashing".

Also, that approach requires a 64-bit CPU and OS (in the sense of
providing a 64-bit address space). A 32-bit address space limits you
to no more than 4GB of "memory" (physical RAM, swap, mmap()d files
etc) per process, and the maximum size of the heap will probably be
significantly less than that 4GB.
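
A quick check of the address-space point, as a sketch. The variable names are illustrative, and the test on sys.maxsize only reports whether the running Python interpreter is a 64-bit build, not anything about GRASS itself.

```python
import sys

# A 32-bit address space covers 2^32 bytes (4GB) per process at most.
ADDRESS_SPACE_32 = 2 ** 32

# Uncompressed map size from the thread: 55000 x 47000 cells, 4 bytes each.
map_bytes = 55000 * 47000 * 4

fits_in_32_bit = map_bytes < ADDRESS_SPACE_32
ratio = map_bytes / ADDRESS_SPACE_32

# sys.maxsize exceeds 2^32 only on 64-bit builds of the interpreter.
interpreter_is_64_bit = sys.maxsize > 2 ** 32

print(fits_in_32_bit, round(ratio, 2), interpreter_is_64_bit)
```

The map is more than twice the size of the entire 32-bit address space, before allowing for the heap being smaller than 4GB in practice.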

--
Glynn Clements <glynn@gclements.plus.com>