[GRASSLIST:276] Reduction of raster filesize by subsectioning into smaller maps?

Ok, I'll try to explain what I mean. I have certain bathymetry rasters that
are pretty unusual in shape - see attached - this map pretty much maximizes
the null/data ratio for the given region which completely encloses its
shape. So I have a couple of questions:

The first question is to clarify my understanding of how nulls are
represented in GRASS rasters.

1) Does an increase in the number of null cells within a region increase the
raster's file size on disk? If I increase the region 2x in both N-S and E-W
dimensions, but add no valid data, only null values, I'm getting a bigger
raster regardless, correct?

2) If (1) is true, then can a raster such as the one in the attached image
be made smaller by tiling together a number of smaller regions that only
cover the areas with data? Or will the patched product of these smaller maps
only approximate (very closely) the total size of the original raster?

So, to sum up in one sentence, I want to make a big diagonally-shaped raster
smaller by cheating and making a bunch of smaller maps that cover data areas
only (and consequently have low null content) then patching the smaller ones
together. But I'm worried that this whole process just puts Humpty back
together again and I'm left with the same ratio of nulls to data as I started
with.

I actually just developed a script that does this quite nicely already - the
user enters the number of rows and columns to section a raster into, then it
iterates through the raster, moving an r.mapcalc window through the
sectioned grid making a new map for each tile. A cool option I threw in
was to accept a threshold percentage of null values that a candidate map can
have; if the region inside the current window has more nulls than the cutoff
threshold, then this section is not exported, and the window moves on. But
if I can't get any savings in filesize by rejecting mostly null tiles, then
I'm not sure how useful this feature might be.
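
The tiling logic described above can be sketched roughly as follows. This is
not the actual script (which moves an r.mapcalc window through a GRASS
region); it is a plain-Python illustration in which nested lists stand in for
the raster, None stands in for a null cell, and the names split_into_tiles
and max_null_fraction are hypothetical.

```python
# Illustrative sketch only: nested lists stand in for a raster and
# None stands in for a null cell. All names here are hypothetical.

def split_into_tiles(grid, tile_rows, tile_cols, max_null_fraction):
    """Yield (tile_row, tile_col, tile) for tiles at or under the null cutoff."""
    n_rows, n_cols = len(grid), len(grid[0])
    # Ceiling division so the tiles cover the whole grid;
    # tiles on the bottom/right edge may be smaller.
    height = -(-n_rows // tile_rows)
    width = -(-n_cols // tile_cols)
    for i in range(tile_rows):
        for j in range(tile_cols):
            tile = [row[j * width:(j + 1) * width]
                    for row in grid[i * height:(i + 1) * height]]
            cells = [c for row in tile for c in row]
            if not cells:
                continue  # nothing left to cover in this window
            null_fraction = cells.count(None) / len(cells)
            # Skip the tile only if it has MORE nulls than the cutoff.
            if null_fraction <= max_null_fraction:
                yield i, j, tile


# A small diagonal "raster", similar in spirit to the attached map.
grid = [
    [1.0, None, None, None],
    [None, 2.0, None, None],
    [None, None, 3.0, None],
    [None, None, None, 4.0],
]
kept = list(split_into_tiles(grid, 2, 2, 0.5))
kept_cells = sum(len(row) for _, _, tile in kept for row in tile)
print([(i, j) for i, j, _ in kept], kept_cells)
```

With this 4x4 diagonal example and a 50% cutoff, only the two tiles on the
diagonal are kept, so 8 cells are stored instead of 16.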

Do you guys think that this would be useful functionality for the Add-ons?
If so, I can post it up there for feedback/criticisms.

~ Eric.

(attachments)

Bathy_example.png

Patton, Eric wrote:

Ok, I'll try to explain what I mean. I have certain bathymetry rasters that
are pretty unusual in shape - see attached - this map pretty much maximizes
the null/data ratio for the given region which completely encloses its
shape. So I have a couple of questions:

The first question is to clarify my understanding of how nulls are
represented in GRASS rasters.

1) Does an increase in the number of null cells within a region increase the
raster's file size on disk? If I increase the region 2x in both N-S and E-W
dimensions, but add no valid data, only null values, I'm getting a bigger
raster regardless, correct?

Correct, although increases in horizontal resolution won't necessarily
produce proportional increases in file size, due to compression.
Increases in vertical resolution will.
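
A rough illustration of the point about compression, assuming each row is
compressed on its own. This does not read or write the GRASS on-disk format:
zlib on packed floats is used purely as a stand-in, NaN plays the role of a
null cell, and compressed_size is a hypothetical helper.

```python
import struct
import zlib

NULL = float("nan")  # stand-in for a null cell in this illustration only


def compressed_size(n_rows, n_cols):
    """Total bytes when every all-null row is zlib-compressed separately."""
    row = struct.pack(f"{n_cols}f", *([NULL] * n_cols))
    return n_rows * len(zlib.compress(row))


base = compressed_size(100, 100)
wider = compressed_size(100, 200)   # twice as many columns
taller = compressed_size(200, 100)  # twice as many rows
print(base, wider, taller)
```

Under these assumptions, doubling the number of rows doubles the total, while
doubling the number of columns costs less than double, because a longer run
of identical nulls within a row compresses almost as well as a shorter one.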

2) If (1) is true, then can a raster such as the one in the attached image
be made smaller by tiling together a number of smaller regions that only
cover the areas with data? Or will the patched product of these smaller maps
only approximate (very closely) the total size of the original raster?

So, to sum up in one sentence, I want to make a big diagonally-shaped raster
smaller by cheating and making a bunch of smaller maps that cover data areas
only (and consequently have low null content) then patching the smaller ones
together. But I'm worried that this whole process just puts Humpty back
together again and I'm left with the same ratio of nulls to data as I started
with.

Your understanding is correct.

--
Glynn Clements <glynn@gclements.plus.com>

Ok, I'll try to explain what I mean. I have certain bathymetry rasters
that are pretty unusual in shape - see attached - this map pretty much
maximizes the null/data ratio for the given region which completely
encloses its shape. So I have a couple of questions:

Note that CELL maps will be smaller than FCELL maps, which will be smaller
than DCELL maps (i.e. int, float, double).

Glynn may correct me here, but IIRC, CELL maps with negative values will
be a fair size bigger than CELL maps without negative values, which is
something to consider when storing bathymetry data. By multiplying by -1
in r.mapcalc you may make the stored map size much smaller. This isn't
the case for FCELL & DCELL.

I actually just developed a script that does this quite nicely already
- the user enters the number of rows and columns to section a raster
into, then it iterates through the raster, moving an r.mapcalc window
through the sectioned grid making a new map for each tile. A cool
option I threw in was to accept a threshold percentage of null
values that a candidate map can have; if the region inside the current
window has more nulls than the cutoff threshold, then this section is not
exported, and the window moves on. But if I can't get any savings in
filesize by rejecting mostly null tiles, then I'm not sure how useful
this feature might be.

alt: write a script to save the region details (write a world file),
run r.to.vect on the cells, save only the non-null cells, and maybe export
to a text file for gzipping. Reverse the steps to recover the raster.
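
The idea of keeping only the non-null cells and reversing the process later
could look something like the sketch below. It is an assumption-laden
stand-in, not the GRASS workflow itself: a nested list replaces the raster,
None replaces null, a one-line header replaces the world file, and
export_sparse / import_sparse are hypothetical names.

```python
import gzip


def export_sparse(grid):
    """Serialize only non-null cells as 'row col value' lines, then gzip."""
    n_rows, n_cols = len(grid), len(grid[0])
    lines = [f"{n_rows} {n_cols}"]  # header plays the role of the saved region
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value is not None:
                # repr() keeps enough digits for the float to round-trip.
                lines.append(f"{r} {c} {value!r}")
    return gzip.compress("\n".join(lines).encode("ascii"))


def import_sparse(blob):
    """Rebuild the full grid, filling every unlisted cell with None."""
    lines = gzip.decompress(blob).decode("ascii").split("\n")
    n_rows, n_cols = (int(x) for x in lines[0].split())
    grid = [[None] * n_cols for _ in range(n_rows)]
    for line in lines[1:]:
        r, c, value = line.split()
        grid[int(r)][int(c)] = float(value)
    return grid


grid = [
    [-12.5, None, None],
    [None, -30.25, None],
    [None, None, -47.0],
]
blob = export_sparse(grid)
restored = import_sparse(blob)
print(restored == grid)
```

The stored size then depends on the number of data cells rather than on the
size of the enclosing region, at the cost of an export and import step.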

alt2: r.out.ascii or r.out.bin, then bzip2 the output.

It's just a matter of what's the least work, I think.

Do you guys think that this would be useful functionality for the
Add-ons?

That's your call. If it works and is useful for you, it'll probably be
useful to someone else out there as well...

Hamish