#2349: CELL raster format: make ZLIB level 3 standard compression instead of RLE
-------------------------+--------------------------------------------------
Reporter: neteler | Owner: grass-dev@…
Type: enhancement | Status: new
Priority: critical | Milestone: 7.0.0
Component: Raster | Version: svn-releasebranch70
Keywords: | Platform: All
Cpu: Unspecified |
-------------------------+--------------------------------------------------
At present, integer maps (CELL) are still compressed with RLE.
This wastes a large amount of disk space when it comes to large
datasets.
Proposal: make ZLIB, level 3 the standard compression.
At present we can enable zlib via the environment variable GRASS_INT_ZLIB,
but it will use the default ZLIB level 6 compression, which
is too CPU intensive. So (user) control over the level is important.
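
To get a feel for the size side of the level trade-off, a minimal sketch using Python's standard zlib module is shown here. The row content is synthetic and only stands in for CELL data; `compressed_sizes` is a hypothetical helper, not part of GRASS, and it does not measure CPU time.

```python
import zlib


def compressed_sizes(data: bytes, levels=(1, 3, 6, 9)) -> dict:
    """Return {level: compressed size in bytes} for each zlib level."""
    return {level: len(zlib.compress(data, level)) for level in levels}


# Synthetic stand-in for one raster row: 4-byte big-endian integers
# with long runs of equal values followed by a ramp.
values = [100] * 500 + [250] * 300 + list(range(200))
row = b"".join(v.to_bytes(4, "big") for v in values)

sizes = compressed_sizes(row)
for level, size in sorted(sizes.items()):
    print(f"level {level}: {size} of {len(row)} bytes")
```

Real maps will behave differently, so any choice of default level should be checked against representative data.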
BTW: Manual of r.compress updated in r60814, needs to be backported.
Comment(by glynn):
Replying to [ticket:2349 neteler]:
> At time, integer maps (CELL) are still compressed with RLE
> This leads to a huge waste of disk space when it comes to large
> data.
>
> Proposal: make ZLIB, level 3 the standard compression.
Is GRASS_INT_ZLIB support now old enough that it can be taken for granted?
> At time we can enable the environment variable GRASS_INT_ZLIB
> but it will use the default ZLIB level 6 compression which
> is too CPU intensive. So a (user) control over this is important.
The current behaviour is that setting GRASS_INT_ZLIB to anything (even an
empty string) will enable zlib compression at the hard-coded level. One
option is to parse the value as an integer and use the result as the
compression level. However, it's possible that people are currently using
e.g. GRASS_INT_ZLIB=1 to enable it with the existing default level.
Another option is to add another environment variable for the level.
Aside: if there are still systems out there using the historical limit of
4096 bytes of memory for the combination of environment variables and
arguments (argv), we might want to think about making GRASS less greedy
with respect to environment variables.
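
As a rough way to see how much of such a limit the environment alone consumes, here is a small sketch. It assumes each entry is stored as `NAME=VALUE` plus a terminating NUL byte and ignores the pointer arrays and argv; `environment_bytes` is a hypothetical helper.

```python
import os


def environment_bytes(env=None) -> int:
    """Approximate bytes used by environment strings (NAME=VALUE + NUL each)."""
    env = os.environ if env is None else env
    # +2 accounts for the '=' separator and the terminating NUL byte.
    return sum(len(os.fsencode(k)) + len(os.fsencode(v)) + 2
               for k, v in env.items())


print(environment_bytes(), "bytes in the current environment")
```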
Comment(by neteler):
Replying to [comment:2 glynn]:
> Replying to [ticket:2349 neteler]:
> > At time, integer maps (CELL) are still compressed with RLE
> > This leads to a huge waste of disk space when it comes to large
> > data.
> >
> > Proposal: make ZLIB, level 3 the standard compression.
>
> Is GRASS_INT_ZLIB support now old enough that it can be taken for
> granted?
I hope so. I am not aware of any negative reports.
> > At time we can enable the environment variable GRASS_INT_ZLIB
> > but it will use the default ZLIB level 6 compression which
> > is too CPU intensive. So a (user) control over this is important.
>
> The current behaviour is that setting GRASS_INT_ZLIB to anything (even
> an empty string) will enable zlib compression at the hard-coded level.
Exactly.
> One option is to parse the value as an integer and use the result as the
> compression level. However, it's possible that people are currently using
> e.g. GRASS_INT_ZLIB=1 to enable it with the existing default level.
>
> Another option is to add another environment variable for the level.
Yes, a new GRASS_ZLIBLEVEL may be less invasive.
> Aside: if there are still systems out there using the historical limit
> of 4096 bytes of memory for the combination of environment variables and
> arguments (argv), we might want to think about making GRASS less greedy
> with respect to environment variables.
Comment(by glynn):
Replying to [comment:3 neteler]:
> > Aside: if there are still systems out there using the historical limit
> > of 4096 bytes of memory for the combination of environment variables and
> > arguments (argv), we might want to think about making GRASS less greedy
> > with respect to environment variables.
>
> You mean the number and/or the length?
Comment(by glynn):
Replying to [ticket:2349 neteler]:
> Proposal: make ZLIB, level 3 the standard compression.
r61380 implements the following behaviour:
* zlib compression is the default. Set GRASS_INT_ZLIB=0 to use RLE
compression.
* The compression level can be set via GRASS_ZLIB_LEVEL, whose value
should be an integer between 0 and 9. If not set (or if the value cannot
be parsed as an integer), zlib's default compression level will be used
(lib/gis/flate.c:333, if a different default is preferred).
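
The behaviour described above can be sketched as follows. This is an illustration in Python, not the C code from r61380. It assumes that only the literal value "0" selects RLE and that an out-of-range level falls back to zlib's default; neither detail is confirmed by the description. `cell_compression_settings` is a hypothetical name.

```python
import os
import zlib


def cell_compression_settings(env=None):
    """Return (use_zlib, level) following the behaviour described for r61380."""
    env = os.environ if env is None else env

    # Assumption: zlib is the default; only GRASS_INT_ZLIB=0 selects RLE.
    use_zlib = env.get("GRASS_INT_ZLIB") != "0"

    # Unset or unparsable values fall back to zlib's default level.
    level = zlib.Z_DEFAULT_COMPRESSION
    try:
        parsed = int(env.get("GRASS_ZLIB_LEVEL", ""))
    except ValueError:
        parsed = None

    # Assumption: values outside 0..9 are also treated as "use the default".
    if parsed is not None and 0 <= parsed <= 9:
        level = parsed
    return use_zlib, level


print(cell_compression_settings({"GRASS_ZLIB_LEVEL": "3"}))
```

Passing the environment as a mapping keeps the function easy to check without touching the real process environment.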
#2349: CELL raster format: make ZLIB level 3 standard compression instead of RLE
-------------------------+--------------------------------------------------
Reporter: neteler | Owner: grass-dev@…
Type: enhancement | Status: new
Priority: blocker | Milestone: 7.0.0
Component: Raster | Version: svn-releasebranch70
Keywords: compression | Platform: All
Cpu: Unspecified |
-------------------------+--------------------------------------------------
Comment(by neteler):
Replying to [comment:5 glynn]:
> * The compression level can be set via GRASS_ZLIB_LEVEL, whose value
> should be an integer between 0 and 9. If not set (or if the value cannot
> be parsed as an integer), zlib's default compression level will be used
> (lib/gis/flate.c:333, if a different default is preferred).
In relbranch7 there is currently:
Z_DEFAULT_COMPRESSION = 1 - "gives the best compromise between speed and
compression" as per r61424.
Perhaps zlib compression 1 should be adopted for trunk?
> Question: Would it be possible to compress also the null files, even
> with just a weak compression?
Being uncompressed, the null files don't contain an index. The offset to
the beginning of a given row is obtained by multiplying the row number by
the number of bytes per row (which is just the number of columns divided
by 8, rounded upwards).
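
The offset rule just described can be written out as a short sketch. It assumes one bit per cell with no file header; the function names are hypothetical.

```python
def null_row_bytes(cols: int) -> int:
    """Bytes per row of the null bitmap: columns divided by 8, rounded up."""
    return (cols + 7) // 8


def null_row_offset(row: int, cols: int) -> int:
    """Byte offset of a 0-based row in an uncompressed null file."""
    return row * null_row_bytes(cols)


# Example: a 20-column map needs 3 bytes per row, so row 3 starts at byte 9.
print(null_row_bytes(20), null_row_offset(3, 20))
```

Because every row has the same length, no index is needed; a compressed format loses this property, which is why it would need one.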
The main issue is likely to be the need to support both formats. We need to
* Be able to read and write the uncompressed format, for compatibility
with existing versions of GRASS.
* Be able to distinguish between compressed and uncompressed formats on
read.
* Provide some mechanism (i.e. yet another environment variable) to
indicate which format to use on write.
Back to this topic: here on our system I found > 1.7TB of NULL files in a
single location, all uncompressed.
What about having a "null2" file which is compressed and has an index? If
present, use it; otherwise fall back to the well-known uncompressed null
file format. For backward compatibility, r.null could be extended to convert
from compressed null2 to uncompressed null (similar to v.build for the new
spatial index in G7).
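
The proposed lookup order can be sketched as follows. The file names "null2" and "null" come from the proposal above; the directory layout, the helper `pick_null_file`, and the file contents are assumptions for illustration only.

```python
import tempfile
from pathlib import Path


def pick_null_file(map_dir):
    """Return (path, is_compressed) for the null file to read, or None."""
    map_dir = Path(map_dir)
    compressed = map_dir / "null2"   # proposed compressed format with index
    if compressed.is_file():
        return compressed, True
    plain = map_dir / "null"         # existing uncompressed format
    if plain.is_file():
        return plain, False
    return None


# Demonstration in a temporary directory standing in for a map's directory.
with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / "null").write_bytes(b"\xff")
    legacy_choice = pick_null_file(d)[1]   # only "null" exists
    (d / "null2").write_bytes(b"")
    new_choice = pick_null_file(d)[1]      # "null2" now takes precedence

print(legacy_choice, new_choice)
```

Older versions that only know the "null" file would ignore "null2", which is the migration concern discussed in this thread.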
#2349: CELL raster format: make ZLIB level 3 standard compression instead of RLE
--------------------------+-------------------------------------------
Reporter: neteler | Owner: grass-dev@…
Type: enhancement | Status: new
Priority: critical | Milestone: 7.1.0
Component: Raster | Version: svn-trunk
Resolution: | Keywords: compression, r.compress, null
CPU: Unspecified | Platform: All
--------------------------+-------------------------------------------
Comment (by glynn):
Replying to [comment:15 neteler]:
> back to this topic: Here on our system I found > 1.7TB of NULL files in
> a single location, all uncompressed.
How large are the null files compared to the cell/fcell files?
> What about having a "null2" file which is compressed and with index. If
> present, fine, otherwise use the uncompressed well known null file format?
That's probably not a great deal of work, but as with any such change, we
need to consider the migration strategy. If we just start creating
compressed null files, mapsets will cease to be usable with older
versions.
Comment (by neteler):
Replying to [comment:16 glynn]:
> How large are the null files compared to the cell/fcell files?
* With MODIS LST data, the null files are between 1.7 and 7.6 times
larger than the cell files (we store the LST maps in deg C * 100 as
integer to save disk space).
* With 100k random points, the null file is 7.1 times larger than the
fcell map
* With the EU 25m DEM, the null file is much smaller than the derived
aspect map (17% of the DEM fcell file)
> > What about having a "null2" file which is compressed and with index.
> > If present, fine, otherwise use the uncompressed well known null file
> > format?
>
> That's probably not a great deal of work, but as with any such change,
> we need to consider the migration strategy. If we just start creating
> compressed null files, mapsets will cease to be usable with older
> versions.
Right, but this could be covered with an addon/new script in G6 and earlier
(as v.build does for vector data).