[GRASS-dev] Testing the new ZSTD raster compression - was: Re: GRASS GIS raster files: LZW compression?

Hi,

(subject renamed for clarity, original thread:
part 1: https://lists.osgeo.org/pipermail/grass-dev/2017-October/thread.html#86395
part 2: https://lists.osgeo.org/pipermail/grass-dev/2017-December/086738.html
)

Here is the benchmark with the global SRTM 30 m dataset, about 541 billion pixels (run on an SSD):

Welcome to GRASS GIS 7.5.svn (r71892)
GRASS 7.5.svn (latlong):~ >

export GRASS_COMPRESSOR=ZSTD
time -p r.in.gdal /vsicurl/https://www.datenatlas.de/geodata/public/srtmgl1/srtmgl1.003.tif output=srtmgl1_v003_30m_ZSTD memory=3000
360 degree EW extent is exceeded by 0.0395247 cells
360 degree EW extent is exceeded by 1 cells
Importing raster map <srtmgl1_v003_30m_ZSTD>…
100%
real 17697.11
user 17149.13
sys 195.68

g.region raster=srtmgl1_v003_30m_ZSTD -p
360 degree EW extent is exceeded by 0.0395247 cells
360 degree EW extent is exceeded by 1 cells
360 degree EW extent is exceeded by 1 cells
360 degree EW extent is exceeded by 1 cells
projection: 3 (Latitude-Longitude)
zone: 0
datum: wgs84
ellipsoid: wgs84
north: 60:00:00.5N
south: 56:00:00.5S
west: 180:00:00.5W
east: 180:00:00.5E
nsres: 0:00:01
ewres: 0:00:01
rows: 417601
cols: 1296001
cells: 541211313601
360 degree EW extent is exceeded by 1 cells

r.colors srtmgl1_v003_30m_ZSTD color=srtm_plus
Color table for raster map <srtmgl1_v003_30m_ZSTD> set to ‘srtm_plus’

r.compress -p srtmgl1_v003_30m_ZSTD
360 degree EW extent is exceeded by 1 cells
<srtmgl1_v003_30m_ZSTD> is compressed (method 5: ZSTD). Data type: CELL
<srtmgl1_v003_30m_ZSTD> has a compressed NULL file

File size comparison:

ZLIB compressed, human readable:

mundialis:~/grassdata/latlong/srtmgl1_30m$ find . -name srtmgl1_v003_30m | sort | xargs du -h
4,0K ./cats/srtmgl1_v003_30m
4,0K ./cellhd/srtmgl1_v003_30m
293M ./cell_misc/srtmgl1_v003_30m
157G ./cell/srtmgl1_v003_30m
4,0K ./colr/srtmgl1_v003_30m
4,0K ./hist/srtmgl1_v003_30m

ZLIB compressed, kB:

mundialis:~/grassdata/latlong/srtmgl1_30m$ find . -name srtmgl1_v003_30m | sort | xargs du -k
4 ./cats/srtmgl1_v003_30m
4 ./cellhd/srtmgl1_v003_30m
299696 ./cell_misc/srtmgl1_v003_30m
163988608 ./cell/srtmgl1_v003_30m
4 ./colr/srtmgl1_v003_30m
4 ./hist/srtmgl1_v003_30m

ls -la cell_misc/srtmgl1_v003_30m/
total 299696
-rw-rw-r-- 1 mundialis mundialis 306883360 Okt 25 21:19 nullcmpr
-rw-rw-r-- 1 mundialis mundialis 13 Okt 25 21:19 range

ZSTD compressed, kB:

mundialis:/scratch/grassdata/latlong/srtmgl1_30m$ find . -name srtmgl1_v003_30m_ZSTD | sort | xargs du -k
4 ./cats/srtmgl1_v003_30m_ZSTD
4 ./cellhd/srtmgl1_v003_30m_ZSTD
299708 ./cell_misc/srtmgl1_v003_30m_ZSTD
145278592 ./cell/srtmgl1_v003_30m_ZSTD
4 ./colr/srtmgl1_v003_30m_ZSTD
4 ./hist/srtmgl1_v003_30m_ZSTD

ls -la cell_misc/srtmgl1_v003_30m_ZSTD/
total 299704
-rw-rw-r-- 1 mundialis mundialis 306883360 Dez 5 21:40 nullcmpr
-rw-rw-r-- 1 mundialis mundialis 22 Dez 5 21:40 stats
-rw-rw-r-- 1 mundialis mundialis 13 Dez 5 21:40 range

Ratio:

CELL file

145278592 / 163988608
[1] 0.8859066

null file

identical
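
For readers who want to reproduce the ratio, here is a minimal sketch in Python using only the `du -k` figures listed above. The variable names are illustrative, and the conversion to GiB assumes that `du -k` counts in units of 1024 bytes.

```python
# Sizes in kB as reported by `du -k` for the two cell files above.
zlib_kb = 163_988_608  # ./cell/srtmgl1_v003_30m (ZLIB)
zstd_kb = 145_278_592  # ./cell/srtmgl1_v003_30m_ZSTD (ZSTD)

# Size of the ZSTD file relative to the ZLIB file.
ratio = zstd_kb / zlib_kb

# Relative and absolute savings; the GiB figure assumes 1 kB = 1024 bytes.
saving_percent = (1 - ratio) * 100
saved_gib = (zlib_kb - zstd_kb) / 1024**2

print(f"ZSTD/ZLIB size ratio: {ratio:.7f}")
print(f"space saved: {saving_percent:.1f} % ({saved_gib:.1f} GiB)")
```

This only compares file sizes; it says nothing about read or write speed, which the import timing above covers for the ZSTD case only.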

Curiosity: would it make sense to compress the “nullcmpr” file with the selected compression method? Markus M has probably explained this already, but I don’t remember…

Best,
markusN


Markus Neteler, PhD
http://www.mundialis.de - free data with free software
http://grass.osgeo.org
http://courses.neteler.org/blog

On Wed, Dec 6, 2017 at 10:41 AM, Markus Neteler <neteler@osgeo.org> wrote:

[…]

null file

identical

Curiosity: would it make sense to compress the “nullcmpr” file with the selected compression method? Markus M has probably explained this already, but I don’t remember…

In short, it’s a waste of time.

Null files are very small with one bit per cell. Small data are difficult to compress, and using a stronger compression method might actually produce a larger, not a smaller output, and it takes longer. Regarding null file compression, LZ4 is not only the fastest method, it is also regularly the method with the best compression ratio.

Null files and [f]cell files have different characteristics regarding compression, therefore one compression method can perform best for cell values and another one might perform best for null file compression. You would need to be able to choose separate compressors for the actual data and the null file in order to achieve maximum compression or the best compromise between speed and compression. That’s too complicated. Most of the time LZ4 is a good if not the best choice, therefore null file compression is fixed to LZ4.
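
To put "one bit per cell" into numbers for the SRTM map benchmarked earlier in this thread, here is a rough Python sketch. It takes the region and file sizes from the listings above. The row-wise layout with each row padded to whole bytes is an assumption about how the NULL bitmap is stored, not something stated in this thread.

```python
# Region of the imported map, from the g.region output.
rows, cols = 417_601, 1_296_001
cells = rows * cols

# Assumption: the NULL bitmap is stored row by row, one bit per cell,
# with each row padded to a whole number of bytes.
bytes_per_row = (cols + 7) // 8
null_bitmap_bytes = rows * bytes_per_row

# Size of the compressed NULL file (nullcmpr) as listed by `ls -la`.
nullcmpr_bytes = 306_883_360

# Size of the ZSTD cell file from `du -k`, assuming 1 kB = 1024 bytes.
cell_bytes = 145_278_592 * 1024

print(f"cells: {cells}")
print(f"uncompressed NULL bitmap: {null_bitmap_bytes / 1024**3:.1f} GiB")
print(f"nullcmpr / uncompressed bitmap: {nullcmpr_bytes / null_bitmap_bytes:.4f}")
print(f"nullcmpr / cell file: {nullcmpr_bytes / cell_bytes:.4f}")
```

The last figure is the relevant one for the question: the compressed NULL file is well under one percent of the cell data, so a different compressor for it could change the total size only marginally.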

Markus M



grass-dev mailing list
grass-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-dev

Markus Neteler wrote

[…]

After a little testing, I succeeded in compiling winGRASS with ZSTD support ;-)

Fri Dec 8 19:10:32 CET 2017: STARTING dll.to.a
[...]
/c/OSGeo4W64/bin/libzstd.dll => mswindows/osgeo4w/lib/libzstd
* [C:/OSGeo4W64/bin/libzstd.dll] Found PE+ image
[...]
checking host system type... x86_64-w64-mingw32
checking for gcc... gcc
[...]
checking whether to use zstd... yes
checking for location of zstd includes...
checking for zstd.h... yes
checking for location of zstd library...
checking for ZSTD_compress in -lzstd... yes
[...]
GRASS is now configured for: x86_64-w64-mingw32

  Source directory: /usr/src/grass_trunk
  Build directory: /usr/src/grass_trunk
  Installation directory: ${prefix}/grass-7.5.svn
  Startup script in directory:/c/OSGeo4W64/bin
  C compiler: gcc -g -O2
  C++ compiler: c++ -g -O2
  Building shared libraries: yes
  OpenGL platform: Windows

  MacOSX application: no
  MacOSX architectures:
  MacOSX SDK:

  BLAS support: no
  BZIP2 support: yes
  C++ support: yes
  Cairo support: yes
  DWG support: no
  FFTW support: yes
  FreeType support: yes
  GDAL support: yes
  GEOS support: yes
  LAPACK support: no
  Large File support (LFS): yes
  libLAS support: yes
  MySQL support: no
  NetCDF support: no
  NLS support: yes
  ODBC support: yes
  OGR support: yes
  OpenCL support: no
  OpenGL support: yes
  OpenMP support: no
  PDAL support: no
  PNG support: yes
  POSIX thread support: no
  PostgreSQL support: yes
  Readline support: no
  Regex support: yes
  SQLite support: yes
  TIFF support: yes
  X11 support: no
  Zstandard support: yes <=

:-)
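
When checking many build logs, the configure summary above can be scanned automatically. The following is a small Python sketch, not part of GRASS: `parse_configure_summary` is a hypothetical helper, and it assumes only the "Name support: yes/no" line format visible in the summary.

```python
import re

# Matches lines such as "  Zstandard support: yes <=" or
# "  Large File support (LFS): yes" from the configure summary.
SUPPORT_LINE = re.compile(r"^\s*(.+?) support[^:]*:\s*(yes|no)\b")


def parse_configure_summary(text):
    """Return a dict mapping feature names to True (yes) or False (no)."""
    features = {}
    for line in text.splitlines():
        match = SUPPORT_LINE.match(line)
        if match:
            features[match.group(1)] = match.group(2) == "yes"
    return features


# A few lines copied from the summary above.
sample = """\
  BZIP2 support: yes
  Large File support (LFS): yes
  OpenMP support: no
  Zstandard support: yes <=
"""

features = parse_configure_summary(sample)
print(features)
```

A build script could then stop early if `features.get("Zstandard")` is not true.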

Steps:

- compiled zstd-1.3.2 in MSYS2 (alternatively, the precompiled zstd
binaries from the zstd project could be used)

- manually added the zstd exe, dll, and header to the OSGeo4W tree

- adapted grass_trunk\mswindows\osgeo4w\package.sh

That's all.

All that is still needed is to include zstd in the OSGeo4W tree. I will open
a ticket there.

-----
best regards
Helmut
--
Sent from: http://osgeo-org.1560.x6.nabble.com/Grass-Dev-f3991897.html

all what would be needed is to include zstd into the OSGeo4W tree. will open a ticket there.

For the record:

https://lists.osgeo.org/pipermail/osgeo4w-dev/2017-December/003453.html

Thanks to Jürgen F., zstd is now included in OSGeo4W. What remains are the
changes in the GRASS source / winGRASS compile script.

For the record: zstd should now also be enabled for winGRASS by r71912 and
r71913.


https://wingrass.fsv.cvut.cz/grass75/x86/logs/log-r71915-25/package.log

Looks good: zstd is enabled in winGRASS.

-----
best regards
Helmut

* Markus Neteler <neteler@osgeo.org> [2017-12-06 10:41:19 +0100]:

Hi,

(subject renamed for clarity, original thread:
part 1:
https://lists.osgeo.org/pipermail/grass-dev/2017-October/thread.html#86395
part 2:
https://lists.osgeo.org/pipermail/grass-dev/2017-December/086738.html
)

[..]

Welcome to GRASS GIS 7.5.svn (r71892)
GRASS 7.5.svn (latlong):~ >

export GRASS_COMPRESSOR=ZSTD

[..]

I am building a docker image, based on trunk, for which I need to ensure
that ZSTD is the default compressor.

Is there a way to preset this at the configuration step, before
compiling, so that I do not have to take care of it later by setting
GRASS_COMPRESSOR?

Thanks for any kind of comments, Nikos

On Sat, Jun 9, 2018 at 8:26 AM, Nikos Alexandris <nik@nikosalexandris.net> wrote:

[…]
I am building a docker image, based on trunk, for which I need to ensure
that ZSTD is the default compressor.

What is the way to pre-set this at the configuration step, before
compiling, so as to avoid to take care about it later on by setting
GRASS_COMPRESSOR?

ZSTD is already the default compressor if available, no further settings needed.

Markus M
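
If the compressor should nevertheless be pinned explicitly, for example in a container entry point, the GRASS_COMPRESSOR variable used earlier in this thread can be set for child processes. Below is a minimal Python sketch. `grass_env` is a hypothetical helper, and the set of accepted names is an assumption taken from the r.compress documentation as far as I recall it; only ZSTD and ZLIB are confirmed by this thread.

```python
import os

# Assumed list of compressor names understood by GRASS_COMPRESSOR.
KNOWN_COMPRESSORS = {"RLE", "ZLIB", "LZ4", "BZIP2", "ZSTD"}


def grass_env(compressor="ZSTD", base_env=None):
    """Return a copy of the environment with GRASS_COMPRESSOR set."""
    name = compressor.upper()
    if name not in KNOWN_COMPRESSORS:
        raise ValueError(f"unknown compressor: {compressor!r}")
    # Copy so that the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["GRASS_COMPRESSOR"] = name
    return env


env = grass_env("zstd", base_env={"PATH": "/usr/bin"})
print(env["GRASS_COMPRESSOR"])
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching GRASS commands. As stated above, this is not needed when GRASS was built with ZSTD support.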


Nikos:

I am building a docker image, based on trunk, for which I need to ensure
that ZSTD is the default compressor.

What is the way to pre-set this at the configuration step, before
compiling, so as to avoid to take care about it later on by setting
GRASS_COMPRESSOR?

Markus M:

ZSTD is already the default compressor if available, no further settings
needed.

Great!

However, the descriptions of the compressors in the raster introduction for
trunk [0] may be confusing.

ZLIB
ZLIB's deflate is the default compression method for all raster maps.

and

ZSTD
... ZSTD is the recommended default compression method.

The same piece of text is used for `r.compress` [1].
[0] https://grass.osgeo.org/grass75/manuals/rasterintro.html
[1] https://grass.osgeo.org/grass75/manuals/r.compress.html

I realised this before commenting in #3499. Still, I read "default" as ZLIB and "recommended default" (as in: "dear user,
it's best if you set ZSTD as the default") as ZSTD.

Nikos

On Sun, Jun 10, 2018 at 8:44 PM, Nikos Alexandris <nik@nikosalexandris.net> wrote:

[…]

It’s trunk, and a new feature is being tested in trunk, therefore I made it the default compression method if available. I was not sure if ZSTD should become the new default compression if available, but after creating thousands of raster maps with ZSTD compression on different systems, I think it is safe to make ZSTD the new default if available. The rasterintro and the manual for r.compress have been updated accordingly in trunk r72794.

ZSTD is by now widely used by many different projects, including GDAL. Given the performance improvement of ZSTD over ZLIB, it is worth considering whether ZSTD should become a requirement for the next minor release (GRASS 7.6), provided that ZSTD is available on all supported systems.

Markus M


but after creating thousands of raster maps with ZSTD compression on different systems,

I have been testing zstd on Windows for some time now; no problems so far.

-----
best regards
Helmut