[GRASS-dev] Significant r.in.gdal speed-up: predefined 300 MiB as default cache size

Hi,

I have made a modification to r.in.gdal for a (significant) speedup
(both in trunk and relbranch70).

A nice test case is the European 25m elevation model, which is a 23GB
GeoTIFF file of 48 billion raster cells (240000 x 200000):

gdalinfo /geodata/eudem_dem_3035_europe.tif
Driver: GTiff/GeoTIFF
Files: /geodata/eudem_dem_3035_europe.tif
Size is 240000, 200000
Coordinate System is:
PROJCS["ETRS89 / LAEA Europe",
...
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    AUTHORITY["EPSG","3035"]]
...
Pixel Size = (25.000000000000000,-25.000000000000000)
...

Previously:
# using default GDAL cache (~40MB, which is tiny also for gdalwarp etc!)
GRASS 7.0.0svn (eu_laea):~ > time -p r.in.gdal \
  /geodata/eudem_dem_3035_europe.tif \
  output=eudem_dem_3035_europe
100%
Raster map <eudem_dem_3035_europe> created.
r.in.gdal complete.
real 279901.23
user 267876.52
sys 1456.18
--> 77h

New:
# I have now defined 300MB as default cache size (i.e. memory=300)
GRASS 7.0.0svn (eu_laea):~ > time -p r.in.gdal \
  /geodata/eudem_dem_3035_europe.tif \
  output=eu_dem_25m
100%
Raster map <eu_dem_25m> created.
r.in.gdal complete.
real 5381.95
user 5091.27
sys 31.03
--> 1:30h

Users can still set a different cache size via the memory option, as
before. Most users probably didn't know about this huge difference,
which is why I set 300MB as the default cache size (rather than keeping
the original tiny GDAL setting [1]).
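
For anyone scripting imports, here is a minimal sketch of passing a cache size explicitly through the memory option. The helper name build_import_command and the 1000 MB value are only illustrative assumptions, not part of r.in.gdal or of the change described above; the sketch just assembles the argument list and does not run GRASS:

```python
import shlex


def build_import_command(input_path, output_name, memory_mb=300):
    """Return an r.in.gdal argument list with an explicit cache size in MB.

    Hypothetical helper for illustration; 300 mirrors the new default
    mentioned in this mail.
    """
    if memory_mb <= 0:
        raise ValueError("memory_mb must be positive")
    return [
        "r.in.gdal",
        f"input={input_path}",
        f"output={output_name}",
        f"memory={memory_mb}",
    ]


# Example: request a larger cache than the default (value chosen arbitrarily).
cmd = build_import_command(
    "/geodata/eudem_dem_3035_europe.tif", "eu_dem_25m", memory_mb=1000
)
print(shlex.join(cmd))
```

Inside a running GRASS session the resulting list could be handed to subprocess.run(cmd, check=True); which cache size is actually best will depend on the available RAM and the data.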

If I am not wrong, the import took only about 2% of the previous time
for this big file (roughly 52 times faster).
Perhaps the GDAL project should reconsider their default cache size as well :-)
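
To double-check the numbers, a small sketch of the arithmetic using the two "real" timings reported above (the variable names are mine):

```python
# Wall-clock ("real") seconds taken from the two time -p runs in this mail.
old_real = 279901.23  # default GDAL cache
new_real = 5381.95    # memory=300 default

fraction = new_real / old_real  # share of the old run time still needed
speedup = old_real / new_real   # how many times faster the new run is

print(f"old: {old_real / 3600:.1f} h, new: {new_real / 60:.1f} min")
print(f"new/old = {fraction:.1%}, speedup = {speedup:.0f}x")
```

This is a single measurement on one machine and one file, so it should be read as an indication rather than a benchmark.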

enjoy,
Markus

[1] http://trac.osgeo.org/gdal/wiki/ConfigOptions#GDAL_CACHEMAX

WoW ! Thanks Markus !

On 30 July 2014 21:42, Markus Neteler <neteler@osgeo.org> wrote:

[...]