[GRASS-dev] G3d library code review and raster lib tile approach

Hi all,
I have created a wiki page about the code review of the G3d library
and related modules:

http://trac.osgeo.org/grass/wiki/Grass7/G3dLib

While browsing the wiki I found a section about the new raster library
using a tile-based approach instead of a row-based one:

http://trac.osgeo.org/grass/wiki/Grass7/RasterLib#Corerasterformat

I have added some comments on those ideas, based on what I learned
about the G3d library during the code review.

I would like to reactivate this topic and have some questions:
* Is there a time frame when the new raster tile approach will be implemented?
* Are there plans to merge the G3d library and the raster library?
* Can the G3d library be used as backend for tile storage in the
raster library (setting the z dimension to one)?

Best regards
Soeren

Soeren Gebbert wrote:

* Is there a time frame when the new raster tile approach will be
implemented?

No. It hasn't even been decided that it will be implemented; it's just
a basis for discussion.

* Are there plans to merge the G3d library and the raster library?

It has been "suggested"; there isn't any consensus, and I'm not sure
that there's even one person who is convinced that it's the right
solution.

* Can the G3d library be used as backend for tile storage in the
raster library (setting the z dimension to one)?

Potentially.

Many of the ideas came up before I implemented r.external and
r.external.out. That led to an alternate possibility, that GRASS
itself wouldn't actually have /any/ particular raster format but just
deal with whatever GDAL supported.

This would require that GDAL had "native" GRASS raster I/O code (i.e.
not depending upon libgis/libraster). The main issue here is that the
code would probably need to be written from scratch due to licensing
issues: GRASS is GPL, GDAL is LGPL, and trying to track down everyone
who might have a copyright interest in the raster I/O code wouldn't be
practical.

--
Glynn Clements <glynn@gclements.plus.com>

Hi Glynn,
thanks for your answer.

* Can the G3d library be used as backend for tile storage in the
raster library (setting the z dimension to one)?

Potentially.

Here is an implementation example I had in mind, using g3d as the raster backend:

I think the Rast_put_row() and Rast_get_row() functions can be
efficiently mapped onto the g3d X-direction tile cache. The
X-direction cache can be set automatically when the corresponding
Rast_open_*() functions are called.

Consider a tile size of 64x64 pixels. An allocated raster buffer will
be filled with

    for col in cols:
        buffer[col] = G3d_getValue(map, col, row, 0)

or its content will be written with

    for col in cols:
        G3d_putValue(map, col, row, 0, buffer[col])
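
The two loops above can be tried out on a toy model. The following is
only a sketch in Python, not the G3d API: the ToyTiledMap class, its
LRU tile cache and the get_row()/put_row() helpers are hypothetical
names made up for illustration. It counts how many tiles have to be
fetched when rows are accessed through per-cell calls.

```python
from collections import OrderedDict

TILE = 64  # tile edge length in cells, as in the example above


class ToyTiledMap:
    """Hypothetical stand-in for a tiled map; this is not the G3d API."""

    def __init__(self, rows, cols, cache_tiles):
        self.rows, self.cols = rows, cols
        self.cache_tiles = cache_tiles
        self.store = {}             # "disk": (tile_row, tile_col) -> flat list
        self.cache = OrderedDict()  # LRU cache of tiles held in memory
        self.loads = 0              # number of tiles fetched into the cache

    def _tile(self, trow, tcol):
        key = (trow, tcol)
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        self.loads += 1
        stored = self.store.get(key)
        tile = list(stored) if stored is not None else [0.0] * (TILE * TILE)
        self.cache[key] = tile
        if len(self.cache) > self.cache_tiles:
            old_key, old_tile = self.cache.popitem(last=False)
            self.store[old_key] = old_tile  # write back on eviction
        return tile

    def get_value(self, col, row):
        tile = self._tile(row // TILE, col // TILE)
        return tile[(row % TILE) * TILE + (col % TILE)]

    def put_value(self, col, row, value):
        tile = self._tile(row // TILE, col // TILE)
        tile[(row % TILE) * TILE + (col % TILE)] = value


def get_row(tmap, row):
    """Fill a row buffer cell by cell, like the first loop above."""
    return [tmap.get_value(col, row) for col in range(tmap.cols)]


def put_row(tmap, row, buffer):
    """Write a row buffer cell by cell, like the second loop above."""
    for col, value in enumerate(buffer):
        tmap.put_value(col, row, value)


# 1000 columns need ceil(1000 / 64) = 16 tiles per tile row.
tmap = ToyTiledMap(rows=128, cols=1000, cache_tiles=16)
for r in range(64):
    put_row(tmap, r, [float(r)] * tmap.cols)
loads_after_band = tmap.loads  # tiles fetched while writing 64 rows
row10 = get_row(tmap, 10)      # served from the cache
loads_after_read = tmap.loads
```

With a cache that holds one full tile row, the 64 raster rows of that
tile row are written and read back without fetching any tile twice; a
smaller cache would reload tiles for every raster row.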

If the tile cache is big enough, reading or writing the first
raster row will read or create all tiles of that tile row in
memory, caching 64 raster rows. For a g3d raster map with 10000
columns the cache must hold 157 tiles, about 5.2 MB; for 100000
columns, about 52 MB. It would be interesting to compare the performance of
this approach with the current raster approach. IMHO the g3d approach
should not be much slower.
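
The cache sizes can be checked with a few lines of Python. This is a
sketch that assumes 8-byte double values per cell (the value size is
not stated above); row_cache_size() is a hypothetical helper name.
Under that assumption the results come out at roughly 5.1 MB and
51 MB, in the same range as the figures quoted above.

```python
import math

TILE = 64        # tile edge length in cells
CELL_BYTES = 8   # assumption: 8-byte double (DCELL) values


def row_cache_size(cols, tile=TILE, cell_bytes=CELL_BYTES):
    """Return (tiles, bytes) needed to cache one full row of tiles."""
    tiles = math.ceil(cols / tile)
    return tiles, tiles * tile * tile * cell_bytes


tiles_10k, bytes_10k = row_cache_size(10_000)
tiles_100k, bytes_100k = row_cache_size(100_000)
```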

Additionally Rast_put_value() and Rast_get_value() can easily be
implemented allowing arbitrary single cell value access.

The drawback is that CELL values are not supported and currently must
be mapped to DCELL values ... well, at least the g3d library
can reduce the mantissa to 32 bits to minimize disk
space usage for compressed tiles.
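
As a side note on the CELL-to-DCELL mapping: a 32-bit integer survives
a round trip through an 8-byte double unchanged, while a 4-byte float
is not always enough. The sketch below uses only Python's struct
module; the helper names are hypothetical and nothing here calls the
g3d library.

```python
import struct

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def cell_to_dcell_bytes(value):
    """Store a 32-bit integer CELL value as an 8-byte double."""
    return struct.pack("<d", float(value))


def dcell_bytes_to_cell(raw):
    """Read the 8-byte double back and convert it to an integer."""
    return int(struct.unpack("<d", raw)[0])


samples = [INT32_MIN, -1, 0, 1, 123456789, INT32_MAX]
round_trip_ok = all(
    dcell_bytes_to_cell(cell_to_dcell_bytes(v)) == v for v in samples
)

# The same value squeezed through a 4-byte float loses its low bits.
via_float32 = int(struct.unpack("<f", struct.pack("<f", 123456789.0))[0])
float32_is_lossy = via_float32 != 123456789
```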

IMHO the g3d tile approach should be considered as a raster backend; it
would require minor or no changes in existing raster modules.

Many of the ideas came up before I implemented r.external and
r.external.out. That led to an alternate possibility, that GRASS
itself wouldn't actually have /any/ particular raster format but just
deal with whatever GDAL supported.

Using a pure gdal approach leads me to many questions:

What would be the default format for reading and writing in a grass mapset?
At which location will the files and the metadata be stored?
Are there modules which need to be modified to use this approach?
How is the performance of this approach compared to the native raster
I/O in GRASS?
Should such a strong external dependency really be considered?

This would require that GDAL had "native" GRASS raster I/O code (i.e.
not depending upon libgis/libraster). The main issue here is that the
code would probably need to be written from scratch due to licensing
issues: GRASS is GPL, GDAL is LGPL, and trying to track down everyone
who might have a copyright interest in the raster I/O code wouldn't be
practical.

Will the benefit of this approach justify this huge effort of writing
the I/O code from scratch?

Best regards
Soeren

Soeren Gebbert wrote:

> Many of the ideas came up before I implemented r.external and
> r.external.out. That led to an alternate possibility, that GRASS
> itself wouldn't actually have /any/ particular raster format but just
> deal with whatever GDAL supported.

Using a pure gdal approach leads me to many questions:

What would be the default format for reading and writing in a grass mapset?

Probably GRASS' native format, for compatibility.

At which location will the files and the metadata be stored?

The metadata is stored in the mapset; only the actual raster data is
read and written via GDAL; linked maps still require metadata files.
The actual data can be stored wherever you want, but I'd suggest a
"gdal" subdirectory in the mapset for non-GRASS (TIFF etc) files.

Are there modules which need to be modified to use this approach?

Only those which perform low-level access, e.g. r.null.

How is the performance of this approach compared to the native raster
I/O in GRASS?

It depends entirely upon the format and the usage. E.g. compressed
formats such as PNG are likely to be inefficient if you try to skip
rows.
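
To illustrate the point about skipping rows, here is a toy model in
Python using zlib. It is not PNG and not the GRASS format: layout A is
one compressed stream for the whole raster, layout B compresses each
row separately and keeps a hypothetical offset index. The helper names
are made up for this sketch.

```python
import zlib

ROWS, COLS = 200, 512
rows = [bytes((r + c) % 256 for c in range(COLS)) for r in range(ROWS)]

# Layout A: a single compressed stream covering all rows.
stream = zlib.compress(b"".join(rows))


def read_row_stream(blob, row):
    """Decompress from the start until the wanted row is available."""
    d = zlib.decompressobj()
    need = (row + 1) * COLS
    out = d.decompress(blob, need)
    while len(out) < need and d.unconsumed_tail:
        out += d.decompress(d.unconsumed_tail, need - len(out))
    return out[row * COLS:], len(out)


# Layout B: each row compressed on its own, plus an offset index.
chunks = [zlib.compress(r) for r in rows]
offsets = [0]
for chunk in chunks:
    offsets.append(offsets[-1] + len(chunk))
indexed = b"".join(chunks)


def read_row_indexed(blob, index, row):
    """Jump straight to the row's chunk and decompress only that."""
    raw = zlib.decompress(blob[index[row]:index[row + 1]])
    return raw, len(raw)


row_a, decompressed_a = read_row_stream(stream, 150)
row_b, decompressed_b = read_row_indexed(indexed, offsets, 150)
```

Both calls return the same row, but the single stream has to
decompress every row before it, whereas the indexed layout
decompresses one row only.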

Should such a strong external dependency really be considered?

GDAL is already almost mandatory.

> This would require that GDAL had "native" GRASS raster I/O code (i.e.
> not depending upon libgis/libraster). The main issue here is that the
> code would probably need to be written from scratch due to licensing
> issues: GRASS is GPL, GDAL is LGPL, and trying to track down everyone
> who might have a copyright interest in the raster I/O code wouldn't be
> practical.

Will the benefit of this approach justify this huge effort of writing
the I/O code from scratch?

It's not a huge effort. Most of it is accounted for by four files:
open.c, get_row.c, put_row.c and close.c. And it's not even all of
those files; the higher-level stuff (e.g. resampling) would remain
within libraster.

It's probably less effort than trying to add yet another raster format
within the existing raster code.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn wrote:

The metadata is stored in the mapset; only the actual
raster data is read and written via GDAL; linked maps still
require metadata files.

(much as shapefiles do, or worldfiles+tiffs do; not that I like
having so many when it comes to distributing them, but just to
note that similar multifile formats have been done before)

> Should such a strong external dependency really be
> considered?

GDAL is already almost mandatory.

the bit I worry about is putting so much of the core functionality
onto a 3rd party sole-supplier over which we have limited control.
probably the main problem from a practical sense is different users
mixing and matching different versions of gdal with different versions
of grass and expecting that all still acts in a consistent way.
saying "GRASS 7.0.3 requires GDAL 2.0.4" kind of defeats the
purpose, although I guess depending on gdal >= 2.0.0 would be
reasonable enough.

2c,
Hamish

On Sun, Jul 3, 2011 at 1:56 AM, Hamish <hamish_b@yahoo.com> wrote:

Glynn wrote:
> On Fri, Jul 1, 2011 at 11:45 PM, Soeren Gebbert <soerengebbert@googlemail.com> wrote:
> > Should such a strong external dependency really be
> > considered?
>
> GDAL is already almost mandatory.

the bit I worry about is putting so much of the core functionality
onto a 3rd party sole-supplier over which we have limited control.

... especially in the light of this change:
http://fwarmerdam.blogspot.com/2011/06/joining-google.html

http://www.spatiallyadjusted.com/2011/06/21/frank-warmerdam-goes-to-google-google-unimpressed-with-our-niche-awesomeness/

Markus

Soeren wrote:

> > > Should such a strong external dependency really be
> > > considered?

Glynn:

> > GDAL is already almost mandatory.

Hamish:

> the bit I worry about is putting so much of the core
> functionality onto a 3rd party sole-supplier over which
> we have limited control.

Markus:

... especially in the light of this change:
http://fwarmerdam.blogspot.com/2011/06/joining-google.html

hmmm.. interesting. well I am not too worried about the future
of GDAL, it has a good community around it. What I wonder though
is who will take the reins of PROJ.4? it's already a bit
under-resourced and if there's less FW to go around, I suppose his
energy would veer towards GDAL first.

Hamish

Hi Glynn,
I hope I don't bother you too much about the GDAL-GRASS raster implementation. :)

How is the performance of this approach compared to the native raster
I/O in GRASS?

It depends entirely upon the format and the usage. E.g. compressed
formats such as PNG are likely to be inefficient if you try to skip
rows.

Please excuse my limited knowledge of GDAL, but wasn't an important
reason for changing the GRASS raster row approach to enable random
access to single values using tiles? Will the GDAL approach provide such
capabilities? What kind of new functions will be available with GDAL?

Should such a strong external dependency really be considered?

GDAL is already almost mandatory.

I know little about the GDAL community, but would it accept
GRASS GIS core functionality becoming part of GDAL? Would GRASS
developers/documenters who need to fix bugs or improve the GRASS
raster library need write access to the GDAL repository? Will
stable GRASS versions depend on GDAL release cycles?

Best regards
Soeren

On Tue, Jul 5, 2011 at 12:03 PM, Soeren Gebbert
<soerengebbert@googlemail.com> wrote:

Glynn wrote:

GDAL is already almost mandatory.

I know little about the GDAL community, but would it accept
GRASS GIS core functionality becoming part of GDAL?

Good question. See also the (un)related
http://trac.osgeo.org/gdal/wiki/rfc34_license_policy

Markus

Soeren Gebbert wrote:

>> How is the performance of this approach compared to the native raster
>> I/O in GRASS?
>
> It depends entirely upon the format and the usage. E.g. compressed
> formats such as PNG are likely to be inefficient if you try to skip
> rows.

Please excuse my limited knowledge of GDAL, but wasn't an important
reason for changing the GRASS raster row approach to enable random
access to single values using tiles? Will the GDAL approach provide such
capabilities? What kind of new functions will be available with GDAL?

GRASS already supports the use of GDAL via r.external[.out]. The only
change would be to separate the existing raster I/O code from GRASS.

This potentially makes it easier to add/change raster formats, as we
wouldn't need to support all of the legacy GRASS raster formats in
addition to any new format; access to the old formats would be via
GDAL.

But GRASS isn't going to "change" to a tiled format. There might be
the possibility of using a tiled format as an alternative to a
row-based format, but the row-by-row API will remain the norm, and
shouldn't suffer a performance hit for the sake of less common cases
(e.g. random access).

>> Should such a strong external dependency really be considered?
>
> GDAL is already almost mandatory.

I know little about the GDAL community, but would it accept
GRASS GIS core functionality becoming part of GDAL?

The GRASS raster format is just another raster format. The fact that
it was specifically designed for GIS isn't really relevant.

Would GRASS
developers/documenters who need to fix bugs or improve the GRASS
raster library need write access to the GDAL repository? Will
stable GRASS versions depend on GDAL release cycles?

It depends. One option is for the GRASS raster I/O to be made into a
self-contained library, analogous to libpng or libjpeg. The GDAL
driver would then interface to this library.
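
The layering could look roughly like the sketch below. Everything in
it is hypothetical: RowRaster stands in for a self-contained row-based
I/O library (open, get row, put row, close) and BlockReader for a
driver-side adapter. Neither mirrors the actual libraster or GDAL
driver interfaces.

```python
class RowRaster:
    """Hypothetical self-contained row-based raster store (in memory)."""

    def __init__(self, rows, cols, fill=0):
        self.rows, self.cols = rows, cols
        self._data = [[fill] * cols for _ in range(rows)]
        self.closed = False

    def get_row(self, row):
        return list(self._data[row])

    def put_row(self, row, values):
        if len(values) != self.cols:
            raise ValueError("row length does not match raster width")
        self._data[row] = list(values)

    def close(self):
        self.closed = True


class BlockReader:
    """Hypothetical adapter serving rectangular windows from row reads."""

    def __init__(self, raster):
        self.raster = raster

    def read_block(self, row_off, col_off, nrows, ncols):
        return [
            self.raster.get_row(r)[col_off:col_off + ncols]
            for r in range(row_off, row_off + nrows)
        ]


raster = RowRaster(rows=4, cols=6)
for r in range(raster.rows):
    raster.put_row(r, [r * 10 + c for c in range(raster.cols)])
block = BlockReader(raster).read_block(row_off=1, col_off=2, nrows=2, ncols=3)
raster.close()
```

The adapter only uses the library's public row calls, so the library
could be maintained and released independently of whatever sits on top
of it.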

--
Glynn Clements <glynn@gclements.plus.com>