[GRASS-dev] Rast_open_update?

Whenever I change a couple of cells in an existing raster map, I have to create a new raster map and patch the old and new maps, which can take a long time because r.patch has to read and write the entire map.

There are Rast_open_old/new, but no Rast_open_update and AFAIK there are no raster modules that directly update existing raster maps. I don’t know why. I thought it would be great if there was Rast_open_update so we can update existing raster maps without creating a temp map and patching it.

Hi Huidae,
As far as I know, the reason why updating raster maps is currently not
supported in GRASS at the module level is the storage format. Raster
maps are usually stored as zlib-compressed rows. Rows are written
sequentially, so if you modify a single row, its size may be
different after compression and it may no longer fit in the position
of the old row in the existing file. To my knowledge there is
currently no way to append rows or mark rows as invalid in existing
raster maps.
Raster maps can also be stored using uncompressed rows, but I don't
recall any raster library function to update such maps.
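
The size problem can be illustrated with a small sketch. It is not GRASS code: it assumes zlib compression and 4-byte integer cells, and WIDTH and compress_row() are made-up names for illustration only.

```python
import zlib

WIDTH = 1000  # cells per row (hypothetical value)


def compress_row(cells):
    """Serialise one row as 4-byte big-endian integers and compress it."""
    raw = b"".join(c.to_bytes(4, "big", signed=True) for c in cells)
    return zlib.compress(raw)


original = [0] * WIDTH
modified = list(original)
modified[10:20] = [7, 13, 42, 99, 5, 1234, 77, 8, 21, 3]  # edit ten cells

old_size = len(compress_row(original))
new_size = len(compress_row(modified))

print(f"compressed size before edit: {old_size} bytes")
print(f"compressed size after edit:  {new_size} bytes")
if new_size > old_size:
    print("the modified row no longer fits in the old slot")
```

Because the rows are stored back to back, a row that grows would overwrite the start of the next one.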

Best regards
Soeren

2014-06-26 15:05 GMT+02:00 Huidae Cho <grass4u@gmail.com>:

Whenever I change a couple of cells in an existing raster map, I have to
create a new raster map and patch the old and new maps, which can take a
long time because r.patch has to read and write the entire map.

There are Rast_open_old/new, but no Rast_open_update and AFAIK there are no
raster modules that directly update existing raster maps. I don't know why.
I thought it would be great if there was Rast_open_update so we can update
existing raster maps without creating a temp map and patching it.

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev

Hi Soeren,

Ah! You’re right.

Cell files have pointers to rows in the header, so maybe we could implement functions that copy multiple rows at a time without decompressing and recompressing them row by row, even if MASK may not be applied properly. This is not a true update, but at least copying could be more efficient than it is now.

Regards,
Huidae

On Thu, Jun 26, 2014 at 9:30 AM, Sören Gebbert <soerengebbert@googlemail.com> wrote:

Hi Huidae,
As far as I know, the reason why updating raster maps is currently not
supported in GRASS at the module level is the storage format. Raster
maps are usually stored as zlib-compressed rows. Rows are written
sequentially, so if you modify a single row, its size may be
different after compression and it may no longer fit in the position
of the old row in the existing file. To my knowledge there is
currently no way to append rows or mark rows as invalid in existing
raster maps.
Raster maps can also be stored using uncompressed rows, but I don’t
recall any raster library function to update such maps.

Best regards
Soeren

Sören Gebbert wrote:

Huidae Cho wrote:

Cell files have pointers to rows in the header, so maybe we could
implement functions that copy multiple rows at a time without
decompressing and recompressing them row by row, even if MASK may not be
applied properly. This is not a true update, but at least copying could
be more efficient than it is now.

In theory, you could just append the modified rows and update the
pointers, leaving the original rows in place.
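
A minimal sketch of that append-and-repoint idea follows. It uses a toy in-memory layout (a row index plus one byte buffer), not the actual GRASS cell file format; ToyRaster and its methods are hypothetical names, and cells are limited to 0..255 to keep it short.

```python
import zlib


class ToyRaster:
    """Toy row-compressed store: a row index plus a single byte buffer."""

    def __init__(self, rows):
        self.data = bytearray()
        self.index = []  # (start, length) of each row's compressed bytes
        for row in rows:
            self.index.append(self._append(row))

    def _append(self, row):
        """Compress a row, append it to the buffer, return its location."""
        blob = zlib.compress(bytes(row))
        entry = (len(self.data), len(blob))
        self.data += blob
        return entry

    def update_row(self, row_number, row):
        """Append the new version and repoint; the old bytes stay in place."""
        self.index[row_number] = self._append(row)

    def read_row(self, row_number):
        start, length = self.index[row_number]
        return list(zlib.decompress(bytes(self.data[start:start + length])))


raster = ToyRaster([[1] * 8, [2] * 8, [3] * 8])
size_before = len(raster.data)
raster.update_row(1, [2, 2, 9, 9, 2, 2, 2, 2])
print(raster.read_row(1))
print(f"buffer grew from {size_before} to {len(raster.data)} bytes")
```

The superseded row stays in the buffer as dead space, so the file grows with every update unless it is compacted at some point.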

However, there are other issues with updates. E.g.

* If the rows which were modified contained the minimum or maximum
values within the map, we'd need to re-scan the entire map to
determine the correct range for the updated map (normally, the range
is updated as each row is written).

* If the update affects the range, the colour table would no longer be
correct (many colour tables are scaled to the range of the data),
and may not cover the entire range of the data.

* What should happen to the history? If the map has a histogram,
should it be updated (which requires re-scanning the map) or deleted?
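
The range issue in the first point can be shown with plain lists. This is only an illustration under simplifying assumptions (no nulls, a tiny integer map); scan_range() is a made-up helper, not a library function.

```python
def scan_range(rows):
    """Full scan: the (min, max) over every cell of every row."""
    return min(min(r) for r in rows), max(max(r) for r in rows)


rows = [[1, 2, 3], [4, 50, 6], [7, 8, 9]]
stored_range = scan_range(rows)  # what was recorded when the map was written

# In-place edit: the row that held the maximum is replaced.
rows[1] = [4, 5, 6]

# Widening the stored range with the new row alone can never shrink it,
# so the old maximum survives even though no cell has that value any more.
lo, hi = stored_range
incremental_range = (min(lo, min(rows[1])), max(hi, max(rows[1])))

actual_range = scan_range(rows)  # only a full re-scan gives the right answer

print("stored + new row:", incremental_range)
print("after re-scan:   ", actual_range)
```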

In-place modification also creates issues for processes which may be
reading the map. Although there should be no more than one session for
each mapset, it's allowed to read maps from other mapsets. While there
may be race conditions regarding metadata, the actual map data is
updated more or less atomically (the cell/fcell/null files for new
maps are written to temporary files then renamed into place).
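
The write-then-rename pattern described here looks roughly like this in general terms. It is a generic sketch, not the GRASS implementation; write_atomically() is a hypothetical helper, and the rename is only atomic when the temporary file is on the same filesystem as the target.

```python
import os
import tempfile


def write_atomically(path, payload):
    """Write payload to a temp file next to path, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the complete old file or the complete new one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "cell")
write_atomically(target, b"old contents")
write_atomically(target, b"new contents")
with open(target, "rb") as f:
    print(f.read())
```

An in-place update gives up this property: a reader could see a file in which some rows are old and some are new.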

It would also be problematic for GDAL-linked maps (r.external[.out]).
Many formats cannot realistically support in-place update (e.g. most
compressed formats). Even if the format allows it, I don't know
whether the GDAL API does.

It would be less problematic (although still not entirely trivial) to
add extensions to allow r.patch to be optimised, e.g. a
Rast_copy_row() function which could copy the raw compressed data for
a row from one map to another.
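
To make the idea concrete, here is a toy sketch of copying a row's compressed bytes between two maps without decompressing them. Rast_copy_row() does not exist in the library; the functions below are hypothetical, use an in-memory layout rather than the real cell format, and assume both maps share the same row width, cell type and compression.

```python
import zlib


def write_map(rows):
    """Build a toy map: (row index, byte buffer of compressed rows)."""
    index, data = [], bytearray()
    for row in rows:
        blob = zlib.compress(bytes(row))
        index.append((len(data), len(blob)))
        data += blob
    return index, data


def copy_row_raw(src, row_number, dst):
    """Append one row's compressed bytes from src to dst as-is."""
    src_index, src_data = src
    dst_index, dst_data = dst
    start, length = src_index[row_number]
    dst_index.append((len(dst_data), length))
    dst_data += src_data[start:start + length]


def read_row(raster, row_number):
    index, data = raster
    start, length = index[row_number]
    return list(zlib.decompress(bytes(data[start:start + length])))


base = write_map([[1] * 6, [2] * 6, [3] * 6])
edits = write_map([[2, 2, 9, 9, 2, 2]])  # replacement for row 1 only

# Patch: take row 1 from the edits, everything else from the base map.
patched = ([], bytearray())
copy_row_raw(base, 0, patched)
copy_row_raw(edits, 0, patched)
copy_row_raw(base, 2, patched)

print([read_row(patched, i) for i in range(3)])
```

Since the bytes are never decoded, nothing like MASK, range tracking or null handling is applied to the copied rows, which is part of why this is not entirely trivial.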

--
Glynn Clements <glynn@gclements.plus.com>

We can edit vector maps. Does that mean vector maps don’t use a compression algorithm and don’t have the issues you mentioned for raster maps?

On Jun 29, 2014 7:13 AM, “Glynn Clements” <glynn@gclements.plus.com> wrote:

In theory, you could just append the modified rows and update the
pointers, leaving the original rows in place.

However, there are other issues with updates. E.g.

  • If the rows which were modified contained the minimum or maximum
    values within the map, we’d need to re-scan the entire map to
    determine the correct range for the updated map (normally, the range
    is updated as each row is written).

  • If the update affects the range, the colour table would no longer be
    correct (many colour tables are scaled to the range of the data),
    and may not cover the entire range of the data.

  • What should happen to the history? If the map has a histogram,
    should it be updated (which requires re-scanning the map) or deleted?

In-place modification also creates issues for processes which may be
reading the map. Although there should be no more than one session for
each mapset, it’s allowed to read maps from other mapsets. While there
may be race conditions regarding metadata, the actual map data is
updated more or less atomically (the cell/fcell/null files for new
maps are written to temporary files then renamed into place).

It would also be problematic for GDAL-linked maps (r.external[.out]).
Many formats cannot realistically support in-place update (e.g. most
compressed formats). Even if the format allows it, I don’t know
whether the GDAL API does.

It would be less problematic (although still not entirely trivial) to
add extensions to allow r.patch to be optimised, e.g. a
Rast_copy_row() function which could copy the raw compressed data for
a row from one map to another.


Glynn Clements <glynn@gclements.plus.com>

Huidae Cho wrote:

We can edit vector maps. Does that mean vector maps don't use a
compression algorithm and don't have the issues you mentioned for raster
maps?

I don't know much about vector maps, other than that they have almost
nothing in common with raster maps.

--
Glynn Clements <glynn@gclements.plus.com>