[GRASS-dev] [GRASS GIS] #788: r.out.gdal and nodata

#788: r.out.gdal and nodata
-------------------------+--------------------------------------------------
Reporter: frankie | Owner: grass-dev@lists.osgeo.org
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.0
Component: default | Version: 6.4.0 RCs
Keywords: | Platform: Linux
      Cpu: Unspecified |
-------------------------+--------------------------------------------------
The new version of r.out.gdal shows a regression in exporting to float
type. In 6.2 its default for nodata value was NaN, now it is an explicit
_valid_ floating point (min float?). This is not acceptable for instance
in ENVI output format, where the correct value is NaN, to avoid trashing
lookups and statistics in external programs.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/788&gt;
GRASS GIS <http://grass.osgeo.org>

#788: r.out.gdal and nodata
----------------------+-----------------------------------------------------
  Reporter: frankie | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: normal | Milestone: 6.4.0
Component: Raster | Version: 6.4.0 RCs
Resolution: | Keywords: r.in.gdal, nodata
  Platform: Linux | Cpu: Unspecified
----------------------+-----------------------------------------------------
Changes (by martinl):

  * keywords: => r.in.gdal, nodata
  * component: default => Raster

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/788#comment:1&gt;
GRASS GIS <http://grass.osgeo.org>

#788: r.out.gdal and nodata
----------------------+-----------------------------------------------------
  Reporter: frankie | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: normal | Milestone: 6.4.0
Component: Raster | Version: 6.4.0 RCs
Resolution: | Keywords: r.in.gdal, nodata
  Platform: Linux | Cpu: Unspecified
----------------------+-----------------------------------------------------
Comment (by hamish):

This is in part a duplicate of bug #73, or at least a continuation of it.
Please read through that for background. As that bug is long and things
have evolved from then I don't mind starting afresh.

Frankie:
> The new version of r.out.gdal shows a regression in exporting
> to float type. In 6.2 its default for nodata value was NaN, now
> it is an explicit _valid_ floating point (min float?). This is
> not acceptable for instance in ENVI output format, where the
> correct value is NaN, to avoid trashing lookups and statistics
> in external programs.

Using 6.5svn with elevation.10m DCELL map from the Spearfish dataset the
default nodata value is nan (type defaults to Float64), and also I can set
nodata=nan on the command line and it works. fwiw nodata=inf does not
work.

in 6.4 it defaults to -1e+37 but setting nodata=nan explicitly on the
command line works as expected. (I'm using gdalinfo on the output geotiff
to confirm).

so this one is already "fixed", but waiting-for-backport from the 6.5
branch.

as we should try to not change default behaviour during a stable branch
maybe we should aim to get this change into the 6.4 branch before 6.4.0
... ?

one thing before that happens:

----
I notice another problem. (latest 6.5svn)
If I export a palleted raster (with cats 0-7 + NULL) as type=Byte

{{{
G65> r.out.gdal in=map out=map_LL_WGS84.tif type=Byte
creat="COMPRESS=DEFLATE,INTERLEAVE=BAND"

Exporting to GDAL data type: Byte
  100%
Input raster map contains cells with NULL-value (no-data). The value 0 was
used to represent no-data values in the input map. You can specify a
nodata
value with the nodata option.
WARNING: The default nodata value is present in rasterband <map> and would
          lead to data loss. Please specify a different nodata value with
          the nodata parameter.
ERROR: Raster export aborted.
}}}

that's all fine, but it has left an incomplete map in the filesystem. The
data is there (or at least some of it) but the georef & metadata is not.
Just before that "ERROR: Raster export aborted." it should unlink() the
output file. It doesn't know until after it starts to write if the default
nodata value will also be present in the map, so aborting and deleting the
file has to come after the fact. I assume any createopt="TFW=YES" files
etc get written after, so we don't have to worry about deleting them too.
(??)

Hamish

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/788#comment:2&gt;
GRASS GIS <http://grass.osgeo.org>

#788: r.out.gdal and nodata
----------------------+-----------------------------------------------------
  Reporter: frankie | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: normal | Milestone: 6.4.0
Component: Raster | Version: 6.4.0 RCs
Resolution: | Keywords: r.out.gdal, nodata
  Platform: Linux | Cpu: Unspecified
----------------------+-----------------------------------------------------
Changes (by mmetz):

  * keywords: r.in.gdal, nodata => r.out.gdal, nodata

Comment:

Replying to [comment:2 hamish]:
>
> [...]
>
> in 6.4 it defaults to -1e+37 but setting nodata=nan explicitly on the
command line works as expected. (I'm using gdalinfo on the output geotiff
to confirm).
>
>
> so this one is already "fixed", but waiting-for-backport from the 6.5
branch.
>
the r.out.gdal version in 6.5 is not the proper fix, r.out.gdal in 7.0 is
currently the safest version for raster GDAL export, waiting for backport
to 6.5, then to 6.4 if it passed all testing.
>
> ----
> I notice another problem. (latest 6.5svn)
> If I export a palleted raster (with cats 0-7 + NULL) as type=Byte
>
{{{
> G65> r.out.gdal in=map out=map_LL_WGS84.tif type=Byte
creat="COMPRESS=DEFLATE,INTERLEAVE=BAND"
>
> Exporting to GDAL data type: Byte
> 100%
> Input raster map contains cells with NULL-value (no-data). The value 0
was
> used to represent no-data values in the input map. You can specify a
nodata
> value with the nodata option.
> WARNING: The default nodata value is present in rasterband <map> and
would
> lead to data loss. Please specify a different nodata value with
> the nodata parameter.
> ERROR: Raster export aborted.
}}}
>
> that's all fine, but it has left an incomplete map in the filesystem.
The data is there (or at least some of it) but the georef & metadata is
not. Just before that "ERROR: Raster export aborted." it should unlink()
the output file.

The output map is not always in a single file, may also be in a database
(Oracle Spatial, Rasterlite in SQLite). How to handle e.g. Erdas Imagine
.img with external spill files?

> It doesn't know until after it starts to write if the default nodata
value will also be present in the map, so aborting and deleting the file
has to come after the fact. I assume any createopt="TFW=YES" files etc get
written after, so we don't have to worry about deleting them too. (??)
>
As I see it, there are three possibilities if the nodata value is present
in the rasterband to be exported:
1. Current behaviour, abort with error message, leaves a broken raster.
2. Continue with warning, leaves corrupted raster.
3. Introduce a first pass to check if the nodata value is present in the
rasterband to be exported, abort with error if necessary before actual
export. No output is written, but it will double the time needed for
export.

Anyway, the above example would successfully export in 7.0, in this case
255 would be selected as default nodata value because 0 is present in the
rasterband.

What parts of the 7.0 version of r.out.gdal qualify as bugfixes and what
parts qualify as changed default behaviour compared to 6.4?

BTW, the INTERLEAVE option has no effect if only one raster band is
exported.

Markus M

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/788#comment:3&gt;
GRASS GIS <http://grass.osgeo.org>

#788: r.out.gdal and nodata
----------------------+-----------------------------------------------------
  Reporter: frankie | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: normal | Milestone: 6.4.0
Component: Raster | Version: 6.4.0 RCs
Resolution: | Keywords: r.out.gdal, nodata
  Platform: Linux | Cpu: Unspecified
----------------------+-----------------------------------------------------
Comment (by hamish):

Replying to [comment:3 mmetz]:
> the r.out.gdal version in 6.5 is not the proper fix,

ok. for my needs yesterday it did the trick though, but I had to run it
twice to learn + get there.

> r.out.gdal in 7.0 is currently the safest version for raster
> GDAL export, waiting for backport to 6.5, then to 6.4 if it
> passed all testing.

could you quickly highlight the differences between them now? I've sort of
lost track.

> > it has left an incomplete map in the filesystem.
...
> The output map is not always in a single file, may also be in
> a database (Oracle Spatial, Rasterlite in SQLite). How to handle
> e.g. Erdas Imagine .img with external spill files?

don't know. perhaps we can't catch all cases. perhaps we are helped by the
fact that some of the extra files get written along with the metadata
after the job is done. We would need GDAL-dev's advice on this one I
think.

> As I see it, there are three possibilities if the nodata value
> is present in the rasterband to be exported:
> 1. Current behaviour, abort with error message, leaves a broken raster.

1.5. Current behaviour, abort with error message, but try to delete what
we know can. A possible way to do this is to write the output to
$MAPSET/.tmp/ and then move it into the destination dir once successfully
complete. (??)

> 2. Continue with warning, leaves corrupted raster.

no thanks. unknowingly trusting corrupted data is the worst kind of bug.
if it's in a script you are likely to miss the warning message in all the
noise.

> 3. Introduce a first pass to check if the nodata value is
> present in the rasterband to be exported, abort with error if
> necessary before actual export. No output is written, but it will
> double the time needed for export.

actually it'll be a bit faster than double. After the initial read from
disk most of it will be cached in memory and not need to be read again,
and also the scanning step is probably less i/o and cpu intensive than the
step that writes to disk. Personally I'm willing to trade a little time to
guarantee pristine results, but that may not fit everyone's use case.
Those really concerned with the time could use the -force flag to bypass
those checks...

> Anyway, the above example would successfully export in 7.0, in
> this case 255 would be selected as default nodata value because
> 0 is present in the rasterband.

ok.

> What parts of the 7.0 version of r.out.gdal qualify as bugfixes
> and what parts qualify as changed default behaviour compared to
> 6.4?

again, I've sort of forgotten how far things got and am not sure what's on
that list.

> BTW, the INTERLEAVE option has no effect if only one raster band
> is exported.

yeah, I noticed that after posting. I had cut&pasted it out of my template
collection (aka the man pages) and it went along for the ride.

Hamish

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/788#comment:4&gt;
GRASS GIS <http://grass.osgeo.org>

#788: r.out.gdal and nodata
----------------------+-----------------------------------------------------
  Reporter: frankie | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: normal | Milestone: 6.4.0
Component: Raster | Version: 6.4.0 RCs
Resolution: | Keywords: r.out.gdal, nodata
  Platform: Linux | Cpu: Unspecified
----------------------+-----------------------------------------------------
Comment (by mmetz):

Replying to [comment:4 hamish]:
> Replying to [comment:3 mmetz]:
> > the r.out.gdal version in 6.5 is not the proper fix,
>
> ok. for my needs yesterday it did the trick though, but I had to run it
twice to learn + get there.

Not good, it shouldn't be that complicated. The module should export a
correct raster map with default settings only, also if GDAL format is
given but nothing else. It gets a bit more complicated with nodata value
and/or data type set.
>
> > r.out.gdal in 7.0 is currently the safest version for raster
> > GDAL export, waiting for backport to 6.5, then to 6.4 if it
> > passed all testing.
>
> could you quickly highlight the differences between them now? I've sort
of lost track.

This is what g7 r.out.gdal does:

Before actual export:

1. Range check: abort with error if the selected GDAL data type can not
hold any of the values in the raster, e.g. using GDAL data type Byte with
a CELL raster with all values > 255 and/or < 0: Raster export would result
in complete data loss, aborting.

2. Precision test: precision loss occurs if DCELL is exported as some
integer or FLOAT32 or if Float32 is exported as some integer or if CELL is
exported as Float32 and the min max values are outside the range of what
can be represented in Float32 (see [!1]). Override with force flag.

3. Check if any given nodata value is within the range of the selected
GDAL data type. Report what would happen and abort raster export if
outside range (mismatch between metadata nodata value and the value
replacing NULLs). Effects can be drastic, try g7 r.out.gdal with type=Byte
and nodata=-1 and nodata=-9999. Can't override.

4. Get default nodata value if none specified: for any floating point this
is NaN, for integer types, it figures out a default type. Based on
discussion in [!1].

During actual export:

5. Real nodata value check: abort if the selected nodata value is present
in the part of the raster map to be exported. Override with force flag.

6. Real range check: abort if any values in the part of the raster map to
be exported are outside the selected GDAL data type -> data loss, near
unpredictable results, can't override.

The checks during the actual export could be moved to a separate pass over
all raster bands to be exported, I think this is the safest solution, then
r.out.gdal does not have to bother with cleaning up and can leave all
format-specific error handling to GDAL. Looking at the ever increasing
number of GDAL-supported formats, the introduction of format specific code
to r.out.gdal would open a box of Pandora, and IMHO the purpose of
r.out.gdal is that all that format-specific stuff can be left to the GDAL
lib.

What is missing is a region check. At the very least, r.out.gdal should
make sure that at least some parts of the raster maps to be exported fall
within the current region, if not, abort with error: raster map outside
current region, nothing to export. I wouldn't make it too complicated, so
no resolution checks and also export if parts of the current region are
not covered by the raster map to be exported.

[!1] http://lists.osgeo.org/pipermail/grass-dev/2009-April/043464.html and
following

Markus M

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/788#comment:5&gt;
GRASS GIS <http://grass.osgeo.org>

#788: r.out.gdal and nodata
----------------------+-----------------------------------------------------
  Reporter: frankie | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: normal | Milestone: 6.4.0
Component: Raster | Version: 6.4.0 RCs
Resolution: | Keywords: r.out.gdal, nodata
  Platform: Linux | Cpu: Unspecified
----------------------+-----------------------------------------------------
Comment (by hamish):

Ok sounds good. Lets go ahead and backport what's in the gr7 version to
gr6.5. No point in widely testing dead-end code in 6.5...

Replying to [comment:5 mmetz]:
> The checks during the actual export could be moved to a
> separate pass over all raster bands to be exported, I think
> this is the safest solution, then r.out.gdal does not have to
> bother with cleaning up and can leave all format-specific error
> handling to GDAL.

sounds ok to me/I could live with that. anyone else have a POV?

> What is missing is a region check. At the very least, r.out.gdal
> should make sure that at least some parts of the raster maps to
> be exported fall within the current region, if not, abort with
> error: raster map outside current region, nothing to export.

hmmm. I would not do that, it is straying away from checking for errors
into checking for intention. I would say that's a good idea but change it
to a G_warning() or G_message() instead of G_fatal_error() and just write
a map full of NULLs regardless. No other GRASS raster module refuses to go
on if you have the region (ie '''WIND'''ow) set beyond its bounds.

e.g. I have a script which tiles a raster into a series of small geotiffs.
say I had a region with a full raster map in the top left and another in
the bottom right with overlapping the corners in the middle and empty
space in the opposite corners. I'd want those all-NULL tiles instead of a
404 error.

{{{
|-----------------|
| |
| +-----+ |
| | | |
| | +-+-----+ |
| +---|-+ | |
| | | |
| +-------+ |
|-----------------|
}}}

Hamish

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/788#comment:6&gt;
GRASS GIS <http://grass.osgeo.org>

trac server gives problems, doesn't send new comment to list, see

https://trac.osgeo.org/grass/ticket/788#comment:7

GRASS GIS wrote:

#788: r.out.gdal and nodata
----------------------+-----------------------------------------------------
  Reporter: frankie | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new Priority: normal | Milestone: 6.4.0 Component: Raster | Version: 6.4.0 RCs Resolution: | Keywords: r.out.gdal, nodata Platform: Linux | Cpu: Unspecified ----------------------+-----------------------------------------------------
Comment (by hamish):

Ok sounds good. Lets go ahead and backport what's in the gr7 version to
gr6.5. No point in widely testing dead-end code in 6.5...

Replying to [comment:5 mmetz]:
> The checks during the actual export could be moved to a
> separate pass over all raster bands to be exported, I think
> this is the safest solution, then r.out.gdal does not have to
> bother with cleaning up and can leave all format-specific error
> handling to GDAL.

sounds ok to me/I could live with that. anyone else have a POV?

> What is missing is a region check. At the very least, r.out.gdal
> should make sure that at least some parts of the raster maps to
> be exported fall within the current region, if not, abort with
> error: raster map outside current region, nothing to export.

hmmm. I would not do that, it is straying away from checking for errors
into checking for intention. I would say that's a good idea but change it
to a G_warning() or G_message() instead of G_fatal_error() and just write
a map full of NULLs regardless. No other GRASS raster module refuses to go
on if you have the region (ie '''WIND'''ow) set beyond its bounds.

e.g. I have a script which tiles a raster into a series of small geotiffs.
say I had a region with a full raster map in the top left and another in
the bottom right with overlapping the corners in the middle and empty
space in the opposite corners. I'd want those all-NULL tiles instead of a
404 error.

{{{
|-----------------|
| |
| +-----+ |
| | | |
| | +-+-----+ |
| +---|-+ | |
| | | |
| +-------+ |
|-----------------|
}}}

Hamish

#788: r.out.gdal and nodata
----------------------+-----------------------------------------------------
  Reporter: frankie | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: normal | Milestone: 6.4.0
Component: Raster | Version: 6.4.0 RCs
Resolution: | Keywords: r.out.gdal, nodata
  Platform: Linux | Cpu: Unspecified
----------------------+-----------------------------------------------------
Comment (by mmetz):

Checks are now performed before actual export, in case of aborted export,
nothing was exported and no files need to be removed. Done in trunk r40119
and devbr6 r40120.

I noticed that the relbr64 manual is quite different, a lot of IMO
clarifying information has been added to the updated manuals in devbr6 and
trunk. Backport the manual to relbr64?

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/788#comment:8&gt;
GRASS GIS <http://grass.osgeo.org>

#788: r.out.gdal and nodata
----------------------+-----------------------------------------------------
  Reporter: frankie | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: normal | Milestone: 6.4.1
Component: Raster | Version: 6.4.0 RCs
Resolution: | Keywords: r.out.gdal, nodata
  Platform: Linux | Cpu: Unspecified
----------------------+-----------------------------------------------------
Changes (by neteler):

  * milestone: 6.4.0 => 6.4.1

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/788#comment:9&gt;
GRASS GIS <http://grass.osgeo.org>

#788: r.out.gdal and nodata
----------------------+-----------------------------------------------------
  Reporter: frankie | Owner: grass-dev@lists.osgeo.org
      Type: defect | Status: new
  Priority: normal | Milestone: 6.4.1
Component: Raster | Version: 6.4.0 RCs
Resolution: | Keywords: r.out.gdal, nodata
  Platform: Linux | Cpu: Unspecified
----------------------+-----------------------------------------------------
Comment (by neteler):

Attached the 6.4.svn backport of the r.out.gdal fixes from 6.5. This fixes
the nodata problem when exporting (tested with UInt16). The present logic
in 6.4.svn is unusable.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/788#comment:10&gt;
GRASS GIS <http://grass.osgeo.org>

#788: r.out.gdal and nodata
--------------------------------+-------------------------------------------
Reporter: frankie | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.1
Component: Raster | Version: 6.4.0 RCs
Keywords: r.out.gdal, nodata | Platform: Linux
      Cpu: Unspecified |
--------------------------------+-------------------------------------------

Comment(by hamish):

has this been backported to 6.4.0 already or do 6.4 and 6.5 still differ?

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/788#comment:11&gt;
GRASS GIS <http://grass.osgeo.org>

#788: r.out.gdal and nodata
--------------------------------+-------------------------------------------
Reporter: frankie | Owner: grass-dev@…
     Type: defect | Status: new
Priority: normal | Milestone: 6.4.1
Component: Raster | Version: 6.4.0 RCs
Keywords: r.out.gdal, nodata | Platform: Linux
      Cpu: Unspecified |
--------------------------------+-------------------------------------------

Comment(by mmetz):

Replying to [comment:11 hamish]:
> has this been backported to 6.4.0 already or do 6.4 and 6.5 still
differ?

Backported some months ago (05/05/10) in r42117, all branches including
7.0 behave identical.

Markus M

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/788#comment:12&gt;
GRASS GIS <http://grass.osgeo.org>

#788: r.out.gdal and nodata
----------------------+-----------------------------------------------------
  Reporter: frankie | Owner: grass-dev@…
      Type: defect | Status: closed
  Priority: normal | Milestone: 6.4.1
Component: Raster | Version: 6.4.0 RCs
Resolution: fixed | Keywords: r.out.gdal, nodata
  Platform: Linux | Cpu: Unspecified
----------------------+-----------------------------------------------------
Changes (by hamish):

  * status: new => closed
  * resolution: => fixed

Comment:

thanks; in that case closing the bug.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/788#comment:13&gt;
GRASS GIS <http://grass.osgeo.org>