[GRASS-dev] r.univar: 'nan' results

Hi,

Using current 6.3 CVS I had a strange issue with r.univar. For one of
my rasters most statistics are "nan" in r.univar, which I don't
understand why and while the r.univar.sh output is OK for the same
raster. Both outputs you'll find below. The raster is an product of
v.to.rast use=dir.

$ r.univar cieki10_2zl_dir
100%
total null and non-null cells: 378972
total null cells: 374849

Of the non-null cells:
----------------------
n: 4123
minimum: 0
maximum: 358.264
range: 358.264
mean: nan
mean of absolute values: nan
standard deviation: nan
variance: nan
variation coefficient: nan %
sum: nan

$ r.univar.sh cieki10_2zl_dir
WARNING: This module is superseded and will be removed in future
versions of GRASS. Use the much faster r.univar instead.

Calculation for map cieki10_2zl_dir (ignoring NULL cells)...
Reading raster map...
r.stats: 100%
Calculating statistics...

Number of cells (excluding NULL cells): 4123
Minimum: 0
Maximum: nan
Range: 0
Arithmetic mean: 160.434
Arithmetic mean of absolute values: 160.434
Variance: 11639.4
Standard deviation: 107.886
Variation coefficient: 67.2463 %

What's wrong?

I couldn't reproduce this in spearfish. Below's is the r.info output.
If my raster is needed for testing I can send it.

datatype=DCELL

min=0.000000
max=358.264295

nsres=10
ewres=10

north=5684100
south=5676270
east=601480
west=596640

Data Source:
   Vector Map: cieki10_2zl in mapset melio2
   Original Scale from Vector Map: 1:1
Data Description:
   generated by v.to.rast
Comments:
   v.to.rast input="cieki10_2zl" output="cieki10_2zl_dir" use="dir" lay\
   er=1 value=1 rows=4096

Maciek

I've just noticed the r.univar.sh output is also screwed OK - the max
value is nan here! What the devil?!

$ r.univar.sh cieki10_2zl_dir
WARNING: This module is superseded and will be removed in future
versions of GRASS. Use the much faster r.univar instead.

Calculation for map cieki10_2zl_dir (ignoring NULL cells)...
Reading raster map...
r.stats: 100%
Calculating statistics...

Number of cells (excluding NULL cells): 4123
Minimum: 0
Maximum: nan

            ^
         ???!!!

Maciek

Hi Maciek,

your raster map contain a few cells of 'nan' value. It seems there is
problem with v.to.rast module(?).

Best, Martin

PS: Could you send me also the vector map 'cieki10_2zl'?

2006/9/15, Maciej Sieczka <tutey@o2.pl>:

Martin

I'm sending you the location with the "univar" mapset containing *only*
the cieki10_2zl_dir raster in question.

Best,
Maciek

--
Martin Landa <landa.martin@gmail.com> * http://gama.fsv.cvut.cz/~landa *

Maciej Sieczka wrote:

I've just noticed the r.univar.sh output is also screwed OK - the max
value is nan here! What the devil?!

> $ r.univar.sh cieki10_2zl_dir
> WARNING: This module is superseded and will be removed in future
> versions of GRASS. Use the much faster r.univar instead.
>
> Calculation for map cieki10_2zl_dir (ignoring NULL cells)...
> Reading raster map...
> r.stats: 100%
> Calculating statistics...
>
> Number of cells (excluding NULL cells): 4123
> Minimum: 0
> Maximum: nan
            ^
         ???!!!

(nan = not a number, GRASS uses NULL for the same idea, but NULL is not
the same as nan.)

If there are nan in the cell data, the r.univar calculations can't cope.
The question is: how did nan cells get into the input raster map?

There is a possibility that there is a bug in r.univar..
Try using r.mapcalc to isolate the nans? What does "r.info -r" say?

Hamish

Hi,

(nan = not a number, GRASS uses NULL for the same idea, but NULL is not
the same as nan.)

agreed, for me, it seems like a bug (v.to.rast?), in any case a 'nan'
value is an *invalid* cell value.

If there are nan in the cell data, the r.univar calculations can't cope.
The question is: how did nan cells get into the input raster map?

v.to.rast ?

There is a possibility that there is a bug in r.univar..
Try using r.mapcalc to isolate the nans? What does "r.info -r" say?

min=0.000000
max=358.264295

Martin

--
Martin Landa <landa.martin@gmail.com> * http://gama.fsv.cvut.cz/~landa *

Martin Landa wrote:

Hamish wrote:

(nan = not a number, GRASS uses NULL for the same idea, but NULL is not
the same as nan.)

agreed, for me, it seems like a bug (v.to.rast?), in any case a 'nan'
value is an *invalid* cell value.

If there are nan in the cell data, the r.univar calculations can't cope.
The question is: how did nan cells get into the input raster map?

v.to.rast ?

There is a possibility that there is a bug in r.univar..
Try using r.mapcalc to isolate the nans? What does "r.info -r" say?

min=0.000000
max=358.264295

It seems like a bug in v.to.rast. Take the location I attached, and do:

$ g.region vect=cieki10_2zl res=10 -a
$ v.to.rast input=cieki10_2zl output=cieki10_2zl_dir use=dir

There are several nan cells indeed:

$ r.stats -1g cieki10_2zl_dir | grep -i nan
r.stats: 600265 5680715 nan
600335 5680575 nan
600395 5680435 nan
599735 5680345 nan
598695 5679865 nan
600245 5679475 nan
599125 5679245 nan
599675 5678955 nan
598915 5678935 nan
599815 5678765 nan
598785 5678655 nan
600345 5678335 nan
600545 5678015 nan

Maciek

(attachments)

univar2.tar.bz2 (72.6 KB)

Hi,

in do_lines.c, the function v2angle ()

/* cos of the angle between two vectors is (a . b)/|a||b| */
double
v2angle(double v1[2], double v2[2], double mag1, double mag2)
{
    double costheta = (v1[0] * v2[0] + v1[1] * v2[1])/(mag1 * mag2);

    return (acos(costheta));
}

If mag2 is 0 you get 'nan' value. Not sure, just testing...

if (mag2 == 0)
-> cat value NULL

?

Martin

2006/9/17, Maciej Sieczka <tutey@o2.pl>:

Martin Landa wrote:

> Hamish wrote:

>> (nan = not a number, GRASS uses NULL for the same idea, but NULL is not
>> the same as nan.)

> agreed, for me, it seems like a bug (v.to.rast?), in any case a 'nan'
> value is an *invalid* cell value.

>> If there are nan in the cell data, the r.univar calculations can't cope.
>> The question is: how did nan cells get into the input raster map?

> v.to.rast ?

>> There is a possibility that there is a bug in r.univar..
>> Try using r.mapcalc to isolate the nans? What does "r.info -r" say?

> min=0.000000
> max=358.264295

It seems like a bug in v.to.rast. Take the location I attached, and do:

$ g.region vect=cieki10_2zl res=10 -a
$ v.to.rast input=cieki10_2zl output=cieki10_2zl_dir use=dir

There are several nan cells indeed:

$ r.stats -1g cieki10_2zl_dir | grep -i nan
r.stats: 600265 5680715 nan
600335 5680575 nan
600395 5680435 nan
599735 5680345 nan
598695 5679865 nan
600245 5679475 nan
599125 5679245 nan
599675 5678955 nan
598915 5678935 nan
599815 5678765 nan
598785 5678655 nan
600345 5678335 nan
600545 5678015 nan

Maciek

--
Martin Landa <landa.martin@gmail.com> * http://gama.fsv.cvut.cz/~landa *

Maciej Sieczka wrote:

>> (nan = not a number, GRASS uses NULL for the same idea, but NULL is not
>> the same as nan.)

FWIW, the bit patterns which GRASS uses to represent null values for
FCELL and DCELL (all-ones) are interpreted as NaN values when using
the IEEE-754 floating-point format. However, the converse isn't
necessarily true; IEEE-754 defines any value with an all-ones exponent
and a non-zero mantissa as NaN. IOW, GRASS' null values are IEEE-754
NaN values, but they aren't the only ones.

In particular, the NaN value resulting from e.g. 0.0/0 or domain
errors in math functions won't normally be the GRASS null value.

> agreed, for me, it seems like a bug (v.to.rast?), in any case a 'nan'
> value is an *invalid* cell value.

>> If there are nan in the cell data, the r.univar calculations can't cope.
>> The question is: how did nan cells get into the input raster map?

> v.to.rast ?

>> There is a possibility that there is a bug in r.univar..
>> Try using r.mapcalc to isolate the nans? What does "r.info -r" say?

> min=0.000000
> max=358.264295

It seems like a bug in v.to.rast. Take the location I attached, and do:

$ g.region vect=cieki10_2zl res=10 -a
$ v.to.rast input=cieki10_2zl output=cieki10_2zl_dir use=dir

There are several nan cells indeed:

Without having tried the data, my first guess is that the NaN values
result from a domain error in the flow-direction calculation,
specifically from the call to acos() in v2angle(), in do_lines.c.

The main problem is that you can't pass GRASS FP null values around
like normal values. When dealing with NaN values, the implementation
isn't required to maintain the specific bit pattern, only that
"NaN-ness" is preserved.

It would be easy enough to add a check to v2angle(), e.g.

  if (costheta < -1 || costheta > 1)
    acos() will return NaN

Maciej Sieczka wrote:

>> (nan = not a number, GRASS uses NULL for the same idea, but NULL is not
>> the same as nan.)

FWIW, the bit patterns which GRASS uses to represent null values for
FCELL and DCELL (all-ones) are interpreted as NaN values when using
the IEEE-754 floating-point format. However, the converse isn't
necessarily true; IEEE-754 defines any value with an all-ones exponent
and a non-zero mantissa as NaN. IOW, GRASS' null values are IEEE-754
NaN values, but they aren't the only ones.

In particular, the NaN value resulting from e.g. 0.0/0 or domain
errors in math functions won't normally be the GRASS null value.

> agreed, for me, it seems like a bug (v.to.rast?), in any case a 'nan'
> value is an *invalid* cell value.

>> If there are nan in the cell data, the r.univar calculations can't cope.
>> The question is: how did nan cells get into the input raster map?

> v.to.rast ?

>> There is a possibility that there is a bug in r.univar..
>> Try using r.mapcalc to isolate the nans? What does "r.info -r" say?

> min=0.000000
> max=358.264295

It seems like a bug in v.to.rast. Take the location I attached, and do:

$ g.region vect=cieki10_2zl res=10 -a
$ v.to.rast input=cieki10_2zl output=cieki10_2zl_dir use=dir

There are several nan cells indeed:

Without having tried the data, my first guess is that the NaN values
result from a domain error in the flow-direction calculation,
specifically from the call to acos() in v2angle(), in do_lines.c.

The main problem is that you can't pass GRASS FP null values around
like normal values. When dealing with NaN values, the implementation
isn't required to maintain the specific bit pattern, only that
"NaN-ness" is preserved.

It would be easy enough to add a check to v2angle(), e.g.

  if (costheta < -1 || costheta > 1)
  {
    /* acos() will return NaN */
  }

However, there's no way to report this; we can't just return GRASS'
null value, as the hardware may transform it to a different
bit-pattern en route.

We could try to detect the NaNs before the data is written to file,
but isnan() isn't in C89, and other ways of detecting NaNs are
unreliable. ANSI dictates that "x == x" is false if x is NaN, but some
compilers (particularly for architectures without full hardware FP
support, e.g. Alpha) overlook this (and others may optimise "x == x"
to true with certain optimisation switches).

The only robust and portable solution is to change v2angle(),
deg_angle() and set_dcat() to accept/return the value via a pointer
argument, to ensure that the all-ones bit pattern is preserved, and
change v2angle() to specifically check for out-of-domain values.

--
Glynn Clements <glynn@gclements.plus.com>