[GRASS-dev] r.mapcalc rand() strangeness

$ r.mapcalc 'map=rand(-2147483648,2147483647)'

$ r.stats -1 map | sort -n | uniq
  100%
-2147483647
-2147483646

Why only 2 values out of the 4294967295 possible? The region is big enough to accommodate more than 2 values from the range:

$ g.region -g
n=4928055
s=4913670
w=589970
e=609000
nsres=5
ewres=5
rows=2877
cols=3806
cells=10949862

GRASS 6.3 SVN r30311 (latest).

Maciek

Maciej Sieczka wrote:

> $ r.mapcalc 'map=rand(-2147483648,2147483647)'
>
> $ r.stats -1 map | sort -n | uniq
>   100%
> -2147483647
> -2147483646
>
> Why only 2 values from 4294967295 possible? Region is big enough to
> accommodate more than 2 values from the range:

  res[i] = (lo == hi) ? lo : lo + x % (hi - lo);

The expression (hi - lo) is overflowing the range of a signed integer.

Also, x will be restricted to 0 <= x < 2^31, possibly less if the
system doesn't have lrand48().

The wrapping can be fixed by adding casts, i.e.:

  res[i] = (lo == hi) ? lo : lo + (unsigned) x % (unsigned) (hi - lo);

The 31-bit limitation can be fixed by replacing lrand48() with
mrand48(), which covers the full (signed) 32-bit range.

The attached patch appears to work.

One final point: -2^31 (= 0x80000000 = -2147483648) is the null value
for the CELL type, so you'll never see that value in a map.
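
To make the overflow concrete, here is a minimal standalone sketch (it is not the attached patch). The helper name rand_in_range is hypothetical, and it assumes the caller supplies a raw 32-bit random value x, for example the result of mrand48() converted to uint32_t. The span is computed in unsigned arithmetic, where wrap-around is well defined, and the offset is added in 64 bits so that no intermediate result overflows.

```c
#include <stdint.h>

/* Hypothetical helper, not the GRASS code: map a raw 32-bit random
 * value x into the half-open range [lo, hi). Assumes lo <= hi. */
static int32_t rand_in_range(int32_t lo, int32_t hi, uint32_t x)
{
    /* Unsigned subtraction wraps modulo 2^32, so the span is correct
     * even when the signed expression (hi - lo) would overflow. */
    uint32_t span = (uint32_t)hi - (uint32_t)lo;

    if (span == 0)
        return lo;              /* lo == hi: only one possible value */

    /* lo + offset is always below hi, so it fits in int32_t; the
     * addition is done in 64 bits to avoid intermediate overflow. */
    return (int32_t)((int64_t)lo + (int64_t)(x % span));
}
```

With lo = INT32_MIN and hi = INT32_MAX the span is 4294967295, so x = 0 gives INT32_MIN and x = 4294967294 gives INT32_MAX - 1.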

--
Glynn Clements <glynn@gclements.plus.com>

(attachments)

r.mapcalc-rand.diff (775 Bytes)

Glynn Clements wrote:

Maciej Sieczka wrote:

$ r.mapcalc 'map=rand(-2147483648,2147483647)'

$ r.stats -1 map | sort -n | uniq
  100%
-2147483647
-2147483646

Why only 2 values from 4294967295 possible? Region is big enough to accommodate more than 2 values from the range:

  res[i] = (lo == hi) ? lo : lo + x % (hi - lo);

The expression (hi - lo) is overflowing the range of a signed integer.

Also, x will be restricted to 0 <= x < 2^31, possibly less if the
system doesn't have lrand48().

The wrapping can be fixed by adding casts, i.e.:

  res[i] = (lo == hi) ? lo : lo + (unsigned) x % (unsigned) (hi - lo);

The 31-bit limitation can be fixed by replacing lrand48() with
mrand48(), which covers the full (signed) 32-bit range.

The attached patch appears to work.

Will you apply it?

One final point: -2^31 (= 0x80000000 = -2147483648) is the null value
for the CELL type, so you'll never see that value in a map.

BTW, what are the NULL values for double and float?

Maciek

Maciej Sieczka wrote:

Glynn Clements wrote:

Maciej Sieczka wrote:

$ r.mapcalc 'map=rand(-2147483648,2147483647)'

    res[i] = (lo == hi) ? lo : lo + x % (hi - lo);

The expression (hi - lo) is overflowing the range of a signed integer.

Are all GRASS modules limited to Int32, ie. -2147483648 to 214748364? I can see r.mapcalc silently forces 214748364 for anything bigger...

The wrapping can be fixed by adding casts, i.e.:

    res[i] = (lo == hi) ? lo : lo + (unsigned) x % (unsigned) (hi - lo);

The 31-bit limitation can be fixed by replacing lrand48() with
mrand48(), which covers the full (signed) 32-bit range.

The attached patch appears to work.

Will you apply it?

I forgot to mention that I tested it and can confirm it seems OK.

One final point: -2^31 (= 0x80000000 = -2147483648) is the null value
for the CELL type, so you'll never see that value in a map.

BTW, what are the NULL values for double and float?

Maciek

Maciej Sieczka wrote:

>>> $ r.mapcalc 'map=rand(-2147483648,2147483647)'

>> res[i] = (lo == hi) ? lo : lo + x % (hi - lo);
>>
>> The expression (hi - lo) is overflowing the range of a signed integer.

Are all GRASS modules limited to Int32, ie. -2147483648 to 214748364? I
can see r.mapcalc silently forces 214748364 for anything bigger...

The limits are -2147483647 to 2147483647 inclusive.

If you aren't seeing anything larger than 214748364, it may just be
that you aren't taking enough samples. mrand48() etc use a 48-bit
state, so even taking 2^32 samples doesn't guarantee that you will see
all 2^32 possible values.

>> The wrapping can be fixed by adding casts, i.e.:
>>
>> res[i] = (lo == hi) ? lo : lo + (unsigned) x % (unsigned) (hi - lo);
>>
>> The 31-bit limitation can be fixed by replacing lrand48() with
>> mrand48(), which covers the full (signed) 32-bit range.
>>
>> The attached patch appears to work.

> Will you apply it?

I forgot to mention that I tested it and can confirm it seems OK.

I've committed it.

>> One final point: -2^31 (= 0x80000000 = -2147483648) is the null value
>> for the CELL type, so you'll never see that value in a map.

> BTW, what are the NULL value for double and float?

The FP nulls are the all-ones bit patterns. These correspond to NaN
according to the IEEE-754 formats, although it isn't the "default" NaN
pattern generated by most architectures (which is usually 7fc00000 or
ffc00000 for float and 7ff8000000000000 or fff8000000000000 for
double, i.e. an all-ones exponent, the top-bit of the mantissa set,
and either sign).

So far as arithmetic is concerned, any value with an all-ones exponent
and a non-zero mantissa is treated as NaN. But the GRASS
G_is_[fd]_null_value() functions only consider the all-ones bit
pattern to be null. I intend to change this in 7.x so that all FP NaN
values are treated as null. This will mean that code which can
generate NaNs doesn't have to explicitly convert them to the GRASS
null value.
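
The difference between the two tests can be sketched in a few lines of C. This is only an illustration, not the GRASS implementation: the function names are hypothetical, and the code assumes that float is a 32-bit IEEE-754 value. The strict test accepts only the all-ones pattern, while the relaxed test accepts any NaN.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Build a float from a raw 32-bit pattern (assumes IEEE-754 binary32). */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Strict test: true only for the all-ones bit pattern. */
static int is_all_ones_pattern(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits == 0xFFFFFFFFu;
}

/* Relaxed test: true for any NaN, whatever its sign or mantissa. */
static int is_any_nan(float f)
{
    return isnan(f) != 0;
}
```

The pattern 0xFFFFFFFF passes both tests, whereas a typical hardware-generated NaN such as 0x7FC00000 passes only the relaxed one.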

--
Glynn Clements <glynn@gclements.plus.com>

Maciej Sieczka wrote:

Maciej Sieczka wrote:

Glynn Clements wrote:

Maciej Sieczka wrote:

$ r.mapcalc 'map=rand(-2147483648,2147483647)'

    res[i] = (lo == hi) ? lo : lo + x % (hi - lo);

The expression (hi - lo) is overflowing the range of a signed integer.

Are all GRASS modules limited to Int32, ie. -2147483648 to 214748364? I can see r.mapcalc silently forces 214748364 for anything bigger...

This is not entirely correct. More findings:

The following silently forces 2147483647 on anything bigger:

$ r.mapcalc 'map=9999999999999999999'
$ r.info -rt map
min=2147483647
max=2147483647
datatype=CELL

This one too, though I'm trying to enforce double:

$ r.mapcalc 'map=double(9999999999999999999)'
$ r.info -rt map
min=2147483647.000000
max=2147483647.000000
datatype=DCELL

Trying the same with a decimal separator added results in something bizarre to me:

$ r.mapcalc 'map=99999999999999999999999.0'
$ r.info -rt map
min=99999999999999991611392.000000
max=99999999999999991611392.000000
datatype=DCELL

$ r.mapcalc 'map=9999999999999999999.0'
$ r.info -rt map
min=10000000000000000000.000000
max=10000000000000000000.000000
datatype=DCELL

At the edges of Float32 range other strange things happen:

Shouldn't the below be rather min,max=340282000000000000000000000000000000000.0?:

$ r.mapcalc 'map=3.40282e38'
$ r.info -rt map
min=340282000000000014192072600942972764160.000000
max=340282000000000014192072600942972764160.000000
datatype=DCELL

The one below should be -340282000000000000000000000000000000000.0 I guess:

$ r.mapcalc 'map=-3.40282e38'
$ r.info -rt map
min=-340282000000000014192072600942972764160.000000
max=-340282000000000014192072600942972764160.000000
datatype=DCELL

At bigger Float64 values r.mapcalc crashes:

$ r.mapcalc 'map=3.40282e110'

*** stack smashing detected ***: r.mapcalc terminated
Aborted (core dumped)

$ r.mapcalc 'map=1.79769e+308'
  100%
*** stack smashing detected ***: r.mapcalc terminated
Aborted (core dumped)

BTW, what are the NULL values for double and float?

Still curious (to document it).

Maciek

Maciej Sieczka wrote:

>>>> $ r.mapcalc 'map=rand(-2147483648,2147483647)'

>>> res[i] = (lo == hi) ? lo : lo + x % (hi - lo);
>>>
>>> The expression (hi - lo) is overflowing the range of a signed integer.

> Are all GRASS modules limited to Int32, ie. -2147483648 to 214748364? I
> can see r.mapcalc silently forces 214748364 for anything bigger...

This is not entirely correct. More findings:

The following silently forces 2147483647 on anything bigger:

$ r.mapcalc 'map=9999999999999999999'
$ r.info -rt map
min=2147483647
max=2147483647
datatype=CELL

That is down to the atoi() function used to parse integers.

I could change it to use strtol(), which sets errno to ERANGE on
overflow.
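
As a rough sketch of what such a check could look like (the function name and return codes are hypothetical, not taken from r.mapcalc): on platforms where long is wider than int, ERANGE alone is not enough, so the result is also compared against INT_MAX and INT_MIN.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal integer constant into *out.
 * Returns 0 on success, -1 if the text is not a plain integer,
 * and -2 if the value does not fit in an int. */
static int parse_int_constant(const char *text, int *out)
{
    char *end;
    long v;

    errno = 0;                          /* strtol() only sets errno on error */
    v = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return -1;                      /* no digits, or trailing junk */
    if (errno == ERANGE || v > INT_MAX || v < INT_MIN)
        return -2;                      /* out of range for int */

    *out = (int)v;
    return 0;
}
```

With this, a constant such as 9999999999999999999 would be reported as out of range instead of silently becoming 2147483647.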

This one too, though I'm trying to enforce double:

$ r.mapcalc 'map=double(9999999999999999999)'
$ r.info -rt map
min=2147483647.000000
max=2147483647.000000
datatype=DCELL

The over-large integer constant is truncated by atoi(), then the
truncated constant is converted to a double.

Trying the same by adding a decimal place separator, results in
something bizarre to me:

$ r.mapcalc 'map=99999999999999999999999.0'
$ r.info -rt map
min=99999999999999991611392.000000
max=99999999999999991611392.000000
datatype=DCELL

The presence of a dot causes the value to be parsed as a
floating-point value using atof().

A double has a precision of ~16 decimal digits, which matches what you
see above.

$ r.mapcalc 'map=9999999999999999999.0'
$ r.info -rt map
min=10000000000000000000.000000
max=10000000000000000000.000000
datatype=DCELL

Ditto.

At the edges of Float32 range other strange things happen:

Shouldn't the below be rather
min,max=340282000000000000000000000000000000000.0?:

$ r.mapcalc 'map=3.40282e38'
$ r.info -rt map
min=340282000000000014192072600942972764160.000000
max=340282000000000014192072600942972764160.000000
datatype=DCELL

The one below should be -340282000000000000000000000000000000000.0 I
guess:

$ r.mapcalc 'map=-3.40282e38'
$ r.info -rt map
min=-340282000000000014192072600942972764160.000000
max=-340282000000000014192072600942972764160.000000
datatype=DCELL

Again, you're running into the limits of the precision of the double
type.

Also, bear in mind that the conversion from a binary FP value to a
string of decimal digits is being performed by r.info; r.mapcalc has
no control over that part of the process.

Essentially, the %f format behaves badly for large numbers. If you use
%e or %g, it will use exponential notation, with a specific relative
precision rather than a specific absolute precision.

At bigger Float64 values r.mapcalc crashes:

$ r.mapcalc 'map=3.40282e110'

*** stack smashing detected ***: r.mapcalc terminated
Aborted (core dumped)

$ r.mapcalc 'map=1.79769e+308'
  100%
*** stack smashing detected ***: r.mapcalc terminated
Aborted (core dumped)

Right. It's sprintf()ing the value into a 64-byte buffer using %f.

I'll change it to use %g:

--- raster/r.mapcalc/expression.c (revision 30318)
+++ raster/r.mapcalc/expression.c (working copy)
@@ -312,7 +312,7 @@
   if (e->res_type == CELL_TYPE)
     sprintf(buff, "%d", e->data.con.ival);
   else
- sprintf(buff, "%f", e->data.con.fval);
+ sprintf(buff, "%.8g", e->data.con.fval);

   return strdup(buff);
}
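
The size of the problem can be measured without crashing anything. The sketch below is independent of the patch above and uses hypothetical helper names; it relies on the C99 behaviour of snprintf(), which returns the length the output would have had. It also shows a bounded variant that cannot overrun a fixed buffer.

```c
#include <stdio.h>

/* Number of characters "%f" needs for v, excluding the terminating NUL. */
static int length_with_f(double v)
{
    return snprintf(NULL, 0, "%f", v);
}

/* Number of characters "%.8g" needs for v, excluding the terminating NUL. */
static int length_with_g8(double v)
{
    return snprintf(NULL, 0, "%.8g", v);
}

/* Format v into buf without overflowing it.
 * Returns 1 if the whole text fitted, 0 if it was truncated. */
static int format_constant(char *buf, size_t size, double v)
{
    int n = snprintf(buf, size, "%.8g", v);
    return n >= 0 && (size_t)n < size;
}
```

For 3.40282e110 the "%f" form needs well over 100 characters, far more than a 64-byte buffer holds, while the "%.8g" form of even 1.79769e+308 stays short.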

--
Glynn Clements <glynn@gclements.plus.com>

Glynn Clements wrote:

Maciej Sieczka wrote:

The following silently forces 2147483647 on anything bigger:

$ r.mapcalc 'map=9999999999999999999'
$ r.info -rt map
min=2147483647
max=2147483647
datatype=CELL

That is down to the atoi() function used to parse integers.

Sorry if I got it wrong - is r.mapcalc limited to 32 bit integers? If so, does it have to be?

I could change it to use strtol(), which sets errno to ERANGE on
overflow.

Would that make r.mapcalc accept integers bigger than int32?

Trying the same with a decimal separator added results in something bizarre to me:

$ r.mapcalc 'map=99999999999999999999999.0'
$ r.info -rt map
min=99999999999999991611392.000000
max=99999999999999991611392.000000
datatype=DCELL

The presence of a dot causes the value to be parsed as a
floating-point value using atof().

A double has a precision of ~16 decimal digits, which matches what you
see above.

So processing any number with more than 16 decimal digits in r.mapcalc must yield such "strange" values? And how many digits after the decimal separator are safe?

At the edges of Float32 range other strange things happen:

Shouldn't the below be rather min,max=340282000000000000000000000000000000000.0?:

$ r.mapcalc 'map=3.40282e38'
$ r.info -rt map
min=340282000000000014192072600942972764160.000000
max=340282000000000014192072600942972764160.000000
datatype=DCELL

The one below should be -340282000000000000000000000000000000000.0 I guess:

$ r.mapcalc 'map=-3.40282e38'
$ r.info -rt map
min=-340282000000000014192072600942972764160.000000
max=-340282000000000014192072600942972764160.000000
datatype=DCELL

Again, you're running into the limits of the precision of the double
type.

Also, bear in mind that the conversion from a binary FP value to a
string of decimal digits is being performed by r.info; r.mapcalc has
no control over that part of the process.

Does that mean that actually in the raster the value is something else than what r.info reports?

Essentially, the %f format behaves badly for large numbers. If you use
%e or %g, it will use exponential notation, with a specific relative
precision rather than a specific absolute precision.

At bigger Float64 values r.mapcalc crashes:

$ r.mapcalc 'map=3.40282e110'

*** stack smashing detected ***: r.mapcalc terminated
Aborted (core dumped)

Right. It's sprintf()ing the value into a 64-byte buffer using %f.

I'll change it to use %g:

That fixes it. Thanks. I have noticed that r.null has the same problem though. Could more modules be affected? To reproduce, please run the attached script (having set the region big enough) and proceed until the last (Float64) step. r.null will crash with "*** stack smashing detected ***".

I wrote the script to validate Martin's patch for r.out.gdal [1] in extreme conditions but I'm running into issues, thus this thread.

[1]http://trac.osgeo.org/grass/attachment/ticket/67/r-out-gdal-no-data.diff

Maciek

(attachments)

datatypes_grass_gdal.sh (1.71 KB)

Maciej Sieczka <tutey@o2.pl> writes:

[...]

>> I could change it to use strtol(), which sets errno to ERANGE on
>> overflow.

  It may be better to check `*tailptr', like:

int
p_arg_long (const char *s, long *vp)
{
  char *t;
  long v;

  if ((v = strtol (s, &t, 0)),
      t == s || *t != '\0') {
    /* . */
    return -1;
  }
  if (vp != 0) *vp = v;

  /* . */
  return 0;
}

...

   if (p_arg_long (str, &val) != 0) {
     /* not a long integer */
   }

  Though `errno' may be used for finer diagnostics.

> Would that make r.mapcalc accept integers bigger than int32?

  The sizes of the C types generally depend on the platform, but
  for the x86-based platforms I'm familiar with, long is 32 bits wide
  (i.e. sizeof (long) is 4).

  It may make sense to use `long long' and strtoll () where
  available.

[...]

Maciej Sieczka wrote:

>> The following silently forces 2147483647 on anything bigger:
>>
>> $ r.mapcalc 'map=9999999999999999999'
>> $ r.info -rt map
>> min=2147483647
>> max=2147483647
>> datatype=CELL

> That is down to the atoi() function used to parse integers.

Sorry if I got it wrong - is r.mapcalc limited to 32 bit integers? If
so, does it have to?

CELL maps are limited to 32-bit integers. There doesn't seem much
point in extending r.mapcalc to handle anything larger; too much work
for too little reward.

> I could change it to use strtol(), which sets errno to ERANGE on
> overflow.

Would that make r.mapcalc accept integers bigger than int32?

No; it would mean that out-of-range integers would produce an error
instead of being silently truncated to INT_MAX.

>> Trying the same by adding a decimal place separator, results in
>> something bizzare to me:
>>
>> $ r.mapcalc 'map=99999999999999999999999.0'
>> $ r.info -rt map
>> min=99999999999999991611392.000000
>> max=99999999999999991611392.000000
>> datatype=DCELL
>
> The presence of a dot causes the value to be parsed as a
> floating-point value using atof().
>
> A double has a precision of ~16 decimal digits, which matches what you
> see above.

So processing any number with more than 16 decimal digits in r.mapcalc
must yield such "strange" values?

The precision of a "double" corresponds to roughly 16 significant
decimal digits (it's exactly 53 binary digits, 52 of them stored explicitly).

The problem arises when something (in this case, r.info) tries to
print numbers where the number of digits to the left of the decimal
point exceeds the precision.

If you use exponential notation, you can limit the number of digits to
match the precision, so the problem doesn't arise.

And how many digits after the decimal
separator are safe?

Floating point numbers have a fixed relative error rather than a fixed
absolute error. The issue is the number of significant digits. The
position of the decimal point doesn't matter.
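
A short sketch of the point about significant digits, assuming an IEEE-754 double (for which DBL_DIG is 15); the helper name is hypothetical. Printing with a fixed number of significant digits hides the digits that carry no information:

```c
#include <float.h>
#include <stdio.h>

/* Format v with DBL_DIG significant digits, i.e. only the decimal
 * digits that a double is guaranteed to preserve.
 * Returns the length snprintf() reports. */
static int format_significant(char *buf, size_t size, double v)
{
    return snprintf(buf, size, "%.*g", DBL_DIG, v);
}
```

With this format, 99999999999999999999999.0 prints as 1e+23 rather than as 99999999999999991611392.000000; the stored value is the same in both cases, only the number of digits shown differs.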

>> At the edges of Float32 range other strange things happen:
>>
>> Shouldn't the below be rather
>> min,max=340282000000000000000000000000000000000.0?:
>>
>> $ r.mapcalc 'map=3.40282e38'
>> $ r.info -rt map
>> min=340282000000000014192072600942972764160.000000
>> max=340282000000000014192072600942972764160.000000
>> datatype=DCELL
>>
>> The the one below should be -340282000000000000000000000000000000000.0 I
>> guess:
>>
>> $ r.mapcalc 'map=-3.40282e38'
>> $ r.info -rt map
>> min=-340282000000000014192072600942972764160.000000
>> max=-340282000000000014192072600942972764160.000000
>> datatype=DCEL

> Again, you're running into the limits of the precision of the double
> type.
>
> Also, bear in mind that the conversion from a binary FP value to a
> string of decimal digits is being performed by r.info; r.mapcalc has
> no control over that part of the process.

Does that mean that actually in the raster the value is something else
than what r.info reports?

What is in the raster is a binary floating point value:

  http://en.wikipedia.org/wiki/IEEE_floating-point

E.g. 3.40282e38 is stored as

  (9007189542424620/(2^53))*(2^128)
= 9007189542424620*(2^(128-53))
= 340282000000000014192072600942972764160
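
The decomposition above can be reproduced with the standard frexp() and ldexp() functions. This is a sketch with a hypothetical helper name, assuming an IEEE-754 double and a positive, finite, normal input; on some systems the program may need to be linked with -lm.

```c
#include <math.h>
#include <stdint.h>

/* Split v into a 53-bit integer mantissa and a binary exponent so that
 * v == mantissa * 2^(exponent - 53). Assumes v is positive, finite
 * and normal. */
static int64_t decompose(double v, int *exponent)
{
    double frac = frexp(v, exponent);   /* frac is in [0.5, 1) */
    return (int64_t)ldexp(frac, 53);    /* exact integer in [2^52, 2^53) */
}
```

For 3.40282e38 the exponent comes out as 128, and multiplying the returned mantissa by 2^(128-53) gives back exactly the stored value.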

> Essentially, the %f format behaves badly for large numbers. If you use
> %e or %g, it will use exponential notation, with a specific relative
> precision rather than a specific absolute precision.

>> At bigger Float64 values r.mapcalc crashes:
>>
>> $r.mapcalc 'map=3.40282e110'
>>
>> *** stack smashing detected ***: r.mapcalc terminated
>> Aborted (core dumped)

> Right. It's sprintf()ing the value into a 64-byte buffer using %f.
>
> I'll change it to use %g:

That fixes it. Thanks. I have noticed that r.null has the same problem
though. Could more modules be affected?

Quite possibly.

To reproduce, please run the attached script
(having set the region big enough) and proceed until the last (Float64)
step. r.null will crash with "*** stack smashing detected ***".

That's a different problem. It's crashing in parse_d_mask_rule(), at
line 230:

    else if (sscanf (vallist,"%[^ -\t]-%lf", junk, &a) == 2)

The "%[^ -\t]" matches the entire string, which is 311 bytes long, but
the "junk" buffer is only 128 bytes. The fact that the value is
supposed to be a floating-point number doesn't actually matter.

If you use 1.79769e+308, it won't crash. Most sane people would use
exponential form rather than a >300-digit string (especially when only
~16 digits are significant).

The simplest fix is to add a maximum field width, e.g. "%100[^ -\t]".

BTW, that pattern is bogus. The dash (minus) is acting as a range
specifier, i.e. everything from space to tab inclusive. But space (32)
comes after tab (9). To match anything except a space, a dash or a
tab, the dash should come last.
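
Both suggestions can be combined in a small sketch. It is not the r.null code: the helper name and JUNK_SIZE are hypothetical, and only the leading token is scanned. The field width is one less than the buffer size to leave room for the terminating NUL, and the dash is placed last in the scanset so that it is taken literally.

```c
#include <stdio.h>

#define JUNK_SIZE 128

/* Copy the leading run of characters that are not a space, a tab or a
 * dash into junk. Returns 1 if at least one character matched, else 0. */
static int scan_leading_token(const char *input, char junk[JUNK_SIZE])
{
    /* "%127[...]" stores at most 127 characters plus the NUL, so a
     * 128-byte buffer cannot overflow however long the input is. */
    return sscanf(input, "%127[^ \t-]", junk) == 1;
}
```

A 311-character string of digits is then truncated to 127 characters instead of overrunning the buffer.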

--
Glynn Clements <glynn@gclements.plus.com>

Ivan Shmakov wrote:

>>>>> Maciej Sieczka <tutey@o2.pl> writes:

[...]

>> I could change it to use strtol(), which sets errno to ERANGE on
>> overflow.

  It may be better to check `*tailptr', like:

There's no point. The string which is being passed to atoi() is lex's
yytext variable, which is guaranteed to consist solely of decimal
digits:

  I [0-9]+
  
  ...
  
  {I} {
        yylval.ival = atoi(yytext);
        return INTEGER;
      }

strtol() only stops when it encounters an invalid character; it won't
stop just because it has read more than 9 digits. So, the endptr
argument will always point to the terminating NUL.

The issue is that atoi() doesn't provide any indication that it
clamped the result to INT_MAX (or INT_MIN), while strtol() sets errno.

> Would that make r.mapcalc accept integers bigger than int32?

  The sizes of the C types generally depend on the platform, but
  for the x86-based platforms I'm familiar with, long is 32 bits wide
  (i.e. sizeof (long) is 4).

  It may make sense to use `long long' and strtoll () where
  available.

You can't store anything larger than an "int" in a CELL map, so there
isn't much point in supporting 64-bit integers within r.mapcalc. If
you need more digits for intermediate values, you may as well just use
a double and convert the final result with the int() function.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn Clements wrote:

-2^31 (= 0x80000000 = -2147483648) is the null value
for the CELL type, so you'll never see that value in a map.

The FP nulls are the all-ones bit patterns. These correspond to NaN
according to the IEEE-754 formats, although it isn't the "default" NaN
pattern generated by most architectures (which is usually 7fc00000 or
ffc00000 for float and 7ff8000000000000 or fff8000000000000 for
double, i.e. an all-ones exponent, the top-bit of the mantissa set,
and either sign).

So far as arithmetic is concerned, any value with an all-ones exponent
and a non-zero mantissa is treated as NaN. But the GRASS
G_is_[fd]_null_value() functions only consider the all-ones bit
pattern to be null. I intend to change this in 7.x so that all FP NaN
values are treated as null. This will mean that code which can
generate NaNs doesn't have to explicitly convert them to the GRASS
null value.

These should go to "Raster data processing in GRASS GIS". As I don't really understand the part about floating point, I won't do it myself. Could somebody more savvy please do it, putting it in simpler words if possible?

Cheers,
Maciek

Below follow details about the CELL and DCELL datatypes in GRASS. It would be good to have them summarised in the GRASS raster intro IMHO, plus FCELL-specific notes. I'm not competent to do it - could anybody please do so?

questions by me, Glynn replies:

CELL maps are limited to 32-bit integers. There doesn't seem much
point in extending r.mapcalc to handle anything larger; too much work
for too little reward.

something bizarre to me:

$ r.mapcalc 'map=99999999999999999999999.0'
$ r.info -rt map
min=99999999999999991611392.000000
max=99999999999999991611392.000000
datatype=DCELL

A double has a precision of ~16 decimal digits, which matches what you
see above.

So processing any number with more than 16 decimal digits in r.mapcalc must yield such "strange" values?

The precision of a "double" corresponds to roughly 16 significant
decimal digits (it's exactly 53 binary digits, 52 of them stored explicitly).

The problem arises when something (in this case, r.info) tries to
print numbers where the number of digits to the left of the decimal
point exceeds the precision.

If you use exponential notation, you can limit the number of digits to
match the precision, so the problem doesn't arise.

And how many digits after the decimal separator are safe?

Floating point numbers have a fixed relative error rather than a fixed
absolute error. The issue is the number of significant digits. The
position of the decimal point doesn't matter.

Does that mean that actually in the raster the value is something else than what r.info reports?

What is in the raster is a binary floating point value:

  http://en.wikipedia.org/wiki/IEEE_floating-point

E.g. 3.40282e38 is stored as

  (9007189542424620/(2^53))*(2^128)
= 9007189542424620*(2^(128-53))
= 340282000000000014192072600942972764160

Maciej Sieczka wrote:

Below follow details about CELL and DCELL datatypes in GRASS. It would
be good to have them summarised in GRASS raster intro IMHO; + FCELL
specific notes. I'm not competent - anybody please do.

None of this belongs in the GRASS documentation, as it isn't specific
to GRASS. Anyone who wants to know the details of how floating point
works can look it up in a computing text book or on Wikipedia.

--
Glynn Clements <glynn@gclements.plus.com>

Glynn Clements wrote:

Maciej Sieczka wrote:

Below follow details about CELL and DCELL datatypes in GRASS. It would be good to have them summarised in GRASS raster intro IMHO; + FCELL specific notes. I'm not competent - anybody please do.

None of this belongs in the GRASS documentation, as it isn't specific
to GRASS. Anyone who wants to know the details of how floating point
works can look it up in a computing text book or on Wikipedia.

How does one know which datatype described on Wikipedia is GRASS CELL, FCELL and DCELL?

Maciek

Maciej Sieczka wrote:

Below follow details about CELL and DCELL datatypes in GRASS. It
would be good to have them summarised in GRASS raster intro IMHO;
+ FCELL specific notes.

...

>>> A double has a precision of ~16 decimal digits, which matches
>>> what you see above.

this stuff seems (to me) a bit too technical for the short "hello
raster" intro pages. is there another place we could put it?
libgis Doxygen comments -> the (under advertised) programmer's manual?
maybe that is too far away from the user..?

to take the idea further, when do you end describing how binary
computers deal with storing non-binary numbers? sure explain what GRASS
has used and the limits that imposes, maybe on a wiki page with links
to wikipedia articles on the general gotchas of modern C, x86,
endianness, etc. which may be natural to a computer scientist but
foreign to an ecologist expecting a number to be a number?

Hamish


Hi,
IMHO it's good to document the system's capabilities. The raster intro page
could have a short subsection, "Technical details", where the current raster
limitations are noted. Probably without long explanations (a link to
wikipedia/progman would be OK). Still, I think they should be in the user
docs: those who understand them will easily see whether they
have hit a GRASS ceiling, while those who don't will ignore
such a section, as they do with most documentation they don't understand
:)

Maris.

Maciej Sieczka wrote:

>> Below follow details about CELL and DCELL datatypes in GRASS. It would
>> be good to have them summarrised in GRASS raster intro IMHO; + FCELL
>> specific notes. I'm not competent - anybody please do.

> None of this belongs in the GRASS documentation, as it isn't specific
> to GRASS. Anyone who wants to know the details of how floating point
> works can look it up in a computing text book or on Wikipedia.

How does one know which datatype described on Wikipedia is GRASS CELL,
FCELL and DCELL?

Oh, it's reasonable enough to document that (if it isn't already):

  CELL 32-bit signed integer
  FCELL IEEE single-precision floating-point
  DCELL IEEE double-precision floating-point
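
The practical limits implied by these three types can be read from the standard C headers. The sketch below uses hypothetical stand-in typedefs (cell_t, fcell_t, dcell_t) rather than the real GRASS definitions, and assumes IEEE-754 float and double.

```c
#include <float.h>
#include <limits.h>
#include <stdint.h>

typedef int32_t cell_t;    /* stand-in for CELL:  32-bit signed integer */
typedef float   fcell_t;   /* stand-in for FCELL: IEEE single precision */
typedef double  dcell_t;   /* stand-in for DCELL: IEEE double precision */

/* Smallest power of two above which not every integer is exactly
 * representable, given the number of binary mantissa digits
 * (FLT_MANT_DIG for float, DBL_MANT_DIG for double). */
static double exact_integer_limit(int mant_digits)
{
    return (double)((uint64_t)1 << mant_digits);
}
```

Under these assumptions FLT_DIG is 6 and DBL_DIG is 15, and integers are exact up to 16777216 in a float and up to 9007199254740992 in a double.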

--
Glynn Clements <glynn@gclements.plus.com>

Hamish wrote:

> Below follow details about CELL and DCELL datatypes in GRASS. It
> would be good to have them summarrised in GRASS raster intro IMHO;
> + FCELL specific notes.
...
> >>> A double has a precision of ~16 decimal digits, which matches
> >>> what you see above.

this stuff seems (to me) a bit too technical for the short "hello
raster" intro pages. is there another place we could put it?

  http://en.wikipedia.org/wiki/Floating-point
  http://en.wikipedia.org/wiki/IEEE_754

libgis Doxygen comments -> the (under advertised) programmer's manual?
maybe that is too far away from the user..?

It doesn't belong in the programmer's manual any more than tutorials
on the C language or the Unix API do.

FWIW, the help file for Windows' calc.exe doesn't explain it either.

to take the idea further, when do you end describing how binary
computers deal with storing non-binary numbers? sure explain what GRASS
has used and the limits that imposes, maybe on a wiki page with links
to wikipedia articles on the general gotchas of modern C, x86,
endianness, etc. which may be natural to a computer scientist but
foreign to an ecologist expecting a number to be a number?

Where do you stop? Do we include tutorials on using bash (or cmd.exe
for Windows users), Unix filesystem permissions, grep/sed/awk (those
are useful for processing input and output to/from various GRASS
commands), statistics, ...?

--
Glynn Clements <glynn@gclements.plus.com>

I very much agree with Glynn on this one; I have already written to Maciek
something along these lines.

If you are going to explain in the GRASS man pages how computers handle
numbers, should we then also explain (e.g. as part of r.slope.aspect)
how elevation is measured, what the accuracy of different technologies is,
what the impact of resolution on the slope estimate is, etc.? There would be no end,
because you would need to cover not only GIS but also computer science, remote sensing,
physics ...

BTW the info about CELL, FCELL and DCELL is in the man pages.

Helena

P.S. And regarding the requirement that more than 16 digits be supported,
which got all of this started - where do you need it? I am sure there are
applications where you do, especially in numerical simulations, but as
an example (if I am not wrong), you would need that many digits to measure
the distance between the moon and the earth with micron accuracy.

On Feb 27, 2008, at 8:09 AM, Glynn Clements wrote:

[...]