[GRASS-dev] [GRASS GIS] #335: export floats and doubles with correct precision

#335: export floats and doubles with correct precision
-----------------------+----------------------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: task | Status: new
Priority: critical | Milestone: 6.4.3
Component: Default | Version: svn-develbranch6
Keywords: precision | Platform: All
      Cpu: All |
-----------------------+----------------------------------------------------

Comment(by mmetz):

Replying to [comment:19 neteler]:
> Please backport to 6.4 since not much testing happens in 6.5 (but in
> 6.4.svn).
> (I see that many backports are missing!).

Glynn: Binary to text conversions must use %.9g (float) and %.17g (double)
in order to preserve fp values in a binary-decimal-binary round-trip, e.g.
r.out.ascii followed by r.in.ascii.
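
A minimal sketch of that round-trip claim, in Python, whose % operator takes the same conversion specifiers as C's printf. The helper names are invented for illustration and are not GRASS functions; single precision is emulated with the struct module.

```python
import struct

def to_f32(x):
    """Round a Python float (an IEEE double) to the nearest IEEE single."""
    return struct.unpack("f", struct.pack("f", x))[0]

def double_roundtrips(x, fmt):
    """True if formatting a double with fmt and parsing it back gives x."""
    return float(fmt % x) == x

def single_roundtrips(x, fmt):
    """Same check for a single-precision value held in a Python float."""
    x32 = to_f32(x)
    return to_f32(float(fmt % x32)) == x32

d = 0.1 + 0.2          # a double that is not a short decimal
s = 1.0 + 2.0 ** -23   # the single-precision value just above 1.0

print(double_roundtrips(d, "%.17g"), double_roundtrips(d, "%.15g"))  # True False
print(single_roundtrips(s, "%.9g"), single_roundtrips(s, "%.7g"))    # True False
```

These two values are only examples where the shorter formats fail; many values happen to survive %.15g or %.7g, which is why the loss is easy to miss.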

The modules [r|v].out.ascii need to be added to the list.

Markus M

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/335#comment:21>
GRASS GIS <http://grass.osgeo.org>

#335: export floats and doubles with correct precision
-----------------------+----------------------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: task | Status: new
Priority: critical | Milestone: 6.4.4
Component: Default | Version: svn-develbranch6
Keywords: precision | Platform: All
      Cpu: All |
-----------------------+----------------------------------------------------
Changes (by hamish):

  * milestone: 6.4.3 => 6.4.4

Comment:

continued in 6.4.4 ...

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/335#comment:22>
GRASS GIS <http://grass.osgeo.org>

#335: export floats and doubles with correct precision
-----------------------+----------------------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: task | Status: new
Priority: critical | Milestone: 6.4.4
Component: Default | Version: svn-develbranch6
Keywords: precision | Platform: All
      Cpu: All |
-----------------------+----------------------------------------------------

Comment(by hamish):

Replying to [comment:21 mmetz]:
> Glynn: Binary to text conversions must use %.9g (float) and %.17g
> (double) in order to preserve fp values in a binary-decimal-binary
> round-trip, e.g. r.out.ascii followed by r.in.ascii.

take care that preserving b-d-b round trip, on a single platform, is not
the only task or consideration. For the r.out.ascii + r.in.ascii round
trip it may well be appropriate, but while conceding that point I'd argue
that r.*.bin or r.*.mat would be a better choice in those cases.

%.15,7g was chosen because it's perfectly reproducible and doesn't
introduce useless .0000000000000001 crap into the data files which
G_trim_decimal() can't handle. For things like r.univar, r.info, and
v.db.select the output is at least in part for human consumption; there's
no practical need to expose the FP noise. The main thing for us to focus
on I think is all the remaining lossy %f and meaningless %.56f type stuff
in the code, not splitting hairs over preserving quanta finer than
GRASS_EPSILON.

wrt lib/gis/color_write.c, look closely & you'll see there is a +/-
GRASS_EPSILON adjustment on the range to ensure that the rounding exceeds
the range, and so you shouldn't ever get white spots at the peaks and
pits.

best,
Hamish

(this is something I hope python cleans up with 3.0)

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/335#comment:23>
GRASS GIS <http://grass.osgeo.org>

#335: export floats and doubles with correct precision
-----------------------+----------------------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: task | Status: new
Priority: critical | Milestone: 6.4.4
Component: Default | Version: svn-develbranch6
Keywords: precision | Platform: All
      Cpu: All |
-----------------------+----------------------------------------------------

Comment(by hamish):

I wrote:
> look closely & you'll see there is a +/- GRASS_EPSILON adjustment
> on the range

oops, I used it wrong, you already fixed it.
` (2^1023.9999999999999 - 2^1023.9999999999998) > GRASS_EPSILON `
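
A rough illustration of that inequality, assuming GRASS_EPSILON = 1.0e-15 as stated elsewhere in this ticket: the gap between neighbouring doubles grows with magnitude, so a fixed epsilon is far below one step at the top of the range. math.ulp() needs Python 3.9 or later.

```python
import math
import sys

GRASS_EPSILON = 1.0e-15  # value as stated elsewhere in this ticket

# math.ulp(x) is the spacing of doubles at x.
gap_near_one = math.ulp(1.0)                 # 2**-52, about 2.2e-16
gap_near_max = math.ulp(sys.float_info.max)  # 2**971, about 2.0e292

print(gap_near_one < GRASS_EPSILON)  # True
print(gap_near_max > GRASS_EPSILON)  # True
```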

Hamish

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/335#comment:24>
GRASS GIS <http://grass.osgeo.org>

#335: export floats and doubles with correct precision
-----------------------+----------------------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: task | Status: new
Priority: critical | Milestone: 6.4.4
Component: Default | Version: svn-develbranch6
Keywords: precision | Platform: All
      Cpu: All |
-----------------------+----------------------------------------------------

Comment(by mmetz):

Replying to [comment:23 hamish]:
> Replying to [comment:21 mmetz]:
> > Glynn: Binary to text conversions must use %.9g (float) and %.17g
> > (double) in order to preserve fp values in a binary-decimal-binary
> > round-trip, e.g. r.out.ascii followed by r.in.ascii.
>
> take care that preserving b-d-b round trip ,on a single platform, is not
> the only task or consideration. For the r.out.ascii + r.in.ascii round
> trip it may well be appropriate, but while conceding that point I'd argue
> that r.*.bin or r.*.mat would be a better choice in those cases.

Please add this information to the manuals if you think this is crucial.

>
> %.15,7g was chosen because it's perfectly reproducible

No, it is not. %.17,9g is reproducible. See IEEE 754 standard.

> and doesn't introduce useless .0000000000000001 crap

On what platform do you get this crap? %.17,9g does not produce this crap.

> into the data files which G_trim_decimal() can't handle.

G_trim_decimal() is not needed for %.17,9g.

> The main thing for us to focus on I think is all the remaining lossy %f
> and meaningless %.56f type stuff in the code

Sure, therefore %.17,9g.

> not splitting hairs over preserving quanta finer than GRASS_EPSILON.

Huh? This thread is not about GRASS_EPSILON. GRASS_EPSILON is 1.0e-15. A
perfectly valid range of double fp data is e.g. 1.0e-200 to 1.0e-300, way
smaller than GRASS_EPSILON. You do not want to corrupt these data.
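
A small sketch of this point, using the GRASS_EPSILON value quoted above. The two sample values are the ones named in the paragraph; nothing here is GRASS code.

```python
GRASS_EPSILON = 1.0e-15  # value quoted above

small = [1.0e-200, 1.0e-300]

# An absolute tolerance cannot tell these two values apart ...
print(abs(small[0] - small[1]) < GRASS_EPSILON)  # True

# ... and %f flattens both to zero, while %.17g keeps them intact.
as_f = ["%f" % v for v in small]
as_g = ["%.17g" % v for v in small]
print(as_f)
print([float(t) == v for t, v in zip(as_g, small)])
```

The %g family keeps a fixed number of significant digits, so the magnitude of the data does not matter for the round trip.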

BTW, I found that fp calculation errors are rather in the range of 1.0e-4
to 1.0e-8.

Markus M

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/335#comment:25>
GRASS GIS <http://grass.osgeo.org>

#335: export floats and doubles with correct precision
-----------------------+----------------------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: task | Status: new
Priority: critical | Milestone: 6.4.4
Component: Default | Version: svn-develbranch6
Keywords: precision | Platform: All
      Cpu: All |
-----------------------+----------------------------------------------------

Comment(by hamish):

> > %.15,7g was chosen because it's perfectly reproducible
>
> No, it is not. %.17,9g is reproducible. See IEEE 754 standard.

I didn't mean reproducible as far as the binary stored value round-trip
was concerned, I meant reproducible as far as getting the same ascii
result on two different hardware platforms.

for example, getting r.md5sum (addon to aid the test_suite) to give the
same (portable) result when checksumming r.out.ascii output when the data
is stored as floating point &/or not strictly real (2.5000000). My
understanding, for what it is, is that the least sig. digits can flicker
depending on the CPU arch, perhaps the compiler, and programming
language's implementation too.

I can't remember which right now, but one of the ogr/gdal/proj4 input
streams was introducing .00000000001 style artifacts. It's probably worth
digging that up as a real-world test case, as there are probably two
classes of this problem with the same symptom: one to do with GRASS's use
of %f and atof() in places, and another to do with the output of input
values which were already malformed before they came into GRASS.

Another interesting corner-case example to try with this is the earthquake
demo from the grass-promo/tutorials/ svn, where the heavily logarithmic
data scale within a single dataset really stretches what fits well in the
IEEE FP model, and so something more linear like magnitude is used to
store it instead of raw exajoules.

> A perfectly valid range of double fp data is e.g. 1.0e-200
> to 1.0e-300, way smaller than GRASS_EPSILON. You do not want to
> corrupt these data.

(see the following post with the `2^1024` example where I corrected myself
about that after seeing how you fixed the error in my earlier attempt to
make sure the color min/max rule was bigger than the data range)

regards,
Hamish

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/335#comment:26>
GRASS GIS <http://grass.osgeo.org>

#335: export floats and doubles with correct precision
-----------------------+----------------------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: task | Status: new
Priority: critical | Milestone: 6.4.4
Component: Default | Version: svn-develbranch6
Keywords: precision | Platform: All
      Cpu: All |
-----------------------+----------------------------------------------------

Comment(by glynn):

Replying to [comment:26 hamish]:

> > No, it is not. %.17,9g is reproducible. See IEEE 754 standard.
>
> I didn't mean reproducible as far as the binary stored value round-trip
> was concerned, I meant reproducible as far as getting the same ascii
> result on two different hardware platforms.

All common platforms use IEEE-754 representation, so I wouldn't expect
differences due to hardware.

> My understanding, for what it is, is that the least sig. digits can
> flicker depending on the CPU arch, perhaps the compiler, and programming
> language's implementation too.

Any differences are due to software, not hardware. The main reasons for
differences are:

1. Whether the software produces the closest decimal value to the actual
binary value, or the shortest decimal value which would convert to the
actual binary value (those two aren't necessarily the same).

2. The rounding mode used in the event of a tie. E.g. 3.0/16=0.1875; if
that is displayed with 3 fractional digits, should the result be 0.187
(round toward zero, round toward negative infinity) or 0.188 (round toward
positive infinity, round toward nearest even value)?

3. Bugs. Many programmers don't understand the details of floating-point,
and don't particularly care whether the least-significant digits are
"correct" (particularly as that would require deciding which flavour of
"correct" you actually want).
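
The tie in point 2 can be reproduced with Python's decimal module, which lets the rounding mode be chosen explicitly. This is only an illustration of the four modes named there, not a statement about what any particular C library does.

```python
from decimal import (Decimal, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR,
                     ROUND_HALF_EVEN)

x = Decimal(3.0 / 16)  # 0.1875 is exactly representable in binary

modes = {
    "toward zero": ROUND_DOWN,
    "toward -inf": ROUND_FLOOR,
    "toward +inf": ROUND_CEILING,
    "nearest even": ROUND_HALF_EVEN,
}
# Round to 3 fractional digits under each mode.
results = {name: str(x.quantize(Decimal("0.001"), rounding=mode))
           for name, mode in modes.items()}
for name, text in results.items():
    print(name, text)
# toward zero 0.187
# toward -inf 0.187
# toward +inf 0.188
# nearest even 0.188
```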

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/335#comment:27>
GRASS GIS <http://grass.osgeo.org>

#335: export floats and doubles with correct precision
-----------------------+----------------------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: task | Status: new
Priority: critical | Milestone: 6.4.3
Component: Default | Version: svn-develbranch6
Keywords: precision | Platform: All
      Cpu: All |
-----------------------+----------------------------------------------------
Changes (by neteler):

  * milestone: 6.4.4 => 6.4.3

Comment:

Replying to [comment:27 glynn]:
> Replying to [comment:26 hamish]:
> > Replying to mmetz:
> > > No, it is not. %.17,9g is reproducible. See IEEE 754 standard.
> >
> > I didn't mean reproducible as far as the binary stored value round-
trip was concerned, I meant reproducible as far as getting the same ascii
result on two different hardware platforms.
>
> All common platforms use IEEE-754 representation, so I wouldn't expect
> differences due to hardware.

Should r55122 be reverted then?

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/335#comment:28>
GRASS GIS <http://grass.osgeo.org>

#335: export floats and doubles with correct precision
-----------------------+----------------------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: task | Status: new
Priority: critical | Milestone: 6.4.3
Component: Default | Version: svn-develbranch6
Keywords: precision | Platform: All
      Cpu: All |
-----------------------+----------------------------------------------------

Comment(by mmetz):

Replying to [comment:28 neteler]:
> Replying to [comment:27 glynn]:
> > Replying to [comment:26 hamish]:
> > > Replying to mmetz:
> > > > No, it is not. %.17,9g is reproducible. See IEEE 754 standard.
> > >
> > > I didn't mean reproducible as far as the binary stored value round-
trip was concerned, I meant reproducible as far as getting the same ascii
result on two different hardware platforms.
> >
> > All common platforms use IEEE-754 representation, so I wouldn't expect
> > differences due to hardware.
>
> Should r55122 be reverted then?

Reverted in r55183.

Markus M

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/335#comment:29>
GRASS GIS <http://grass.osgeo.org>

#335: export floats and doubles with correct precision
-----------------------+----------------------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: task | Status: new
Priority: critical | Milestone: 6.4.4
Component: Default | Version: svn-develbranch6
Keywords: precision | Platform: All
      Cpu: All |
-----------------------+----------------------------------------------------
Changes (by hamish):

  * milestone: 6.4.3 => 6.4.4

Comment:

to be continued ..

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/335#comment:30>
GRASS GIS <http://grass.osgeo.org>

#335: export floats and doubles with correct precision
-----------------------+----------------------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: task | Status: new
Priority: critical | Milestone: 6.4.4
Component: Default | Version: svn-develbranch6
Keywords: precision | Platform: All
      Cpu: All |
-----------------------+----------------------------------------------------

Comment(by mlennert):

Replying to [comment:30 hamish]:
> to be continued ..

Hamish, what exactly needs to be continued here? Is it fixing those
modules on your list that haven't been fixed yet? Or is it the
discussion on how to handle decimals?

Moritz

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/335#comment:31>
GRASS GIS <http://grass.osgeo.org>

#335: export floats and doubles with correct precision
-----------------------+----------------------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: task | Status: new
Priority: critical | Milestone: 6.4.5
Component: Default | Version: svn-develbranch6
Keywords: precision | Platform: All
      Cpu: All |
-----------------------+----------------------------------------------------
Changes (by hamish):

  * milestone: 6.4.4 => 6.4.5

Comment:

Replying to [comment:31 mlennert]:
> Hamish, what exactly needs to be continued here ?

the systematic audit and repair of modules printf'ing double and single
precision FP variables with inappropriate %.*g or %.*f, either arbitrarily
fixed-precision or using a precision appropriate for the data type.

> Is it fixing those modules on your list that haven't been fixed, yet ?

yes, exactly. Spin boxes and tables (e.g. the rectifier) throughout the wxGUI
are related, but probably should have their own ticket.

> Or is it the discussion on how to handle decimals ?

I doubt that has an end, but at least we can all agree that %.52f is
meaningless and %f is too lossy.
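
To make the two extremes concrete, a short sketch with made-up sample values (a coordinate-like number and a small one); it only exercises the format strings, not any GRASS module.

```python
x = 1234567.123456789   # e.g. a projected coordinate (made-up value)
tiny = 0.000012345678   # a small made-up value

lossy_x = "%f" % x          # %f keeps only 6 digits after the point
lossy_tiny = "%f" % tiny
padded = "%.52f" % 0.1      # 52 digits, only about 17 carry information

print(lossy_x, float(lossy_x) == x)           # 1234567.123457 False
print(lossy_tiny, float(lossy_tiny) == tiny)  # 0.000012 False
print(padded)
print(float(padded) == float("%.17g" % 0.1))  # True
```

The long %.52f string parses back to the same double as the 17-digit one, so the extra digits add file size but no information.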

The question of whether to output values with slight rounding, user
selectable precision, or try for exact ascii representation of IEEE double
FP is I think a case by case question.

e.g. %.15g will avoid outputting coordinates like `278.700000000000045
44.8999999999999986`, which are human-ugly and bloat the ascii filesize a
lot, even though they are reversible to the same IEEE double FP value.
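
A quick way to see where %.15g is safe and where it is not, sketched in Python. The coordinates are the two from the example above; the third value is a made-up arithmetic result. Python 3's repr() is included because it prints the shortest string that still converts back exactly.

```python
coords = [278.7, 44.9]   # values that were typed in with few digits
computed = 0.1 + 0.2     # a value that came out of arithmetic

for v in coords + [computed]:
    short = "%.15g" % v
    full = "%.17g" % v
    # Print each form and whether it parses back to the same double.
    print(short, float(short) == v, full, float(full) == v, repr(v))
```

Values that started life as decimals with at most 15 significant digits come back unchanged from %.15g; values produced by computation may not, which is the case %.17g is meant to cover.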

fwiw I think we can achieve good defaults so user selectable precision is
rarely needed (success is if no one ever thinks to ask for it, since the
right thing was done automatically), and there are times like the colr/
file when %.15g max and min range rounded outwards by 1 epsilon step can
tidy up the out of range trouble which happens from time to time.
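
One possible way to get a 15-digit range that never ends up narrower than the data is directed rounding of the decimal text. The sketch below uses Python's decimal module; it is an assumption-laden illustration with a hypothetical helper name, and it is not what lib/gis/color_write.c does (it rounds the text outward instead of adding an epsilon step).

```python
from decimal import Context, Decimal, ROUND_CEILING, ROUND_FLOOR

def outward_range_text(lo, hi, digits=15):
    """Format (lo, hi) with `digits` significant digits, never shrinking the range."""
    floor_ctx = Context(prec=digits, rounding=ROUND_FLOOR)
    ceil_ctx = Context(prec=digits, rounding=ROUND_CEILING)
    # Decimal(float) is exact, so the only rounding is the outward one.
    lo_txt = str(floor_ctx.create_decimal(Decimal(lo)))
    hi_txt = str(ceil_ctx.create_decimal(Decimal(hi)))
    return lo_txt, hi_txt

lo, hi = 0.1 + 0.2, 2.0 / 3.0   # made-up data minimum and maximum
lo_txt, hi_txt = outward_range_text(lo, hi)
print(lo_txt, hi_txt)
print(float(lo_txt) <= lo, float(hi_txt) >= hi)  # True True
```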

the bug is of high priority as data input/output needs to be flawless, or
of such minor jitter that in most cases a nanometer here or there won't
harm the overall result.

regards,
Hamish

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/335#comment:32>
GRASS GIS <http://grass.osgeo.org>

#335: export floats and doubles with correct precision
-----------------------+------------------------------
  Reporter: hamish | Owner: grass-dev@…
      Type: task | Status: new
  Priority: critical | Milestone: 6.4.6
Component: Default | Version: svn-develbranch6
Resolution: | Keywords: precision
       CPU: All | Platform: All
-----------------------+------------------------------

Comment (by mlennert):

This is probably relevant for GRASS 7 as well, no?

So probably the Version and Milestone settings should be pushed up to
trunk in order to raise the visibility of this issue.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/335#comment:35>
GRASS GIS <http://grass.osgeo.org>