[GRASS-dev] nasty export of doubles

Hi,

this change
  http://trac.osgeo.org/grass/changeset/31236
introduced an unfortunate problem, whereby imported doubles have their
system-level precision exposed (%.14f).

the general goal is that upon presentation by d.what.vect, v.db.select, or
whatever, trailing 0s are nicely snipped away à la G_trim_decimal(); and
moreover that in reasonable cases input == output. That's reassuring
evidence that the rest of the internals are ok; otherwise it is hard to
spot when something has gone wrong.
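
A minimal sketch of the "snip trailing 0s" idea, for illustration only. The names trim_trailing_zeros() and format_trimmed() are hypothetical; this is not the source of G_trim_decimal(), just one way such trimming could be done after a plain "%f" conversion.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT the GRASS G_trim_decimal() source:
 * strips trailing zeros (and a then-bare '.') from a fixed-notation number. */
void trim_trailing_zeros(char *s)
{
    char *dot, *end;

    assert(s != NULL);

    dot = strchr(s, '.');
    if (dot == NULL || strpbrk(s, "eE") != NULL)
        return;             /* no fraction, or exponent form: leave as is */

    end = s + strlen(s);
    while (end > dot + 1 && end[-1] == '0')
        end--;              /* drop trailing zeros of the fraction */
    if (end == dot + 1)
        end = dot;          /* fraction was all zeros: drop the '.' too */
    *end = '\0';
}

/* Hypothetical helper: format with plain "%f" (6 decimals), then trim.
 * Returns the final string length, or -1 if the buffer is too small. */
int format_trimmed(double value, char *buf, size_t len)
{
    int n;

    assert(buf != NULL && len > 0);

    n = snprintf(buf, len, "%f", value);
    if (n < 0 || (size_t)n >= len)
        return -1;          /* encoding error or buffer too small */
    trim_trailing_zeros(buf);
    return (int)strlen(buf);
}
```

With the sample northing 5498850.4 this yields "5498850.4", and 407.0 comes back as "407", matching the input table.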

compare input:

cat|station|sounder_m|lat|lon|easting|northing
1|DC|94.3|-45°27.4492'|167°09.2427'|2053056.83|5511584.74
2|CA|86.4|-45°24.2306'|166°59.0481'|2039364.05|5516567.25
3|MR|407|-45°18.6105'|166°57.9204'|2037128.89|5526840.51
4|BA|167.6|-45°33.5942'|166°54.6907'|2034985.01|5498850.4
5|SC|168.2|-45°43.6784'|166°55.8538'|2037880.75|5480334.83
6|LS|365.9|-45°59.0538'|166°48.5873'|2030638.43|5451227.37

with old lib/db/dbmi_base/valuefmt.c (%lf) output via v.db.select:

cat|station|sounder_m|lat|lon|easting|northing
1|DC|94.3|-45°27.4492'|167°09.2427'|2053056.83|5511584.74
2|CA|86.4|-45°24.2306'|166°59.0481'|2039364.05|5516567.25
3|MR|407|-45°18.6105'|166°57.9204'|2037128.89|5526840.51
4|BA|167.6|-45°33.5942'|166°54.6907'|2034985.01|5498850.4
5|SC|168.2|-45°43.6784'|166°55.8538'|2037880.75|5480334.83
6|LS|365.9|-45°59.0538'|166°48.5873'|2030638.43|5451227.37

(they're identical)

and then with the current valuefmt.c (%.14f) output:

cat|station|sounder_m|lat|lon|easting|northing
1|DC|94.3|-45°27.4492'|167°09.2427'|2053056.83000000007451|5511584.74000000022352
2|CA|86.40000000000001|-45°24.2306'|166°59.0481'|2039364.05000000004657|5516567.25
3|MR|407|-45°18.6105'|166°57.9204'|2037128.88999999989755|5526840.50999999977648
4|BA|167.59999999999999|-45°33.5942'|166°54.6907'|2034985.01000000000931|5498850.40000000037253
5|SC|168.19999999999999|-45°43.6784'|166°55.8538'|2037880.75|5480334.83000000007451
6|LS|365.89999999999998|-45°59.0538'|166°48.5873'|2030638.42999999993481|5451227.37000000011176

but if we use %lf, would anything beyond %.6f be rounded/truncated??
As demonstrated by northing 5498850.40000000037253 above, using %.14f
just guarantees 14 digits after the "."; it does not add any more
significant digits.

so, given that projected coordinates in the range of millions, to 3
decimal places (mm), will be common -- how do we tell the string to be
a max of 15 digits total, rather than assume some fixed number AFTER
the "." will cover it? is there some %15.15f or so?

In the past we tried %g there, but it was not suitable because the
easting/northing would jump into e+06, which is no good. (and
v.in.ascii created db tables will usually include easting and
northings)

I don't know if there is a "right" answer here, my vote would be to
go back to %lf as the lesser of the two evils.

again, this is about exposed numbers (to a DB or ascii), not
internally stored doubles for calculation, and I'd expect the
current %.14f to mislead some SQL (where FOO > 10.0) style query
sooner rather than later.

related; this change looks more reasonable:
  http://trac.osgeo.org/grass/changeset/25355

fwiw, %.8f should give about 1mm for lat/lon, which is beyond current
RTK GPS capability and what is meaningful for a lot of projection
math (as we are constantly reminded on the proj4 list).

while keeping in mind that people will press grass into non-geographic
tasks that we haven't considered, where those small ranges could be
useful. (although in those cases I'd hope they'd edit PROJ_UNITS to
use microns or whatever)

thanks,
Hamish

Hamish wrote:

> but if we use %lf, would anything beyond %.6f be rounded/truncated??

Yep.

> As demonstrated by northing 5498850.40000000037253 above, using %.14f
> just guarantees 14 digits after the "."; it does not add any more
> significant digits.
>
> so, given that projected coordinates in the range of millions, to 3
> decimal places (mm), will be common -- how do we tell the string to be
> a max of 15 digits total, rather than assume some fixed number AFTER
> the "." will cover it? is there some %15.15f or so?

Nope. The closest is %.15g, which will use up to 15 significant digits
(use %#.15g to avoid stripping trailing zeroes), but it will use
exponential notation if the exponent is greater than or equal to the
precision (15 in this case) or less than -4.
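
The %.15g behaviour described above can be checked with a small sketch. The wrapper names format_g15() and format_g15_keep_zeros() are hypothetical and only wrap snprintf(); the outputs quoted afterwards assume a C99-conforming printf and IEEE 754 doubles.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrappers for illustration; not GRASS DBMI functions. */

/* Up to 15 significant digits; printf itself strips trailing zeros. */
int format_g15(double value, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%.15g", value);
}

/* Same, but the '#' flag keeps trailing zeros (always 15 significant digits). */
int format_g15_keep_zeros(double value, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%#.15g", value);
}
```

Under those assumptions, format_g15() turns 5498850.4 into "5498850.4" and 86.4 into "86.4", while format_g15_keep_zeros() gives "5498850.40000000". The exponential switch shows up at both ends: 1e15 becomes "1e+15" and 0.00001 becomes "1e-05", whereas 1e14 and 0.0001 stay in fixed notation.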

If you want a fixed number of significant digits and no exponential
notation, you have to do it yourself, e.g. use %.100f and take the
first N characters or everything up to the decimal point, whichever is
longer.
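
A rough sketch of that do-it-yourself approach, taken literally: print with %.100f, then keep the first N characters or the whole integer part, whichever is longer. The function name format_fixed_prefix() and the 512-byte scratch buffer are assumptions made for this example, and N counts characters (sign and "." included), not digits.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not GRASS API): fixed notation, cut after `nchars`
 * characters, but never inside the integer part.
 * Returns the string length, or -1 on error / too-small buffer. */
int format_fixed_prefix(double value, size_t nchars, char *buf, size_t len)
{
    char tmp[512];  /* largest double: ~309 digits + '.' + 100 decimals */
    const char *dot;
    size_t int_len, keep;
    int n;

    assert(buf != NULL && len > 0);

    n = snprintf(tmp, sizeof tmp, "%.100f", value);
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;

    /* Length of sign + integer digits ("inf"/"nan" contain no '.') */
    dot = strchr(tmp, '.');
    int_len = dot ? (size_t)(dot - tmp) : (size_t)n;

    keep = nchars > int_len ? nchars : int_len;
    if (keep > (size_t)n)
        keep = (size_t)n;
    if (keep > 0 && tmp[keep - 1] == '.')
        keep--;             /* do not end on a bare '.' */
    if (keep >= len)
        return -1;

    memcpy(buf, tmp, keep);
    buf[keep] = '\0';
    return (int)keep;
}
```

With N = 16 the northing 5498850.4 comes out as "5498850.40000000", and 1e20 keeps all 21 integer digits. Note that slicing truncates rather than rounds, so 167.6 comes out as "167.599999999999"; a version that rounds would need extra work.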

> In the past we tried %g there, but it was not suitable because the
> easting/northing would jump into e+06, which is no good. (and
> v.in.ascii created db tables will usually include easting and
> northings)

If you use a larger precision, the exponent at which it switches to
exponential notation will increase accordingly. For coordinates, %.8g
will be sufficient, but values very close to zero (less than 0.1mm)
will still use exponential notation.
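
To see how the switch point moves with the precision, here is a small sketch using %.*g. The wrapper name format_g() is hypothetical; the outputs quoted afterwards assume a C99-conforming printf.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper: "%g" with a caller-chosen number of
 * significant digits (precision 6 is what plain "%g" uses). */
int format_g(double value, int precision, char *buf, size_t len)
{
    assert(buf != NULL && len > 0 && precision > 0);
    return snprintf(buf, len, "%.*g", precision, value);
}
```

For the sample easting 2053056.83, precision 6 gives "2.05306e+06", precision 8 gives "2053056.8" and precision 15 gives "2053056.83". So 8 significant digits avoid the exponent for values in the millions, but keep only one decimal of this particular value. At the small end the threshold does not move: with precision 8, 0.0001 prints as "0.0001" while 0.00005 prints as "5e-05".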

> I don't know if there is a "right" answer here, my vote would be to
> go back to %lf as the lesser of the two evils.

Too much precision is ugly. Too little precision makes it impossible
to use small values. A general-purpose conversion routine such as this
needs to handle the full range of floating-point values, which
basically means %.15g.

If you want neater output, the module needs to perform its own
formatting based upon the expected range of the data. The module knows
what the values represent, the DBMI library doesn't.
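
As an illustration of module-side formatting, a sketch in which the caller states what the value represents. The function format_coordinate() and its is_latlon flag are hypothetical; the chosen precisions (3 decimals for projected metres, 8 decimals for degrees) are assumptions taken from the figures mentioned in this thread, not from any GRASS module.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical module-side formatter (not GRASS API): the caller knows
 * whether the value is a lat/lon degree or a projected coordinate. */
int format_coordinate(double value, int is_latlon, char *buf, size_t len)
{
    /* Assumed precisions: 8 decimals (~1 mm) for degrees,
     * 3 decimals (mm) for projected metres. */
    int decimals = is_latlon ? 8 : 3;

    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%.*f", decimals, value);
}
```

For example, the easting 2053056.83 would print as "2053056.830" and a longitude of 167.154045 as "167.15404500"; trailing zeros could then be trimmed separately if desired.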

> again, this is about exposed numbers (to a DB or ascii), not
> internally stored doubles for calculation, and I'd expect the
> current %.14f to mislead some SQL (where FOO > 10.0) style query
> sooner rather than later.
>
> related; this change looks more reasonable:
>   http://trac.osgeo.org/grass/changeset/25355

Yeah, but in that situation, you know that the values are coordinates,
so you know the approximate range and precision of the values.

FP values stored in a database don't have to be coordinates. In fact,
they probably *aren't* coordinates.

--
Glynn Clements <glynn@gclements.plus.com>