#123: r.in.xyz: import bug when using scanned extent
---------------------+------------------------------------------------------
Reporter: neteler | Owner: grass-dev@lists.osgeo.org
Type: defect | Status: new
Priority: major | Milestone: 6.4.0
Component: default | Version: svn-trunk
Keywords: |
---------------------+------------------------------------------------------
When using r.in.xyz in script style, points may get lost during import:
Import script:
{{{
# for i in */*.txt
for i in d325095655/p325099657.txt
do
REGION=`r.in.xyz d325095655/p325099657.txt out=dummy fs=space -i --o -sg
| cut -d' ' -f1-4`
g.region $REGION res=1 -p
r.in.xyz $i out=`basename $i` fs=space -i --o
done
}}}
Result:
{{{
GRASS 6.3.0svn (pat):~/data/lidar_PAT_raw/raw > sh import_all.sh
projection: 1 (UTM)
zone: 32
datum: wgs84
ellipsoid: wgs84
north: 5099457.35
south: 5099071.19
west: 657729.49
east: 657953.44
nsres: 1.00041451
ewres: 0.99977679
rows: 386
cols: 224
cells: 86464
Scanning data ...
Writing to map ...
100%
r.in.xyz complete. 3 points found in region.
}}}
#123: r.in.xyz: import bug when using scanned extent
---------------------+------------------------------------------------------
Reporter: neteler | Owner: grass-dev@lists.osgeo.org
Type: defect | Status: new
Priority: major | Milestone: 6.4.0
Component: default | Version: svn-trunk
Keywords: |
---------------------+------------------------------------------------------
When using r.in.xyz in script style, points may get lost during import:
Import script:
{{{
# for i in */*.txt
for i in d325095655/p325099657.txt
do
REGION=`r.in.xyz d325095655/p325099657.txt out=dummy fs=space -i --o -sg
| cut -d' ' -f1-4`
g.region $REGION res=1 -p
r.in.xyz $i out=`basename $i` fs=space -i --o
done
}}}
Result:
{{{
GRASS 6.3.0svn (pat):~/data/lidar_PAT_raw/raw > sh import_all.sh
projection: 1 (UTM)
zone: 32
datum: wgs84
ellipsoid: wgs84
north: 5099457.35
south: 5099071.19
west: 657729.49
east: 657953.44
nsres: 1.00041451
ewres: 0.99977679
rows: 386
cols: 224
cells: 86464
Scanning data ...
Writing to map ...
100%
r.in.xyz complete. 3 points found in region.
}}}
Which one? If it's the first line (the southernmost point), that's
probably due to this (raster/r.in.xyz/main.c:588):
if (y <= pass_south || y > pass_north) {
The problem with simply changing the test to < is that, for multiple
passes, any points on the boundary would be counted twice. The test
should be < for the last pass and <= for other passes.
Even for floating-point, copying a value and testing a value for
equality with itself should work as expected. Rounding errors don't
appear out of nowhere, and neither of those operations should
introduce rounding errors.
If GRASS is introducing rounding errors, the first step should be to
try to eliminate them (e.g. ensuring that decimal representations have
sufficient digits), rather than working around such errors by adding
tolerances.
g.region -a should expand outwards, so no points will be lost.
I will work on a r.in.xyz.auto addon script to scan, set the region, and
import all in one step. The problem with that is that res= is unknowable
and requires supervision.
(I will say more about that in trac bug #37)
On grass-dev, Glynn answered:
> > -> one line get's lost.
>
> Which one? If it's the first line (the southernmost point), that's
> probably due to this (raster/r.in.xyz/main.c:588):
>
> if (y <= pass_south || y > pass_north) {
>
> The problem with simply changing the test to < is that, for multiple
> passes, any points on the boundary would be counted twice. The test
> should be < for the last pass and <= for other passes.
>
> Even for floating-point, copying a value and testing a value for
> equality with itself should work as expected. Rounding errors don't
> appear out of nowhere, and neither of those operations should
> introduce rounding errors.
>
> If GRASS is introducing rounding errors, the first step should be to
> try to eliminate them (e.g. ensuring that decimal representations have
> sufficient digits), rather than working around such errors by adding
> tolerances.
Yes, see this comment in r.in.xyz/main.c ("GE" refers to GRASS_EPSILON)
{{{
/* The range should be [0,cols-1]. We use (int) to round down,
but if the point exactly on eastern edge arr_col will be /just/
on the max edge .0000000 and end up on the next row.
We could make above bounds check "if(x>=region.east) continue;"
But instead we go to all sorts of trouble so that not one single
data point is lost. GE is too small to catch them all.
We don't try to make y happy as percent segmenting will make some
points happen twice that way; so instead we use the y<= test above.
*/
}}}
and in description.html:
{{{
<h4>Filtering</h4>
Points falling outside the current region will be skipped. This includes
points falling <em>exactly</em> on the southern region bound.
(to capture those adjust the region with "<tt>g.region s=s-0.000001</tt>";
see <em>g.region</em>)
}}}
I use the scan option to get the region. Using that I lose points. Looks
like a
precision bug to me (my user hat on). As you can see in my original
report, I
also used -a for g.region which doesn't make much sense to me if we
provide
the scan option for creating the g.region parameter string.
Dev hat on I hope that we find a solution for the situation described in
"Filtering".
Replying to [comment:8 hamish]:
> I'll have a look at r.in.xyz to see if we can change the < to <= ''if we
are on the last pass block'' without harming module speed much.
>
> If we do it for all passes there will be duplicated cells for points
which fall exactly on pass block borders.
>
From a mathematical perspective (is this user hat or dev hat?), the
current behaviour is correct.
The user has to decide on a resolution anyway when setting the import
region, and in order to get nice clean values, alignment to the resolution
is recommended, i.e. have exactly 1 instead of e.g. 0.98990209. In scan
mode, the module could print a recommended resolution guestimated from the
extends and number of points (works only for evenly spaced points), which
can optionally be rounded to integer values.
Since the region settings including resolution and alignment, optionally
to the geometry of an existing raster map, are such a core concept of
GRASS raster processing, it may not be asked too much to manually set the
region including resolution first before importing, and I am inclined to
close the ticket as invalid.