[GRASS-dev] [GRASS GIS] #37: r.in.xyz increase region based on input data

#37: r.in.xyz increase region based on input data
-------------------------+--------------------------------------------------
Reporter: marisn | Owner: grass-dev@lists.osgeo.org
     Type: enhancement | Status: new
Priority: minor | Milestone: 6.4.0
Component: default | Version: unspecified
Keywords: |
-------------------------+--------------------------------------------------
It would be nice if r.in.xyz supported increasing the region based on the
input data, like "v.in.ogr -e" does.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37>
GRASS GIS <http://grass.osgeo.org>
GRASS Geographic Information System (GRASS GIS) - http://grass.osgeo.org/

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: assigned
  Priority: minor | Milestone: 6.4.0
Component: default | Version: unspecified
Resolution: | Keywords:
--------------------------+-------------------------------------------------
Changes (by hamish):

  * status: new => assigned
  * owner: grass-dev@lists.osgeo.org => hamish
  * cc: grass-dev@lists.osgeo.org (added)

Comment:

Hi,

While I do see the merit of the convenience, I am not especially inclined
to have that feature added. IMHO the first r.in.xyz run with -s should be
followed by "g.region n= s= e= w= res= -a" followed by a "r.in.xyz
method=n" and "r.univar map=n" to tune the region extent and resolution to
something suitable before proceeding.
aka the "Eat your vegetables" response.

Besides failing the "do one thing well" test, I think there was some goal
to have only the g.region module be able to change the mapset's region.
(other modules may change it internally for the duration of execution but
that should not affect other modules or the $MAPSET/WIND file)
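The scan-then-g.region step described above can be sketched roughly as follows. This is only an illustration, assuming the extent from 'r.in.xyz -s' has already been reduced to a line of key=value tokens (n= s= e= w=); the exact scan output format is not shown in this thread, and the helper names parse_extent and region_args are hypothetical. It only builds the g.region argument list and does not run anything.

```python
def parse_extent(scan_line):
    """Parse 'n=... s=... e=... w=...' tokens into a dict of floats.

    The key=value input format is an assumption of this sketch.
    """
    extent = {}
    for token in scan_line.split():
        key, _, value = token.partition("=")
        if key in ("n", "s", "e", "w"):
            extent[key] = float(value)
    missing = {"n", "s", "e", "w"} - extent.keys()
    if missing:
        raise ValueError("missing bounds: " + ", ".join(sorted(missing)))
    return extent


def region_args(extent, res):
    """Build a 'g.region n= s= e= w= res= -a' argument list."""
    if res <= 0:
        raise ValueError("resolution must be positive")
    args = ["g.region"]
    for key in ("n", "s", "e", "w"):
        args.append("%s=%r" % (key, extent[key]))
    args.append("res=%r" % res)
    args.append("-a")  # align the bounds to the resolution
    return args


extent = parse_extent("n=75.5 s=25.0 e=200 w=100 b=-3 t=9")
print(" ".join(region_args(extent, 50)))
```

The resolution is deliberately a required argument: nothing in the scanned extent can supply it.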

Hamish

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:1>

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: assigned
  Priority: minor | Milestone: 6.4.0
Component: default | Version: unspecified
Resolution: | Keywords:
--------------------------+-------------------------------------------------
Comment (by epatton):

I wrote a shell script wrapper around 'r.in.xyz -s' specifically to set
the region bounds to the maximum extent of the input data. The script
also lets you enter the desired resolution for the g.region call.
Anyone here could probably write a similar one in 5 minutes, which is why
I never posted it before, but I could post it if anyone wants. Since I
work with so much xyz data on a regular basis, it does save me a lot of
"g.region n= s=", etc.

~ Eric.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:2>

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: assigned
  Priority: minor | Milestone: 6.4.0
Component: default | Version: unspecified
Resolution: | Keywords:
--------------------------+-------------------------------------------------
Comment (by glynn):

r.in.xyz shouldn't change the current region.

What it probably *should* do is to create a map whose size is based upon
the input data, rather than using the current region. IOW, the way that
other r.in.* commands work; unlike most r.* commands, the output from
r.in.* commands isn't normally based upon the current region.

This would require two passes, which won't work with input=- if stdin is a
pipe. But then the same is true of a script which runs r.in.xyz -s,
g.region, r.in.xyz.

A script approach should definitely use either $WIND_OVERRIDE or
$GRASS_REGION rather than modifying WIND.
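As a rough illustration of the $WIND_OVERRIDE approach, a wrapper script can hand a modified copy of the environment to the modules it runs, leaving both its own environment and the WIND file alone. This is a minimal sketch: the function name env_with_region and the region name "xyz_tmp" are hypothetical, and it assumes a region of that name has already been saved in the mapset (e.g. with 'g.region save=').

```python
import os


def env_with_region(saved_region, base_env=None):
    """Return a copy of the environment with WIND_OVERRIDE set.

    saved_region is assumed to name a region already saved in the
    current mapset; the caller's environment is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["WIND_OVERRIDE"] = saved_region
    return env


child_env = env_with_region("xyz_tmp", {"PATH": "/usr/bin"})
print(child_env["WIND_OVERRIDE"])
```

The returned dict would then be passed as the env argument when launching the module (for example via subprocess), so only that child process sees the override.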

On a related note, r.in.xyz should probably have some way to explicitly
specify the bounds and resolution of the created map, either separate
n=/s=/e=/w=/res= options and/or a region= option to use a named region.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:3>

Moving the discussion onto a more general plane.
Is there some document where general GRASS module policies are
described? If such a document does not exist, we should probably create
one (programmer's manual? trac wiki?). Such a document could contain
short must/should/suggested items to unify the behavior of the various
modules. It could be used by developers while developing new
modules or cleaning/fixing existing ones.
Some candidates (just examples): "Scripts should not touch the WIND file
but use $WIND_OVERRIDE instead"; "r.in.* modules should import data
in their native resolution (if possible), ignoring current region
settings. They may provide a region-controlled import mode activated by
a -? flag" etc.

Just trying to make GRASS better,
Maris.

2008/2/6, GRASS GIS <trac@osgeo.org>:

> r.in.xyz shouldn't change the current region.
>
> What it probably *should* do is to create a map whose size is based upon
> the input data, rather than using the current region. IOW, the way that
> other r.in.* commands work; unlike most r.* commands, the output from
> r.in.* commands isn't normally based upon the current region.
>
> This would require two passes, which won't work with input=- if stdin is a
> pipe. But then the same is true of a script which runs r.in.xyz -s,
> g.region, r.in.xyz.
>
> A script approach should definitely use either $WIND_OVERRIDE or
> $GRASS_REGION rather than modifying WIND.

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: assigned
  Priority: minor | Milestone: 6.4.0
Component: default | Version: unspecified
Resolution: | Keywords:
--------------------------+-------------------------------------------------
Comment (by hamish):

Glynn:
> What it probably *should* do is to create a map whose size is
> based upon the input data, rather than using the current region.
..
> On a related note, r.in.xyz should probably have some way to
> explicitly specify the bounds and resolution of the created map,
> either separate n=/s=/e=/w=/res= options and/or a region= option
> to use a named region.

I hear your point re. standardizing r.in.* modules, but r.in.xyz is not
like other import modules and I do not think it should work that way. It
is as much a binning filter as a data importer; the XYZ data is not a
finished product to be loaded into the GIS without data loss. The module
is by design a data aggregator, ie lossy. It is not loading the data raw &
intact, it is processing it into something new.
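To make the "binning filter" point concrete, here is a toy sketch of aggregating x,y,z points into per-cell counts for a given region and resolution, in the spirit of method=n. It is not the module's actual code: the function name bin_counts and the edge handling are assumptions made for illustration. Points outside the region are simply dropped, and the z values are ignored because only counts are produced.

```python
def bin_counts(points, north, south, east, west, res):
    """Count points per cell; rows run from north to south.

    Toy illustration only; the cell-edge rules are an assumption.
    """
    rows = int(round((north - south) / res))
    cols = int(round((east - west) / res))
    grid = [[0] * cols for _ in range(rows)]
    for x, y, _z in points:
        if not (west <= x < east and south < y <= north):
            continue  # point lies outside the region and is discarded
        row = min(int((north - y) / res), rows - 1)
        col = min(int((x - west) / res), cols - 1)
        grid[row][col] += 1
    return grid


points = [(0.5, 9.5, 1.0), (0.7, 9.1, 2.0), (9.9, 0.1, 3.0), (50.0, 50.0, 4.0)]
for row in bin_counts(points, 10, 0, 10, 0, 5):
    print(row)
```

Changing res changes how many points fall into each cell, which is why the same input can yield quite different maps.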

Here are some screenshots of a recent use:
  http://bambi.otago.ac.nz/hamish/grass/r.in.xyz/saunders_track.jpg
  http://bambi.otago.ac.nz/hamish/grass/r.in.xyz/saunders_bathy.png
  (with thanks to Bob Covill for the idea)

XYZ from the ship's combined NMEA lat/lon+depth sounder is processed for
an area of interest. In the image with the chart you can see the ship's
track (red) continues away many miles from the survey site back to port.
Loading in at rough resolution then using d.zoom to interactively set the
bounds is way easier than trying to use d.where + manually set 'n= s= w=
e='. And 'res=' is another thing altogether, but being able to use
'g.region res= -a' really helps in a way which would be a pain to
calculate manually.

With a LIDAR study with 40GB of raw x,y,z ASCII results it is probably even
more important not to try to load the entire spatial coverage at once.

So I would argue that not using the full spatial extent of the XYZ data is
the typical mode of operation for this module, and using d.zoom &
'g.region res= -a' after getting the rough coordinates with 'r.in.xyz -s'
and then doing a rough first pass is the recommended workflow.

In summary, it's a data aggregator not just a data importer and I'm
leaning to "won't fix".

Hamish

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:4>

Maris Nartiss wrote:

> Moving the discussion onto a more general plane.
> Is there some document where general GRASS module policies are
> described? If such a document does not exist, we should probably create
> one (programmer's manual? trac wiki?). Such a document could contain
> short must/should/suggested items to unify the behavior of the various
> modules. It could be used by developers while developing new
> modules or cleaning/fixing existing ones.
> Some candidates (just examples): "Scripts should not touch the WIND file
> but use $WIND_OVERRIDE instead"; "r.in.* modules should import data
> in their native resolution (if possible), ignoring current region
> settings. They may provide a region-controlled import mode activated by
> a -? flag" etc.

Easier said than done. It's a lot easier to notice that specific
behaviour is unusual than to enumerate the set of unusual behaviours.

--
Glynn Clements <glynn@gclements.plus.com>

GRASS GIS wrote:

> #37: r.in.xyz increase region based on input data
> --------------------------+-------------------------------------------------
>   Reporter: marisn | Owner: hamish
>       Type: enhancement | Status: assigned
>   Priority: minor | Milestone: 6.4.0
> Component: default | Version: unspecified
> Resolution: | Keywords:
> --------------------------+-------------------------------------------------
> Comment (by hamish):
>
> Glynn:
> > What it probably *should* do is to create a map whose size is
> > based upon the input data, rather than using the current region.
> ..
> > On a related note, r.in.xyz should probably have some way to
> > explicitly specify the bounds and resolution of the created map,
> > either separate n=/s=/e=/w=/res= options and/or a region= option
> > to use a named region.
>
> I hear your point re. standardizing r.in.* modules, but r.in.xyz is not
> like other import modules and I do not think it should work that way.

Then it's debatable whether it should be named r.in.<something>.

--
Glynn Clements <glynn@gclements.plus.com>

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: closed
  Priority: minor | Milestone: 6.4.0
Component: default | Version: unspecified
Resolution: wontfix | Keywords:
--------------------------+-------------------------------------------------
Changes (by hamish):

  * status: assigned => closed
  * resolution: => wontfix

Comment:

On grass-dev Glynn suggested that perhaps it shouldn't be called "r.in.*"
then as it acts differently to other import modules.

Well it does import data, it just aggregates it as it does that, so it's a
hybrid. Thus I can argue it both ways depending on context :wink:

Closing this wish as "wontfix" as the base module should not permanently
change the region settings.

As doing that can be a logical part of the workflow, it would be nice if
Eric uploaded his script to the wiki add-ons section or if it's just a
few-liner job perhaps a new page about using r.in.xyz in the wiki
somewhere or a new example in the help page?

Hamish

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:5>

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: reopened
  Priority: minor | Milestone: 6.4.0
Component: default | Version: unspecified
Resolution: | Keywords:
--------------------------+-------------------------------------------------
Changes (by glynn):

  * status: closed => reopened
  * resolution: wontfix =>

Comment:

Re-opening. An option to create a map based upon the bounds of the data
rather than the current region is (IMHO) a reasonable enhancement to
r.in.xyz. It shouldn't need a separate script.

For enhancements, wontfix should be reserved for "this is a bad idea"
(e.g. bloat) rather than simply "won't be implemented immediately".

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:6>

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: reopened
  Priority: minor | Milestone: 6.4.0
Component: default | Version: unspecified
Resolution: | Keywords:
--------------------------+-------------------------------------------------
Comment (by hamish):

Glynn:
> An option to create a map based upon the bounds of the data rather than
> the current region is (IMHO) a reasonable enhancement to r.in.xyz. It
> shouldn't need a separate script.

It would be possible to automatically set the bounds for the new map after
scanning the data, but as there is no associated resolution in the data
you have to use whatever the current region settings are, and then code in
a 'g.region -a' approach or not. At which point before running the module
for the final product you have to scan the data and run 'g.region res= -a'
and spend a little time thinking about the region size anyway. Or if you
don't want to code in -a (it's problematic enough in g.region already), be
content with it automatically using some haywire resolution setting, which
I am not. The bin size is a critical part of what the module does and
fundamental to the meaning of the results.

Hamish

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:7>

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: reopened
  Priority: minor | Milestone: 6.4.0
Component: default | Version: unspecified
Resolution: | Keywords:
--------------------------+-------------------------------------------------
Comment (by glynn):

Replying to [comment:7 hamish]:

> It would be possible to automatically set the bounds for the new map
> after scanning the data, but as there is no associated resolution in the
> data you have to use whatever the current region settings are, and then
> code in a 'g.region -a' approach or not.

That just means aligning the bounds to the existing grid (resolution
and offset).

> At which point before running the module for the final product you have
> to scan the data and run 'g.region res= -a' and spend a little time
> thinking about the region size anyway. Or if you don't want to code in -a
> (it's problematic enough in g.region already),

The only things that are problematic about g.region -a are:

1. The existing behaviour is counterintuitive. That mostly derives
from the fact that you *might* be changing the resolution at the same
time, in which case you have to choose the offset arbitrarily.

2. Any change to the existing behaviour creates an incompatibility with
previous versions.

An enhancement to r.in.xyz wouldn't have either of these problems.
There is no reason to change the resolution, and there is no existing
behaviour to remain compatible with.
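"Aligning the bounds to the existing grid (resolution and offset)" could look roughly like the sketch below: the scanned bounds are expanded outward to the nearest grid lines while the resolution stays untouched. The function name align_to_grid and the origin_x/origin_y parameters are hypothetical; the origin stands for any known edge of the existing region.

```python
import math


def align_to_grid(n, s, e, w, res, origin_x=0.0, origin_y=0.0):
    """Expand bounds outward to the nearest lines of an existing grid.

    The grid is defined by res and an origin (any existing cell edge);
    both origin parameters are assumptions of this sketch.
    """
    def down(value, origin):
        return origin + math.floor((value - origin) / res) * res

    def up(value, origin):
        return origin + math.ceil((value - origin) / res) * res

    return {
        "n": up(n, origin_y),
        "s": down(s, origin_y),
        "e": up(e, origin_x),
        "w": down(w, origin_x),
    }


# Existing grid has edges at y = 25, 75, ... (offset 25, res 50).
print(align_to_grid(73.2, 26.9, 180.0, 101.0, 50, origin_x=0, origin_y=25))
```

With the default origin of 0 the same data would instead snap to s=0 n=100, which is the difference between aligning to an existing grid and aligning to whole multiples of the resolution.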

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:8>

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: reopened
  Priority: minor | Milestone: 6.4.0
Component: default | Version: unspecified
Resolution: | Keywords:
--------------------------+-------------------------------------------------
Comment (by neteler):

I post again to trac:

A real-world example which will be a problem for the average user:

{{{
GRASS 6.3.0svn (pat): > r.in.xyz d325095655/p325099657.txt
out=`basename d325095655/p325099657.txt` fs=space
Scanning data ...
Writing to map ...
  100%
r.in.xyz complete. 0 points found in region.
}}}

(yes, there are points in the file)

All other GRASS import programs import the map without requiring the
user to define the boundary. Here, the user has to preset the region
(extra step) and only then can s/he import.

Since we cannot change too much in GRASS 6, a flag to avoid this extra step
would be nice, something like:

{{{
-e Scan data file and import complete map
}}}

I am asking because I have to process *many* maps. Sure, I can easily
write a shell script for that, but other users may not be able to. Also for
the sake of consistency... or rename the module, which doesn't make much
sense.

Markus

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:9>

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: reopened
  Priority: minor | Milestone: 6.4.0
Component: default | Version: unspecified
Resolution: | Keywords:
--------------------------+-------------------------------------------------
Comment (by hamish):

I will write a demonstration r.in.xyz.auto script providing the desired
functionality wrt bounds so we can test how other methods could work.

However, for this module picking your resolution (thus average number of
input points per cell bin) is *absolutely fundamental* to results and must
be chosen with care and supervision. This probably means a few trial
'r.in.xyz method=n' + 'r.univar n_map' steps to get it right. This
information cannot be derived from the data and must be supplied by the
user, and the workflow is multi-step.
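The link between resolution and average points per cell can be put in numbers. The sketch below is back-of-envelope arithmetic only, assuming the points are spread evenly over the extent (which a ship track or LIDAR swath often is not), so it gives a starting value for the 'method=n' trials rather than a replacement for them. Both function names are hypothetical.

```python
import math


def mean_points_per_cell(n_points, north, south, east, west, res):
    """Average number of points per cell for a given resolution."""
    rows = math.ceil((north - south) / res)
    cols = math.ceil((east - west) / res)
    return n_points / (rows * cols)


def res_for_target_density(n_points, north, south, east, west, target):
    """Square cell size giving roughly `target` points per cell.

    Assumes a uniform point distribution over the whole extent.
    """
    area = (north - south) * (east - west)
    return math.sqrt(target * area / n_points)


print(mean_points_per_cell(10000, 100, 0, 100, 0, 10))
print(res_for_target_density(10000, 100, 0, 100, 0, 4))
```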

Solutions:
  * We could use the current region resolution with bounds taken from the
scanning step. In this case the user is 90% likely not to have considered
the resolution first and will get bad results or an out-of-memory error as
the resolution will be badly wrong. Also it means using a g.region step
first anyway, so we're not really that much better off.

  * We could add a res= option for use with an optional auto-bounds setting
flag. I'm not very excited about adding new options and flags, but it's a
possibility. 'g.region -a' like code would need to go into the module and an
offset region would be impossible (s=25 n=75 res=50).

  * We could choose a default resolution that made the output like a
1000x1000 map. Lame.

I think that all these solutions are to some extent poor and would be
little more consistent with other r.in.* modules than the current way. In
any case import from auto-scanned bounds should not be changed to be the
default, and custom subregion and resolution must remain possible. In
practice I have found that a cropping step is generally wanted, so it is
useful to combine it with this first import step.

Re consistency we just have to note that the module works on the current
region like other r. modules and not like other r.in. modules.

I would also note that this is all well discussed in the man page, with an
example.

still unconvinced,
Hamish

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:10>

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: reopened
  Priority: minor | Milestone: 6.4.0
Component: default | Version: unspecified
Resolution: | Keywords:
--------------------------+-------------------------------------------------
Comment (by hamish):

r.in.xyz.auto script:
   http://trac.osgeo.org/grass/browser/grass-addons/raster/r.in.xyz.auto

Hamish

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:11>

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: reopened
  Priority: minor | Milestone: 6.4.0
Component: Raster | Version: unspecified
Resolution: | Keywords: r.in.xyz
  Platform: Unspecified | Cpu: Unspecified
--------------------------+-------------------------------------------------
Changes (by martinl):

  * keywords: => r.in.xyz
  * platform: => Unspecified
  * component: default => Raster
  * cpu: => Unspecified

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:12>

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: reopened
  Priority: minor | Milestone: 6.5.0
Component: Raster | Version: unspecified
Resolution: | Keywords: r.in.xyz
  Platform: Unspecified | Cpu: Unspecified
--------------------------+-------------------------------------------------
Changes (by martinl):

  * milestone: 6.4.0 => 6.5.0

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/37#comment:13>

#37: r.in.xyz increase region based on input data
--------------------------+-------------------------------------------------
  Reporter: marisn | Owner: hamish
      Type: enhancement | Status: closed
  Priority: minor | Milestone: 6.5.0
Component: Raster | Version: unspecified
Resolution: fixed | Keywords: r.in.xyz
  Platform: Unspecified | Cpu: Unspecified
--------------------------+-------------------------------------------------
Changes (by hamish):

  * status: reopened => closed
  * resolution: => fixed

Comment:

The script has been in addons for years.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/37#comment:14>