[GRASS-dev] v.cellstats script

Hello,

Carlos Grohmann and I created a script as substitute for v.cellstats. It would
be great if somebody can look through it to give feed-back how to improve it
before we post it to the WIKI.

regards, Martin

(attachments)

v.cellstats2 (2.59 KB)

Martin Wegmann wrote on 11/17/2006 02:34 PM:

Hello,

Carlos Grohmann and I created a script as substitute for v.cellstats. It would
be great if somebody can look through it to give feed-back how to improve it
before we post it to the WIKI.

regards,

Hi Martin, Carlos,

two quick observations:

This is a bash-ism AFAIK (still needs to be eliminated in some scripts):
   export LC_NUMERIC=C

Before doing that the current region should be saved. In an older mail
from Glynn it was described how to solve it in an elegant way:
   #set to current input map region
   g.region rast=$GIS_OPT_raster

Of course the current region should be restored at the end of the script
and in case of "trap" (which isn't there, see cleanup() in other scripts).

Markus

On Friday 17 November 2006 16:55, Markus Neteler wrote:

Martin Wegmann wrote on 11/17/2006 02:34 PM:
> Hello,
>
> Carlos Grohmann and I created a script as substitute for v.cellstats. It
> would be great if somebody can look through it to give feed-back how to
> improve it before we post it to the WIKI.
>
> regards,

Hi Martin, Carlos,

two quick observations:

This is a bash-ism AFAIK (still needs to be eliminated in some scripts):
   export LC_NUMERIC=C

Before doing that the current region should be saved. In an older mail
from Glynn it was described how to solve it in an elegant way:
   #set to current input map region
   g.region rast=$GIS_OPT_raster

Of course the current region should be restored at the end of the script
and in case of "trap" (which isn't there, see cleanup() in other scripts).

thanks Markus, I'll will fix it -- in the meantime I found a bug with the
g.region settings for the coarser raster - only the resolution setting works
otherwise variance is 0.

Martin

Markus

_______________________________________________
grass-dev mailing list
grass-dev@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass-dev

Markus Neteler wrote:

Martin Wegmann wrote on 11/17/2006 02:34 PM:

two quick observations:

This is a bash-ism AFAIK (still needs to be eliminated in some scripts):
   export LC_NUMERIC=C

Before doing that the current region should be saved. In an older mail
from Glynn it was described how to solve it in an elegant way:
   #set to current input map region
   g.region rast=$GIS_OPT_raster

You mustn't change the current region. Use

# set the region to a raster but don't touch the WIND
g.region raster=$GIS_OPT_raster save=$region -a -u # mind the -u !
WIND_OVERRIDE=$region
export WIND_OVERRIDE

instead.

Maciek

Martin Wegmann wrote:

Carlos Grohmann and I created a script as substitute for v.cellstats.
It would be great if somebody can look through it to give feed-back
how to improve it before we post it to the WIKI.

Hi Martin,

how does this script's function differ from the new r.resamp.stats
module?

script notes: (many are general comments for all grass scripts)

* _v._cellstats, but it imports and exports raster maps!

* use standard input= and output= option names for consistency.
  make input= the first option

* output maps should use the "new,cell,raster". This way --overwrite
works, etc
#% key: raster
#% gisprompt: old,cell,raster

* for method key, use "options:" line to specify available methods.
#% options: n,min,max,range,sum,mean,stddev,variance,coeff_var

* redirect status messages to stderr: (helps parsing output to a file)
echo "You must be in GRASS GIS to run this program" 1>&2

* export cell x,y,z to file directly:
was:
  #convert to vect and export as ascii
  r.to.vect -z input=$GIS_OPT_raster output=tmp_vect feature=point
  v.out.ascii input=tmp_vect output=xyz_file format=point
easier:
  r.stats -1ng "$GIS_OPT_raster" > xyz_file

(as this is impossible to guess without prior experience, I've just
written a simple wrapper script called r.out.xyz, now in 6.3cvs.
maybe better as an option to r.out.ascii?)

* use "r.to.vect -b" so the script can work with larger grids
(1500x1500) input -> 2.25million vector point & memory problems
if topology is built. Also, r.in.xyz has to keep the entire map in
memory, so hard-coding percent=100 for Very large grids could cause
problems.

* use g.tempfile (script won't work if in a non writeable dir, eg "/")
see grass scripts/ for an example of doing that safely
(e.g. r.univar.sh)

* double quote all "$VARIABLES". legal map names may include restricted
shell chars, filenames may include spaces.

cheers,
Hamish

hello Hamish,

On Saturday 18 November 2006 08:11, Hamish wrote:

Martin Wegmann wrote:
> Carlos Grohmann and I created a script as substitute for v.cellstats.
> It would be great if somebody can look through it to give feed-back
> how to improve it before we post it to the WIKI.

[...]

how does this script's function differ from the new r.resamp.stats
module?

yes it looks the same and it's much faster, but it variance is missing.

I did a test with v.cellstats2 and r.resamp.stats (+ r.to.vect + g.region +
v.surf.rst) using spearfish and the result in quite the same.

Perhaps it is worth to write a script v(r).cellstats with r.resamp.stats +
subsequent commands to receive interpolated high res. results.

script notes: (many are general comments for all grass scripts)

* _v._cellstats, but it imports and exports raster maps!

yes, you are right r.cellstats would be better. we named it v.* just because
we wanted to provide a subsitute for v.cellstats

thanks for your comments below, I will work through them and modify script.

cheers, Martin

* use standard input= and output= option names for consistency.
  make input= the first option

* output maps should use the "new,cell,raster". This way --overwrite
works, etc
#% key: raster
#% gisprompt: old,cell,raster

* for method key, use "options:" line to specify available methods.
#% options: n,min,max,range,sum,mean,stddev,variance,coeff_var

* redirect status messages to stderr: (helps parsing output to a file)
echo "You must be in GRASS GIS to run this program" 1>&2

* export cell x,y,z to file directly:
was:
  #convert to vect and export as ascii
  r.to.vect -z input=$GIS_OPT_raster output=tmp_vect feature=point
  v.out.ascii input=tmp_vect output=xyz_file format=point
easier:
  r.stats -1ng "$GIS_OPT_raster" > xyz_file

(as this is impossible to guess without prior experience, I've just
written a simple wrapper script called r.out.xyz, now in 6.3cvs.
maybe better as an option to r.out.ascii?)

* use "r.to.vect -b" so the script can work with larger grids
(1500x1500) input -> 2.25million vector point & memory problems
if topology is built. Also, r.in.xyz has to keep the entire map in
memory, so hard-coding percent=100 for Very large grids could cause
problems.

* use g.tempfile (script won't work if in a non writeable dir, eg "/")
see grass scripts/ for an example of doing that safely
(e.g. r.univar.sh)

* double quote all "$VARIABLES". legal map names may include restricted
shell chars, filenames may include spaces.

cheers,
Hamish

_______________________________________________
grass-dev mailing list
grass-dev@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass-dev

Hi Hamish, Martin:

On Monday 20 November 2006 06:57, Martin Wegmann wrote:

hello Hamish,

On Saturday 18 November 2006 08:11, Hamish wrote:
> Martin Wegmann wrote:
> > Carlos Grohmann and I created a script as substitute for v.cellstats.
> > It would be great if somebody can look through it to give feed-back
> > how to improve it before we post it to the WIKI.

[...]

> how does this script's function differ from the new r.resamp.stats
> module?

yes it looks the same and it's much faster, but it variance is missing.

Great, always on the lookout for modules that aid with *scaling* operations.
Any benchmarks?

I did a test with v.cellstats2 and r.resamp.stats (+ r.to.vect + g.region +
v.surf.rst) using spearfish and the result in quite the same.

Interpolation ? Was this for a downscaling operation? i.e. large cell size to
small cell size?

Perhaps it is worth to write a script v(r).cellstats with r.resamp.stats +
subsequent commands to receive interpolated high res. results.

Not sure how to interpret this.

Once the suggested changes have been made, I would be willing to test
aggregation with the new modules against the two other methods that I have
recently used:
1. starspan
2. r.resamp.stats

link to tests here:
http://casoilresource.lawr.ucdavis.edu/drupal/node/275

Cheers,

Dylan

> script notes: (many are general comments for all grass scripts)
>
>
> * _v._cellstats, but it imports and exports raster maps!

yes, you are right r.cellstats would be better. we named it v.* just
because we wanted to provide a subsitute for v.cellstats

thanks for your comments below, I will work through them and modify script.

cheers, Martin

> * use standard input= and output= option names for consistency.
> make input= the first option
>
> * output maps should use the "new,cell,raster". This way --overwrite
> works, etc
> #% key: raster
> #% gisprompt: old,cell,raster
>
> * for method key, use "options:" line to specify available methods.
> #% options: n,min,max,range,sum,mean,stddev,variance,coeff_var
>
> * redirect status messages to stderr: (helps parsing output to a file)
> echo "You must be in GRASS GIS to run this program" 1>&2
>
> * export cell x,y,z to file directly:
> was:
> #convert to vect and export as ascii
> r.to.vect -z input=$GIS_OPT_raster output=tmp_vect feature=point
> v.out.ascii input=tmp_vect output=xyz_file format=point
> easier:
> r.stats -1ng "$GIS_OPT_raster" > xyz_file
>
> (as this is impossible to guess without prior experience, I've just
> written a simple wrapper script called r.out.xyz, now in 6.3cvs.
> maybe better as an option to r.out.ascii?)
>
> * use "r.to.vect -b" so the script can work with larger grids
> (1500x1500) input -> 2.25million vector point & memory problems
> if topology is built. Also, r.in.xyz has to keep the entire map in
> memory, so hard-coding percent=100 for Very large grids could cause
> problems.
>
> * use g.tempfile (script won't work if in a non writeable dir, eg "/")
> see grass scripts/ for an example of doing that safely
> (e.g. r.univar.sh)
>
> * double quote all "$VARIABLES". legal map names may include restricted
> shell chars, filenames may include spaces.
>
>
> cheers,
> Hamish
>
> _______________________________________________
> grass-dev mailing list
> grass-dev@grass.itc.it
> http://grass.itc.it/mailman/listinfo/grass-dev

_______________________________________________
grass-dev mailing list
grass-dev@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass-dev

--
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341

On Monday 20 November 2006 15:57, Martin Wegmann wrote:
[...]

> how does this script's function differ from the new r.resamp.stats
> module?

yes it looks the same and it's much faster, but it variance is missing.

I added in the main.c file of r.resamp.stats these lines to be able to
calculate variance and stddev:

[..]
  {c_sum, w_sum, "sum", "sum of values"},
  {c_var, w_var, "variance", "variance value"},
  {c_stddev, w_stddev, "stddev", "standard deviation"},
  {NULL, NULL, NULL}
[...]

it seems to work so far - if that is ok, shall I submit it to cvs?

Martin

[...]

Martin Wegmann wrote:

> > how does this script's function differ from the new r.resamp.stats
> > module?
>
> yes it looks the same and it's much faster, but it variance is missing.

I added in the main.c file of r.resamp.stats these lines to be able to
calculate variance and stddev:

[..]
  {c_sum, w_sum, "sum", "sum of values"},
  {c_var, w_var, "variance", "variance value"},
  {c_stddev, w_stddev, "stddev", "standard deviation"},
  {NULL, NULL, NULL}
[...]

it seems to work so far - if that is ok, shall I submit it to cvs?

What's the point?

BTW, sum is already there. I added it after it was pointed out that it
might be useful: if the values in the source map are the number of
"items" in each source cell, then the sum aggregate makes sense, as
the result will be the number of "items" in the output cell.

OTOH, I don't really see the point of computing stddev/variance over
the values which happen to fall into a given output cell. That's why I
chose to omit them (and initially to omit sum).

[The original version only included the aggregates which produced a
result which was "representative" of the input values (i.e. those
where, if the input cells all had the same value, the output would be
that value).]

--
Glynn Clements <glynn@gclements.plus.com>

Hello Glynn,

On Tuesday 21 November 2006 22:41, you wrote:

Martin Wegmann wrote:
> > > how does this script's function differ from the new r.resamp.stats
> > > module?
> >
> > yes it looks the same and it's much faster, but it variance is missing.
>
> I added in the main.c file of r.resamp.stats these lines to be able to
> calculate variance and stddev:
>
> [..]
> {c_sum, w_sum, "sum", "sum of values"},
> {c_var, w_var, "variance", "variance value"},
> {c_stddev, w_stddev, "stddev", "standard deviation"},
> {NULL, NULL, NULL}
> [...]
>
> it seems to work so far - if that is ok, shall I submit it to cvs?

What's the point?

BTW, sum is already there. I added it after it was pointed out that it
might be useful: if the values in the source map are the number of
"items" in each source cell, then the sum aggregate makes sense, as
the result will be the number of "items" in the output cell.

OTOH, I don't really see the point of computing stddev/variance over
the values which happen to fall into a given output cell. That's why I
chose to omit them (and initially to omit sum).

we wanted to know how much data is lost in a low res. map compared to a high
res. map hence we are interested in the variance presented inside one low
res. pixel of various high res. pixel.

well stddev can be omitted but so far I have some use for variance - but I can
keep my modified local version and work with it, thats fine for me. ,-)

cheers Martin

[The original version only included the aggregates which produced a
result which was "representative" of the input values (i.e. those
where, if the input cells all had the same value, the output would be
that value).]

Martin Wegmann wrote:

> > > > how does this script's function differ from the new r.resamp.stats
> > > > module?
> > >
> > > yes it looks the same and it's much faster, but it variance is missing.
> >
> > I added in the main.c file of r.resamp.stats these lines to be able to
> > calculate variance and stddev:
> >
> > [..]
> > {c_sum, w_sum, "sum", "sum of values"},
> > {c_var, w_var, "variance", "variance value"},
> > {c_stddev, w_stddev, "stddev", "standard deviation"},
> > {NULL, NULL, NULL}
> > [...]
> >
> > it seems to work so far - if that is ok, shall I submit it to cvs?
>
> What's the point?
>
> BTW, sum is already there. I added it after it was pointed out that it
> might be useful: if the values in the source map are the number of
> "items" in each source cell, then the sum aggregate makes sense, as
> the result will be the number of "items" in the output cell.
>
> OTOH, I don't really see the point of computing stddev/variance over
> the values which happen to fall into a given output cell. That's why I
> chose to omit them (and initially to omit sum).

we wanted to know how much data is lost in a low res. map compared to a high
res. map hence we are interested in the variance presented inside one low
res. pixel of various high res. pixel.

well stddev can be omitted but so far I have some use for variance - but I can
keep my modified local version and work with it, thats fine for me. ,-)

Maybe this should be an additional option, to allow a variance map to
be generated in addition to the resampled map. E.g.
"r.resamp.stats method=... variance=..."?

If that were done, would it be better to compute the deviation from
the mean (i.e. variance) or the deviation from the result (in case an
aggregate other than the mean is used)?

--
Glynn Clements <glynn@gclements.plus.com>