[GRASSLIST:9207] modifications to v.rast.stats

Hey folks,

  In order to speed up the v.rast.stats script, I've made some changes
that might be of interest. I'm relatively new to GRASS so let me know
if there is a better way to do this...

The main change: instead of running r.univar once for every
statistic, I run it a single time, save the output to a temp file,
then cat that file to pull out each statistic...

$ diff v.rast.stats v.rast.stats.old

< # Run once only and save to temp file
< r.univar -g $RASTER > .${TMPNAME}.univar
< n=`cat .${TMPNAME}.univar | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f1`
< min=`cat .${TMPNAME}.univar | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f2`
< max=`cat .${TMPNAME}.univar | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f3`
< range=`cat .${TMPNAME}.univar | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f4`
< mean=`cat .${TMPNAME}.univar | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f5`
< stddev=`cat .${TMPNAME}.univar | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f6`
< variance=`cat .${TMPNAME}.univar | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f7`
< cf_var=`cat .${TMPNAME}.univar | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f8`
< sum=`cat .${TMPNAME}.univar | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f9`
---
> n=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f1`
> min=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f2`
> max=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f3`
> range=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f4`
> mean=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f5`
> stddev=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f6`
> variance=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f7`
> cf_var=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f8`
> sum=`r.univar -g $RASTER | cut -d'=' -f2 | tr '\n' ' ' | cut -d' ' -f9`
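
The positional cut calls above depend on the order of the lines in the saved file. A minimal sketch of looking each statistic up by key instead, assuming the file holds one key=value pair per line: the sample values are made up, and the key names (in particular coeff_var) and the get_stat helper are assumptions for illustration, not part of v.rast.stats.

```shell
# Stand-in for the saved output of `r.univar -g $RASTER`.
# Values and key names are illustrative assumptions, not real output.
UNIVAR_FILE=$(mktemp)
cat > "$UNIVAR_FILE" <<'EOF'
n=4
min=1
max=7
range=6
mean=4
stddev=2.23607
variance=5
coeff_var=55.9017
sum=16
EOF

# get_stat KEY FILE: print the value stored under KEY in a key=value file.
# (Hypothetical helper; matches on the key so line order does not matter.)
get_stat() {
    awk -F'=' -v key="$1" '$1 == key { print $2; exit }' "$2"
}

n=$(get_stat n "$UNIVAR_FILE")
mean=$(get_stat mean "$UNIVAR_FILE")
cf_var=$(get_stat coeff_var "$UNIVAR_FILE")
echo "n=$n mean=$mean cf_var=$cf_var"

rm -f "$UNIVAR_FILE"
```

A lookup by key keeps working if a later r.univar version adds or reorders output lines, at the cost of one awk call per statistic.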

Of course, in case the user breaks out of the script, we have to add
this to the cleanup() function:

< rm .${TMPNAME}.univar
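
As a rough sketch of how such a cleanup can be tied to script exit and to Ctrl-C, the following uses a shell trap. The variable name UNIVAR_TMP and the use of mktemp are assumptions for illustration; the real script builds its own temp name and has its own cleanup() body.

```shell
# Hypothetical temp file; the real script uses .${TMPNAME}.univar instead.
UNIVAR_TMP=$(mktemp)

cleanup() {
    # -f keeps rm quiet if the file was never written or is already gone
    rm -f "$UNIVAR_TMP"
}

# Run the work in a subshell so the EXIT trap fires when the subshell ends.
(
    trap cleanup EXIT
    trap 'exit 1' INT TERM   # on interrupt, exit; that in turn runs the EXIT trap
    echo "n=4" > "$UNIVAR_TMP"
    # ... the statistics would be read from "$UNIVAR_TMP" here ...
)

if [ -e "$UNIVAR_TMP" ]; then leftover=yes; else leftover=no; fi
echo "temp file left behind: $leftover"
```

Using rm -f rather than plain rm avoids an error message when the user interrupts the script before the temp file has been created.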

I also noticed that it runs the whole analysis at the full extents of
the raster map. Since we're looping through categories and creating a
raster mask, we only need to be worried about the extents of that
particular vector category.

I'm not sure if this is the most elegant way to solve the issue but I
create a temp vector map of the particular category and set the region
to this vector. This speeds up the creation of the raster mask for
cases when you have very large rasters and lots of smaller polygons.

<
< # Extract the current category and set region
< v.extract input=$VECTOR output=${VECTOR}_CAT${i} list=$i > /dev/null 2> /dev/null
< g.region vect=${VECTOR}_CAT${i} > /dev/null 2> /dev/null
<
< # Make certain we are using the proper resolution
< g.region nsres=$NSRES ewres=$EWRES -ap > /dev/null
<
< #Remove temporary vector
< g.remove vect=${VECTOR}_CAT${i} > /dev/null 2> /dev/null
<
< # generate mask .....

The one problem is that if a certain vector category extent is
smaller than the current cellsize, g.region fails. Any ideas on a more
elegant way?
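
One possible workaround, offered only as a sketch: compute the region bounds yourself, snap them outward to the raster grid, and widen them to at least one cell before handing them to g.region. The snap_bounds helper, the grid origin of 0, and the sample coordinates below are all hypothetical; how the category's bounding box is obtained is left open.

```shell
# snap_bounds MIN MAX ORIGIN RES
# Print "lo hi": MIN..MAX snapped outward to the grid defined by ORIGIN and
# RES, and widened to one full cell if the snapped span would be empty.
snap_bounds() {
    awk -v lo="$1" -v hi="$2" -v o="$3" -v r="$4" '
    function floor(x,  i) { i = int(x); return (i > x) ? i - 1 : i }
    function ceil(x,  i)  { i = int(x); return (i < x) ? i + 1 : i }
    BEGIN {
        a = o + floor((lo - o) / r) * r
        b = o + ceil((hi - o) / r) * r
        if (b <= a) b = a + r          # degenerate extent: force one cell
        printf "%.4f %.4f\n", a, b
    }'
}

# Made-up example: a polygon 3 units wide on a 30-unit grid anchored at 0.
bounds=$(snap_bounds 1001 1004 0 30)
west=${bounds% *}
east=${bounds#* }
echo "w=$west e=$east"

# The same helper would be applied to south/north with NSRES, and the four
# values passed on explicitly, e.g. (not run here):
#   g.region w=$west e=$east s=$south n=$north
```

With these inputs the snapped extent is one full cell wide, so the region is never narrower than the resolution.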

For the record, this reduced operation time on my project from about
240 seconds per vector category to an average of 9 seconds.

--
Matt Perry
National Center for Ecological Analysis and Synthesis
University of California, Santa Barbara
perrygeo@gmail.com

I did the same thing to my copy (resize region). In addition, I
stopped using v.db.update to update values in sqlite3. Instead I set
up an SQL transaction in a temp file and then piped all of the updates
into sqlite3 as a single operation. Saves about 3 seconds per category
on my machine (adds up if you have a lot of them).
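
A minimal sketch of that batching idea, assuming one UPDATE per category wrapped in a single transaction. The table name, column name, values, and database path are made up for illustration, and the sqlite3 call is left as a comment so the snippet runs without a database.

```shell
SQL_FILE=$(mktemp)
TABLE=parcels            # hypothetical attribute table
COLUMN=elev_mean         # hypothetical statistics column

{
    echo "BEGIN TRANSACTION;"
    # Each input line: category number, then the statistic (made-up values).
    while read -r catnum value; do
        echo "UPDATE $TABLE SET $COLUMN=$value WHERE cat=$catnum;"
    done <<'EOF'
1 412.5
2 388.0
3 401.2
EOF
    echo "COMMIT;"
} > "$SQL_FILE"

line_count=$(wc -l < "$SQL_FILE" | tr -d ' ')
first_line=$(head -n 1 "$SQL_FILE")
second_line=$(sed -n '2p' "$SQL_FILE")
last_line=$(tail -n 1 "$SQL_FILE")
echo "$line_count lines of SQL prepared"

# One sqlite3 process for the whole batch (path is hypothetical, not run here):
#   sqlite3 /path/to/sqlite.db < "$SQL_FILE"

rm -f "$SQL_FILE"
```

Wrapping the updates in one transaction means sqlite3 commits to disk once instead of once per statement, which is where most of the saving would be expected to come from.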

David

--
David Finlayson
Marine Geology & Geophysics
School of Oceanography
Box 357940
University of Washington
Seattle, WA 98195-7940
USA

Office: Marine Sciences Building, Room 112
Phone: (206) 616-9407
Web: http://students.washington.edu/dfinlays

Matthew,

today I have fixed a couple of things in v.rast.stats.
The r.univar trick is implemented as well.

Would you mind integrating the suggested v.extract trick
into the current CVS version?

Thanks

markus

On Wed, Nov 23, 2005 at 12:06:52PM -0800, Matthew Perry wrote:

[...]

--
Markus Neteler <neteler itc it> http://mpa.itc.it
ITC-irst - Centro per la Ricerca Scientifica e Tecnologica
MPBA - Predictive Models for Biol. & Environ. Data Analysis
Via Sommarive, 18 - 38050 Povo (Trento), Italy