Giovanni wrote:
Yesterday I needed to run v.rast.stats on 1793 areas covering a
4415x6632 raster (with resolution 50m/pixel). I ran it without
extended statistics, but the processing time was, to put it mildly,
very, very long.
....
I've tried to find the bottleneck, and in the end I suspect that the
problem is the script itself (the looping chain of r.mapcalc and
r.univar, and the creation and deletion of the MASK in each loop).
right, it's very inefficient.
FWIW, I had done something similar to v.rast.stats a while back. To
speed it up I used g.region to zoom in on the area of interest so the
r.mapcalc/r.univar step didn't have to run over the mostly NULL map*.
It helped a lot.
[*] IIUC, if an entire row is NULL, GRASS knows to skip that row
quickly without testing each cell, which makes processing maps with
lots of NULLs rather fast.
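
A rough illustration of why the zoom helps, as a toy Python sketch
rather than GRASS code. The grids, the function name zone_mean and the
bbox tuple are all made up for the example; None stands in for NULL,
and the bounding box is assumed to be known in advance (e.g. from the
vector feature).

```python
# Toy grids: None plays the role of NULL. All names here are hypothetical.
zones = [
    [None, None, None, None, None, None],
    [None, 1,    1,    None, None, None],
    [None, 1,    None, None, None, None],
    [None, None, None, None, 2,    2],
]
values = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    [3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    [4.0, 5.0, 6.0, 7.0, 8.0, 9.0],
]


def zone_mean(values, zones, cat, bbox):
    """Mean of `values` where `zones == cat`, scanning only `bbox`.

    bbox is (row_min, row_max, col_min, col_max), inclusive.
    Returns (mean, number_of_cells_visited).
    """
    r0, r1, c0, c1 = bbox
    total = 0.0
    n = 0
    visited = 0
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            visited += 1
            if zones[r][c] == cat and values[r][c] is not None:
                total += values[r][c]
                n += 1
    mean = total / n if n else None
    return mean, visited


# Whole map versus a region zoomed to the bounding box of zone 1.
full_mean, full_visited = zone_mean(values, zones, 1, (0, 3, 0, 5))
zoom_mean, zoom_visited = zone_mean(values, zones, 1, (1, 2, 1, 2))
print(full_mean, full_visited, zoom_mean, zoom_visited)
```

The statistic is the same either way; only the number of cells touched
changes, which is the whole point of resetting the region per zone.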
You can look at g.region.point and v.what.rast.buffer in the wiki addons
for cleaned-up versions of my zoom-to-region-of-interest solution.
I was mostly interested in calculating statistics for 100m buffers
around sampling sites; extracting irregularly shaped vector blobs
is not as easy (v.extract + g.region zoom=??), e.g. if you have two
small vector blobs in opposite corners of the map with the same cat.
In a C version you might have access to the feature's bounding box,
which could be used to temporarily reset the region to speed up raster
processing. (??)
Markus Metz:
1) Use r.reclass instead of r.mapcalc to create new masks. That
should speed up at least the MASK creation and deletion
certainly worth a try, it is far less intensive for the MASK creation.
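
To sketch the difference in spirit only (toy Python, not how either
module is implemented): a mapcalc-style mask writes out a whole new
grid per category, while a reclass-style mask keeps just a small rule
table and a reference to the base map, applying the rule when a cell
is read. The class and function names below are hypothetical.

```python
# Toy zone grid: None plays the role of NULL.
zones = [
    [None, 1, 1],
    [2,    1, None],
    [2,    2, None],
]


def mapcalc_mask(zones, cat):
    """Build a full new grid: 1 where the cell equals cat, None elsewhere."""
    return [[1 if v == cat else None for v in row] for row in zones]


class ReclassMask:
    """Store only a rule table and a reference to the base grid."""

    def __init__(self, base, cat):
        self.base = base          # no copy of the cell data is made
        self.rules = {cat: 1}     # every other value maps to None (NULL)

    def cell(self, row, col):
        """Apply the rule table at read time."""
        return self.rules.get(self.base[row][col])


full_copy = mapcalc_mask(zones, 2)
lookup = ReclassMask(zones, 2)
same = all(
    full_copy[r][c] == lookup.cell(r, c)
    for r in range(len(zones))
    for c in range(len(zones[0]))
)
print(same)
```

Creating the lookup version costs the same no matter how large the
map is, which is why swapping the MASK 1793 times should get cheaper.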
Giovanni:
Anyway, as far as I can see Glynn has rewritten the same method
(r.mapcalc and r.univar). I confirm that
v.to.rast -> (r.mapcalc to multiply/round FCELL/DCELL to CELL) ->
r.statistics -> (r.mapcalc to FCELL/DCELL again) -> r.to.vect -> join
original vector to the output one
is by far the fastest way.
are the results identical?
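
A quick way to see why I ask, as a toy Python sketch of the
multiply/round/divide-back step only. The sample values and the
scale factor of 1000 are invented for illustration; the real
multiplier is whatever was used in the r.mapcalc step.

```python
def mean_direct(values):
    """Mean computed on the floating point values as they are."""
    return sum(values) / len(values)


def mean_via_integers(values, scale=1000):
    """Multiply and round to integers, average, then divide back."""
    ints = [round(v * scale) for v in values]
    return (sum(ints) / len(ints)) / scale


samples = [0.12345, 0.6789, 1.00049]   # made-up cell values
scale = 1000                           # assumed multiplier
diff = abs(mean_via_integers(samples, scale) - mean_direct(samples))
bound = 0.5 / scale                    # worst-case rounding error per cell
print(diff, bound)
```

For a mean, the two answers can differ by up to half a unit of the
scaled precision, so the results are close but not necessarily
identical unless the multiplier preserves every significant digit.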
Markus Metz:
if v.rast.stats is faster in grass7, it is probably because of the
improved raster libs. A speed increase from >5 hours to 40 seconds is
unlikely since grass.mapcalc is still called 1793 times (assuming each
area has a unique category) for a region with 4415x6632 cells...
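
Back-of-the-envelope arithmetic for the figures quoted above, assuming
each of the 1793 mapcalc calls makes one full pass over the region
(an assumption; the subsequent r.univar pass would add to this):

```python
areas = 1793                 # number of areas, from the thread
dim_a, dim_b = 4415, 6632    # region dimensions, from the thread

cells_per_pass = dim_a * dim_b
total_cell_visits = areas * cells_per_pass

print(cells_per_pass)        # cells read in one full-region pass
print(total_cell_visits)     # cells read over all per-area passes
```

That is about 29 million cells per pass, and tens of billions of cell
reads in total, most of them NULL under the MASK.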
If sticking with the MASK + r.univar method, the moving-window region
zoom/reset trick could help.
Giovanni:
The bottleneck is the r.univar limitation to CELL.
Did you mean r.statistics, not r.univar?
Markus Metz:
r.univar2.zonal does zonal statistics
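
For comparison with the per-area MASK loop, here is the general idea
of zonal statistics in a single pass, as a toy Python sketch. It is
not the r.univar2.zonal code; the function name and the grids are
hypothetical, and None stands in for NULL.

```python
def zonal_stats(values, zones):
    """Accumulate n, sum, min, max and mean per zone in one pass."""
    stats = {}
    for value_row, zone_row in zip(values, zones):
        for v, z in zip(value_row, zone_row):
            if v is None or z is None:
                continue  # skip NULL cells in either map
            s = stats.get(z)
            if s is None:
                stats[z] = {"n": 1, "sum": v, "min": v, "max": v}
            else:
                s["n"] += 1
                s["sum"] += v
                s["min"] = min(s["min"], v)
                s["max"] = max(s["max"], v)
    for s in stats.values():
        s["mean"] = s["sum"] / s["n"]
    return stats


zones = [
    [None, 1,    1],
    [2,    1,    None],
    [2,    2,    None],
]
values = [
    [9.0, 3.0, 4.0],
    [8.0, 5.0, 1.0],
    [6.0, None, 2.0],
]
result = zonal_stats(values, zones)
print(result)
```

Every cell is read once regardless of how many zones there are, so
the cost no longer scales with the number of areas.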
pssst- "svn copy" not "svn add". It works between -addons and trunk.
Hamish