[GRASS-user] Zonal Raster Statistics with overlapping polygons

Hi all,

I'm struggling to do Zonal Raster Statistics using v.rast.stats in GRASS GIS.
This tool works neatly and fast, but seems it doesn't deal with overlapping polygons.

I'm using a polygon vector layer with overlapped polygons, I know, is not
topologically correct, but those polygons does't describe physical feature but
just some figurative mask for the statistics, therefore should exist overlapping
and I want the statisics calculated over the whole polygons.

Using v.rast.stats with this vector layer the statistics seem to be calculated
only over the non overlapping part of the feature, I've also tried to import
the vector layer (originally was a shapefile) without build the topology,
but the results seem the same.

I've test also the QGIS ZonalStats plugin, and it works exactly how I want,
the statistics for each polygon are calculate for the whole polygon, is there a method
to do the same in GRASS GIS?

Have someone some hint, possibly smarter and faster than looping ever each feature
and then join each table?

Thanks in advance,
Leonardo

hi
I had a similar problem
I looped over the polygons (with a script)

Grazia

···

2017-02-24 11:48 GMT+01:00 Leonardo <leonardo@mobygis.com>:

Hi all,

I’m struggling to do Zonal Raster Statistics using v.rast.stats in GRASS GIS.
This tool works neatly and fast, but seems it doesn’t deal with overlapping polygons.

I’m using a polygon vector layer with overlapped polygons, I know, is not
topologically correct, but those polygons does’t describe physical feature but
just some figurative mask for the statistics, therefore should exist overlapping
and I want the statisics calculated over the whole polygons.

Using v.rast.stats with this vector layer the statistics seem to be calculated
only over the non overlapping part of the feature, I’ve also tried to import
the vector layer (originally was a shapefile) without build the topology,
but the results seem the same.

I’ve test also the QGIS ZonalStats plugin, and it works exactly how I want,
the statistics for each polygon are calculate for the whole polygon, is there a method
to do the same in GRASS GIS?

Have someone some hint, possibly smarter and faster than looping ever each feature
and then join each table?

Thanks in advance,
Leonardo


grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user

Thanks for your answer,
I had already thought of a solution of this type,
but I’m looking for a cleaner and smarter solution,
any hints?!

Leonardo

···

On 24/02/2017 15:25, Gra wrote:

hi
I had a similar problem
I looped over the polygons (with a script)

Grazia

2017-02-24 11:48 GMT+01:00 Leonardo <leonardo@mobygis.com>:

Hi all,

I’m struggling to do Zonal Raster Statistics using v.rast.stats in GRASS GIS.
This tool works neatly and fast, but seems it doesn’t deal with overlapping polygons.

I’m using a polygon vector layer with overlapped polygons, I know, is not
topologically correct, but those polygons does’t describe physical feature but
just some figurative mask for the statistics, therefore should exist overlapping
and I want the statisics calculated over the whole polygons.

Using v.rast.stats with this vector layer the statistics seem to be calculated
only over the non overlapping part of the feature, I’ve also tried to import
the vector layer (originally was a shapefile) without build the topology,
but the results seem the same.

I’ve test also the QGIS ZonalStats plugin, and it works exactly how I want,
the statistics for each polygon are calculate for the whole polygon, is there a method
to do the same in GRASS GIS?

Have someone some hint, possibly smarter and faster than looping ever each feature
and then join each table?

Thanks in advance,
Leonardo


grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user

Le Mon, 27 Feb 2017 09:41:11 +0100,
Leonardo <leonardo@mobygis.com> a écrit :

Thanks for your answer,
I had already thought of a solution of this type,
but I'm looking for a cleaner and smarter solution,
any hints?!

Well the "smarter" in your case would imply to use non-topological
vector format, which was decided for GRASS GIS to not be "smarter".
Yes, this makes some operations a bit more complicated, but you get
paid back with so much less hassle because of spaghetti data. And
writing a simple loop over v.rast.stats is not that difficult. The only
difficulty I see is that v.rast.stats does not have a where= nor cat=
feature selection parameter. This implies that you have to run
v.extract each time. Having these parameters would certainly be useful
and you could file an enhancement wish for those/

I don't know why this would be less "clean"...

The IMHO "cleanest" solution is actually the one that GRASS GIS
implements: When you import a file with overlapping polygons, GRASS GIS
creates a vector map in which the overlapping parts of the polygons
have the category ids of all the polygons that overlap in that area.
And when you run v.rast.stats on that, it will fill the total of each
category value in the attribute table.

Attached you can find an image of a test map with polygon 3 overlapping
both polygon 1 and 2. I also created a second file with only the 2
non-overlapping polygons. Those two were created in QGIS so as to be
able to have "nice" topologically incorrect data.

I then imported the two files into GRASS GIS with v.in.ogr and ran

v.rast.stats map=test raster=elevation column_prefix=e
v.rast.stats map=test_nooverlap raster=elevation column_prefix=e

And here are the results:

v.db.select test

cat|id|e_number|e_minimum|e_maximum|e_range|e_average|e_stddev|e_variance|e_coeff_var|e_sum|e_first_quartile|e_median|e_third_quartile|e_percentile_90
1|1|242621|76.8941955566406|145.712738037109|68.8185424804688|114.397437714797|13.5650727983398|184.01120002426|11.8578467047128|27755220.7358017|104.996|115.337|124.654|131.38
2|2|131854|77.3364105224609|126.733070373535|49.3966598510742|99.4588066985328|10.6819214497898|114.103445859479|10.7400458585508|13114041.4984283|91.5773|98.7388|106.926|114.645
3|3|99039|71.4758224487305|139.941131591797|68.4653091430664|107.223648245541|13.7481896908839|189.012719776527|12.8219752972784|10619322.8985901|97.4993|105.891|118.752|126.703

v.db.select test_nooverlap

cat|id|e_number|e_minimum|e_maximum|e_range|e_average|e_stddev|e_variance|e_coeff_var|e_sum|e_first_quartile|e_median|e_third_quartile|e_percentile_90
1|1|242621|76.8941955566406|145.712738037109|68.8185424804688|114.397437714797|13.5650727983398|184.01120002426|11.8578467047128|27755220.7358017|104.996|115.337|124.654|131.38
2|2|131854|77.3364105224609|126.733070373535|49.3966598510742|99.4588066985328|10.6819214497898|114.103445859479|10.7400458585508|13114041.4984283|91.5773|98.7388|106.926|114.645

You can see that the values for polygons 1 and 2 are strictly
identical. Overlapping areas are thus correctly taken into account.

To test even further, I ran

v.extract -d test cat=3 out=test3

to get the entire polygon with cat=3.

Then

v.rast.stats test3 rast=elevation colprefix=e -c

v.db.select test3

cat|id|e_number|e_minimum|e_maximum|e_range|e_average|e_stddev|e_variance|e_coeff_var|e_sum|e_first_quartile|e_median|e_third_quartile|e_percentile_90
3|3|224074|71.4758224487305|141.793502807617|70.3176803588867|110.21428988784|13.2281044492473|174.982747320197|12.0021681968|24696156.7923279|100.265|110.004|120.916|127.836

There thus seems to be a bug in v.rast.stats as it does not calculate
the correct values for the polygon with cat=3 when there area polygons
with multiple cat values.

I'll post a bug report.

Moritz

(attachments)

example_overlapping_polygons.png

Le Mon, 27 Feb 2017 12:35:11 +0100,
Moritz Lennert <mlennert@club.worldonline.be> a écrit :

Le Mon, 27 Feb 2017 09:41:11 +0100,
Leonardo <leonardo@mobygis.com> a écrit :

[...]

Well the "smarter" in your case would imply to use non-topological
vector format, which was decided for GRASS GIS to not be "smarter".
Yes, this makes some operations a bit more complicated, but you get
paid back with so much less hassle because of spaghetti data. And
writing a simple loop over v.rast.stats is not that difficult. The
only difficulty I see is that v.rast.stats does not have a where= nor
cat= feature selection parameter. This implies that you have to run
v.extract each time. Having these parameters would certainly be useful
and you could file an enhancement wish for those/

I don't know why this would be less "clean"...

The IMHO "cleanest" solution is actually the one that GRASS GIS
implements: When you import a file with overlapping polygons, GRASS
GIS creates a vector map in which the overlapping parts of the
polygons have the category ids of all the polygons that overlap in
that area. And when you run v.rast.stats on that, it will fill the
total of each category value in the attribute table.

Attached you can find an image of a test map with polygon 3
overlapping both polygon 1 and 2. I also created a second file with
only the 2 non-overlapping polygons. Those two were created in QGIS
so as to be able to have "nice" topologically incorrect data.

I then imported the two files into GRASS GIS with v.in.ogr and ran

v.rast.stats map=test raster=elevation column_prefix=e
v.rast.stats map=test_nooverlap raster=elevation column_prefix=e

And here are the results:

> v.db.select test
cat|id|e_number|e_minimum|e_maximum|e_range|e_average|e_stddev|e_variance|e_coeff_var|e_sum|e_first_quartile|e_median|e_third_quartile|e_percentile_90
1|1|242621|76.8941955566406|145.712738037109|68.8185424804688|114.397437714797|13.5650727983398|184.01120002426|11.8578467047128|27755220.7358017|104.996|115.337|124.654|131.38
2|2|131854|77.3364105224609|126.733070373535|49.3966598510742|99.4588066985328|10.6819214497898|114.103445859479|10.7400458585508|13114041.4984283|91.5773|98.7388|106.926|114.645
3|3|99039|71.4758224487305|139.941131591797|68.4653091430664|107.223648245541|13.7481896908839|189.012719776527|12.8219752972784|10619322.8985901|97.4993|105.891|118.752|126.703

> v.db.select test_nooverlap
cat|id|e_number|e_minimum|e_maximum|e_range|e_average|e_stddev|e_variance|e_coeff_var|e_sum|e_first_quartile|e_median|e_third_quartile|e_percentile_90
1|1|242621|76.8941955566406|145.712738037109|68.8185424804688|114.397437714797|13.5650727983398|184.01120002426|11.8578467047128|27755220.7358017|104.996|115.337|124.654|131.38
2|2|131854|77.3364105224609|126.733070373535|49.3966598510742|99.4588066985328|10.6819214497898|114.103445859479|10.7400458585508|13114041.4984283|91.5773|98.7388|106.926|114.645

You can see that the values for polygons 1 and 2 are strictly
identical. Overlapping areas are thus correctly taken into account.

To test even further, I ran

v.extract -d test cat=3 out=test3

to get the entire polygon with cat=3.

Then

v.rast.stats test3 rast=elevation colprefix=e -c
> v.db.select test3
cat|id|e_number|e_minimum|e_maximum|e_range|e_average|e_stddev|e_variance|e_coeff_var|e_sum|e_first_quartile|e_median|e_third_quartile|e_percentile_90
3|3|224074|71.4758224487305|141.793502807617|70.3176803588867|110.21428988784|13.2281044492473|174.982747320197|12.0021681968|24696156.7923279|100.265|110.004|120.916|127.836

There thus seems to be a bug in v.rast.stats as it does not calculate
the correct values for the polygon with cat=3 when there area polygons
with multiple cat values.

I'll post a bug report.

Done: https://trac.osgeo.org/grass/ticket/3300

Moritz

I had understood that v.rast.stats converted the vector file into raster and hence the order in which the vectors were drawn, only one value was generated for the overlapping areas in the raster. This rasterized version was then used to compute the statistics.
Hence, the overlapping areas matched for polygon 1 and not for polygon 3 as the overlap area was coded to “1” in the overlap area during raster conversion.
Regards,
Sitansu

···

On Mon, Feb 27, 2017 at 5:05 PM, Moritz Lennert <mlennert@club.worldonline.be> wrote:

Le Mon, 27 Feb 2017 09:41:11 +0100,
Leonardo <leonardo@mobygis.com> a écrit :

Thanks for your answer,
I had already thought of a solution of this type,
but I’m looking for a cleaner and smarter solution,
any hints?!

Well the “smarter” in your case would imply to use non-topological
vector format, which was decided for GRASS GIS to not be “smarter”.
Yes, this makes some operations a bit more complicated, but you get
paid back with so much less hassle because of spaghetti data. And
writing a simple loop over v.rast.stats is not that difficult. The only
difficulty I see is that v.rast.stats does not have a where= nor cat=
feature selection parameter. This implies that you have to run
v.extract each time. Having these parameters would certainly be useful
and you could file an enhancement wish for those/

I don’t know why this would be less “clean”…

The IMHO “cleanest” solution is actually the one that GRASS GIS
implements: When you import a file with overlapping polygons, GRASS GIS
creates a vector map in which the overlapping parts of the polygons
have the category ids of all the polygons that overlap in that area.
And when you run v.rast.stats on that, it will fill the total of each
category value in the attribute table.

Attached you can find an image of a test map with polygon 3 overlapping
both polygon 1 and 2. I also created a second file with only the 2
non-overlapping polygons. Those two were created in QGIS so as to be
able to have “nice” topologically incorrect data.

I then imported the two files into GRASS GIS with v.in.ogr and ran

v.rast.stats map=test raster=elevation column_prefix=e
v.rast.stats map=test_nooverlap raster=elevation column_prefix=e

And here are the results:

v.db.select test
cat|id|e_number|e_minimum|e_maximum|e_range|e_average|e_stddev|e_variance|e_coeff_var|e_sum|e_first_quartile|e_median|e_third_quartile|e_percentile_90
1|1|242621|76.8941955566406|145.712738037109|68.8185424804688|114.397437714797|13.5650727983398|184.01120002426|11.8578467047128|27755220.7358017|104.996|115.337|124.654|131.38
2|2|131854|77.3364105224609|126.733070373535|49.3966598510742|99.4588066985328|10.6819214497898|114.103445859479|10.7400458585508|13114041.4984283|91.5773|98.7388|106.926|114.645
3|3|99039|71.4758224487305|139.941131591797|68.4653091430664|107.223648245541|13.7481896908839|189.012719776527|12.8219752972784|10619322.8985901|97.4993|105.891|118.752|126.703

v.db.select test_nooverlap
cat|id|e_number|e_minimum|e_maximum|e_range|e_average|e_stddev|e_variance|e_coeff_var|e_sum|e_first_quartile|e_median|e_third_quartile|e_percentile_90
1|1|242621|76.8941955566406|145.712738037109|68.8185424804688|114.397437714797|13.5650727983398|184.01120002426|11.8578467047128|27755220.7358017|104.996|115.337|124.654|131.38
2|2|131854|77.3364105224609|126.733070373535|49.3966598510742|99.4588066985328|10.6819214497898|114.103445859479|10.7400458585508|13114041.4984283|91.5773|98.7388|106.926|114.645

You can see that the values for polygons 1 and 2 are strictly
identical. Overlapping areas are thus correctly taken into account.

To test even further, I ran

v.extract -d test cat=3 out=test3

to get the entire polygon with cat=3.

Then

v.rast.stats test3 rast=elevation colprefix=e -c

v.db.select test3
cat|id|e_number|e_minimum|e_maximum|e_range|e_average|e_stddev|e_variance|e_coeff_var|e_sum|e_first_quartile|e_median|e_third_quartile|e_percentile_90
3|3|224074|71.4758224487305|141.793502807617|70.3176803588867|110.21428988784|13.2281044492473|174.982747320197|12.0021681968|24696156.7923279|100.265|110.004|120.916|127.836

There thus seems to be a bug in v.rast.stats as it does not calculate
the correct values for the polygon with cat=3 when there area polygons
with multiple cat values.

I’ll post a bug report.

Moritz


grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user

Le Mon, 27 Feb 2017 17:45:37 +0530,
Sitansu Pattnaik <sitansu@gmail.com> a écrit :

I had understood that v.rast.stats converted the vector file into
raster and hence the order in which the vectors were drawn, only one
value was generated for the overlapping areas in the raster. This
rasterized version was then used to compute the statistics.
Hence, the overlapping areas matched for polygon 1 and not for
polygon 3 as the overlap area was coded to "1" in the overlap area
during raster conversion.

Yes, that is correct. I just posted the approximate path to a possible
solution in the trac ticket [1].

Moritz

[1] https://trac.osgeo.org/grass/ticket/3300

Hi,

2017-02-24 15:25 GMT+01:00 Gra <graziaz@gmail.com>:

I had a similar problem
I looped over the polygons (with a script)

right, thats only way for overlapping polygons. Recently I wrote
similar script to compute statistics for overlapping tiles (polygons)
linked from PostGIS database [1]. (Link flag requires GRASS 7.3 -
trunk). Ma

[1] https://trac.osgeo.org/grass/browser/sandbox/martinl/tiles-rast-stats.py

--
Martin Landa
http://geo.fsv.cvut.cz/gwiki/Landa
http://gismentors.cz/mentors/landa

Thank you all for the detailed answers, also with attached example!

I totally agree with you, Moritz, about the "smart" and "clean" concept, the way GRASS
manage topology and the statistics in you example for polygon 1 and 2 is perfect
(it correctly takes into account overlapping areas), but the problem became with polygon 3.

Before posting the question I build an example similar to yours one,
but with only two overlapped polygons, polygon 1 has statistics calculated
over the whole area (also the overlapped part, like yours 1 and 2) but the
polygon two has only stats calculated over the non overlapped area (like your 3).
This behavior is perfectly coherent with the technical explanation given from
Sitansu, only the first polygon found in the rasterization is used,
then I am afraid that is not a bug as you suppose, but the behavior of v.rast.stats.

Martin, what a displeasure hear from you too that the only way is writing a loop,
thank you very much for sharing your script that does it!

Leonardo

On 27/02/2017 13:51, Martin Landa wrote:

Hi,

2017-02-24 15:25 GMT+01:00 Gra <graziaz@gmail.com>:

I had a similar problem
I looped over the polygons (with a script)

right, thats only way for overlapping polygons. Recently I wrote
similar script to compute statistics for overlapping tiles (polygons)
linked from PostGIS database [1]. (Link flag requires GRASS 7.3 -
trunk). Ma

[1] https://trac.osgeo.org/grass/browser/sandbox/martinl/tiles-rast-stats.py