On Fri, Oct 20, 2017 at 10:07 PM, Helmut Kudrnovsky <hellik@web.de> wrote:
Can you provide examples where v.in.ogr with cleaning/polygon conversion did
not work, but v.in.ogr -c + v.clean produced better results?
Really good data sets with all kinds of (topological) mess (from [correct
and incorrect] overlaps to self-intersections etc.) are:
o Natura 2000 data (~ 1GB):
https://www.eea.europa.eu/data-and-maps/data/natura-8#tab-gis-data
Apparently, there are a lot of polygons in Natura2000 data that are really overlapping, e.g.
SITECODE: UK0030395
SITENAME: Southern North Sea
with
SITECODE: UK0030352
SITENAME: Dogger Bank
Maybe some sites have been updated (both spatial delineation and name), but the old versions have not been deleted
v.in.ogr without snapping gives me lots of warnings about
WARNING: Unable to calculate area centroid
This is a symptom of floating point precision errors, so I tried v.in.ogr with snapping. With snap=1e-3, these warnings disappeared. Small areas could still be removed with v.clean tool=rmarea. In this example, there are still 75 areas smaller than 100 square meters, which are most probably noise.
A hint for snapping: for these data, v.in.ogr suggests a range of [1e-08, 1] for suitable snapping values. The exponent thus ranges from -8 to 0. Testing all possible values in this range obviously takes a lot of time, so a bisection over the exponent is faster.
You could set low = -8, high = 0, and set mid to (low + high) / 2 = -4.
Test with snap=1e$mid.
If you still get warnings, increase: set low to mid and compute a new mid with (low + high) / 2;
else, decrease: set high to mid and compute a new mid with (low + high) / 2.
Continue until you have found the threshold where these warnings just disappear.
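The bisection described above can be sketched in Python. This is only an illustration under stated assumptions: find_snap_exponent, has_warnings and tol are hypothetical names, and has_warnings stands for whatever you use to run v.in.ogr with a given snap value and check its output for the centroid warning. The sketch also assumes that the warnings are present at the low end, absent at the high end, and disappear monotonically as snap grows.

```python
def find_snap_exponent(has_warnings, low=-8.0, high=0.0, tol=0.25):
    """Bisect the exponent e of snap=10**e.

    has_warnings(snap) is a hypothetical callable returning True while
    the import with that snap value still produces warnings.
    Returns an exponent without warnings that lies within tol of the
    point where the warnings disappear.
    """
    while high - low > tol:
        mid = (low + high) / 2.0
        if has_warnings(10.0 ** mid):
            low = mid    # still warnings: a larger snap value is needed
        else:
            high = mid   # clean import: try a smaller snap value
    return high


if __name__ == "__main__":
    # Stand-in predicate for demonstration only; a real one would run the import.
    print(find_snap_exponent(lambda snap: snap < 5e-4))
```

With the default range this needs five test imports instead of one per candidate exponent; a larger tol stops earlier if only the order of magnitude matters.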
Snapping is slow and uses quite a bit of memory because it needs a spatial search tree. The nearest-neighbor tree (kd tree) currently used could do with some more optimization, but I (as the author of that beast) would need quite a bit of time to come up with a faster balancing method.
o World database of protected areas (~ 1 GB):
https://www.protectedplanet.net/
The real cleaning happens only if the snap option is set to > 0.
v.in.ogr gives some hints about the snap option, but sometimes I don’t know what
the optimal setting should be.
I noticed that v.in.ogr complains about overlapping areas, which were input
polygons that should not overlap, but snapping did not help there; instead
I needed to remove small areas afterwards with v.clean.
same experiences here.
Should the current min_area option of v.in.ogr also be used to remove small
areas in the output?
never used this option:
min_area=float
Minimum size of area to be imported (square meters)
Smaller areas and islands are ignored. Should be greater than snap^2
Default: 0.0001
Do you mean that the small areas shouldn’t be imported, or that small areas should be
added to the neighboring area with the longest adjacent boundary?
Thinking about it, small areas should be removed afterwards with v.clean tool=rmarea.
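As a rough sketch of that two-step workflow (import with snapping, then remove small areas), the following Python helper only builds the two command lines; it does not run them. The function name, the map names and the input path are hypothetical, and the values snap=1e-3 and 100 square meters are simply the ones from the Natura 2000 example above, not recommendations.

```python
def import_and_clean_commands(source, raw_map, clean_map, snap, min_size):
    """Return argument lists for v.in.ogr followed by v.clean tool=rmarea.

    source, raw_map and clean_map are placeholder names chosen by the caller.
    min_size is the rmarea threshold in square meters.
    """
    import_cmd = [
        "v.in.ogr",
        f"input={source}",
        f"output={raw_map}",
        f"snap={snap:g}",
    ]
    clean_cmd = [
        "v.clean",
        f"input={raw_map}",
        f"output={clean_map}",
        "tool=rmarea",
        f"threshold={min_size:g}",
    ]
    return [import_cmd, clean_cmd]


if __name__ == "__main__":
    # Hypothetical file and map names, for illustration only.
    for cmd in import_and_clean_commands(
        "natura2000.shp", "natura_raw", "natura_clean", 1e-3, 100
    ):
        print(" ".join(cmd))
```

Inside a running GRASS session, each list could be passed to subprocess.run, or the same parameters could be given to the modules directly on the command line.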
Markus M
best regards
Helmut
grass-user mailing list
grass-user@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-user