[GRASS-dev] v.build: why is attaching islands taking so long?

Hi

Can anybody explain why "Attaching islands..." in v.build takes so
very long compared to other operations?

For example: I have a big vector map consisting only of lines, more than
180 000 of them. v.build on this dataset takes about 20 seconds. After
converting the lines into boundaries (v.type), v.build on the new,
boundaries-only vector map takes half a day, with 99% of the processing
time spent on "Attaching islands...". Is that normal? Are any
improvements possible?

Maciek

--
Maciej Sieczka
www.sieczka.org

Maciej Sieczka wrote:

> Can anybody explain why "Attaching islands..." in v.build takes so
> very long compared to other operations?
> For example: I have a big vector map consisting only of lines, more than
> 180 000 of them. v.build on this dataset takes about 20 seconds. After
> converting the lines into boundaries (v.type), v.build on the new,
> boundaries-only vector map takes half a day, with 99% of the processing
> time spent on "Attaching islands...". Is that normal? Are any
> improvements possible?

the not-thinking-about-it-much guess is to blame the "florida coastline"
problem. to test if that is the problem, try running 'v.split vert=1000'
before v.type and see if it still takes a long time.

you might try running a subset through a profiling tool if you like:
  http://grass.osgeo.org/wiki/GRASS_Debugging#Using_a_profiling_tool

does the boundary data contain many small slivers?

Hamish
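
The splitting idea suggested above can be pictured with a small sketch. This is not GRASS code and not how v.split is implemented; `split_by_vertices` is a hypothetical helper that only shows what "no piece has more than N vertices" means for a single polyline.

```python
def split_by_vertices(coords, max_vertices):
    """Split a polyline into pieces of at most max_vertices vertices.

    Consecutive pieces share their junction vertex, so the pieces
    still form one connected line. Hypothetical helper, not GRASS API.
    """
    if max_vertices < 2:
        raise ValueError("max_vertices must be at least 2")
    if len(coords) <= max_vertices:
        return [list(coords)]
    pieces = []
    step = max_vertices - 1  # the last vertex of a piece starts the next one
    start = 0
    while start < len(coords) - 1:
        pieces.append(list(coords[start:start + max_vertices]))
        start += step
    return pieces


# A made-up 10-vertex line, split into pieces of at most 4 vertices.
line = [(i, i * i) for i in range(10)]
pieces = split_by_vertices(line, 4)
for piece in pieces:
    print(len(piece), piece[0], piece[-1])
```

Each piece keeps the original vertices unchanged; only the grouping changes, which is why the geometry itself is not altered by such a split.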

Hamish wrote:

> Maciej Sieczka wrote:
>
>> Can anybody explain why "Attaching islands..." in v.build takes so
>> very long compared to other operations? For example: I have a big
>> vector map consisting only of lines, more than 180 000 of them.
>> v.build on this dataset takes about 20 seconds. After converting
>> the lines into boundaries (v.type), v.build on the new,
>> boundaries-only vector map takes half a day, with 99% of the
>> processing time spent on "Attaching islands...". Is that normal?
>> Are any improvements possible?
>
> the not-thinking-about-it-much guess is to blame the "florida
> coastline" problem. to test if that is the problem, try running
> 'v.split vert=1000' before v.type and see if it still takes a long
> time.

'v.split vert=1000' reduces the processing time from over 10 hours to 10
minutes. Strange world we live in.

> you might try running a subset through a profiling tool if you like:
> http://grass.osgeo.org/wiki/GRASS_Debugging#Using_a_profiling_tool

If I find time I'll try and post here.

> does the boundary data contain many small slivers?

Yes. It's GSHHS actually.

Thanks for the hints, Hamish!

Maciek

--
Maciej Sieczka
www.sieczka.org

Maciej Sieczka wrote:

>> Can anybody explain why "Attaching islands..." in v.build takes so
>> very long compared to other operations? For example: I have a big
>> vector map consisting only of lines, more than 180 000 of them.
>> v.build on this dataset takes about 20 seconds. After converting
>> the lines into boundaries (v.type), v.build on the new,
>> boundaries-only vector map takes half a day, with 99% of the
>> processing time spent on "Attaching islands...". Is that normal?
>> Are any improvements possible?

Hamish:

> the not-thinking-about-it-much guess is to blame the "florida
> coastline" problem. to test if that is the problem, try running
> 'v.split vert=1000' before v.type and see if it still takes a long
> time.

'v.split vert=1000' reduces the processing time from over 10 hours to
10 minutes. Strange world we live in.

> you might try running a subset through a profiling tool if you like:
> http://grass.osgeo.org/wiki/GRASS_Debugging#Using_a_profiling_tool

If I find time I'll try and post here.

not really required to run the profiler now that we know what the
problem probably is. from Radim's TODO:
  http://trac.osgeo.org/grass/browser/grass/trunk/doc/vector/TODO#L241

"v.in.ogr
--------
It would be useful to split long boundaries to smaller
pieces. Otherwise cleaning process can become very slow because
bounding box of long boundaries can overlap large part of the map (for
example outline around all areas) and cleaning process is checking
intersection with all boundaries falling in the bounding box."

see
http://article.gmane.org/gmane.comp.gis.grass.devel/7400
   mailing list link there needs to be adjusted by +193, now:
http://lists.osgeo.org/pipermail/grass-user/2005-April/028711.html
   and
http://intevation.de/rt/webrt?serial_num=3161

Hamish
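
The bounding-box effect described in the quoted TODO can be illustrated with a toy count. This is only a sketch under stated assumptions: the outline and islands are made-up data, all function names are hypothetical, and GRASS itself uses a spatial index rather than the brute-force loop shown here. The point is just how many candidates a bounding-box filter lets through before and after splitting a long boundary.

```python
def bbox(coords):
    """Return (xmin, ymin, xmax, ymax) of a list of (x, y) vertices."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def overlaps(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidates(boundaries, others):
    """Count (boundary, other) pairs whose bounding boxes overlap."""
    other_boxes = [bbox(o) for o in others]
    return sum(overlaps(bbox(b), ob) for b in boundaries for ob in other_boxes)


def chunks(coords, max_vertices):
    """Cut a polyline into pieces of at most max_vertices vertices."""
    step = max_vertices - 1
    return [coords[i:i + max_vertices] for i in range(0, len(coords) - 1, step)]


def ring(size, step):
    """Densely sampled square outline from (0, 0) to (size, size)."""
    pts = [(x, 0) for x in range(0, size, step)]
    pts += [(size, y) for y in range(0, size, step)]
    pts += [(x, size) for x in range(size, 0, -step)]
    pts += [(0, y) for y in range(size, 0, -step)]
    pts.append((0, 0))
    return pts


# Made-up data: one outline around a 10 x 10 grid of small square islands.
outline = ring(100, 1)
islands = [
    [(x, y), (x + 2, y), (x + 2, y + 2), (x, y + 2), (x, y)]
    for x in range(4, 100, 10)
    for y in range(4, 100, 10)
]

whole = candidates([outline], islands)
split = candidates(chunks(outline, 11), islands)
print("candidates with one long boundary:", whole)
print("candidates after splitting:", split)
```

With the unsplit outline every one of the 100 islands falls inside its bounding box; once the outline is cut into short pieces, each piece hugs one edge and none of the islands pass the filter in this toy layout.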

Hamish wrote:

> from Radim's
>   http://trac.osgeo.org/grass/browser/grass/trunk/doc/vector/TODO#L241
>
> "v.in.ogr
> --------
> It would be useful to split long boundaries to smaller
> pieces. Otherwise cleaning process can become very slow because
> bounding box of long boundaries can overlap large part of the map (for
> example outline around all areas) and cleaning process is checking
> intersection with all boundaries falling in the bounding box."
>
> see
> http://article.gmane.org/gmane.comp.gis.grass.devel/7400
>    mailing list link there needs to be adjusted by +193, now:
> http://lists.osgeo.org/pipermail/grass-user/2005-April/028711.html
>    and
> http://intevation.de/rt/webrt?serial_num=3161

Hi,

Skimming through the GDAL 1.6 announcement I noticed the following:

"Add a segmentize() method to OGRGeometry to modify the geometry such it
has no segment longer then the given distance; add a -segmentize option
to ogr2ogr"

Relevant?

Maciek

--
Maciej Sieczka
www.sieczka.org

Hamish wrote:
"Florida coastline" problem

Maciej wrote:

> Skimming through the GDAL 1.6 announcement I noticed the following:
>
> "Add a segmentize() method to OGRGeometry to modify the geometry such it
> has no segment longer then the given distance; add a -segmentize option
> to ogr2ogr"
>
> Relevant?

perhaps, but the split would need to be by number of vertices, not map
distance - as maps can be of many scales and there is no single "max
distance". (I guess you could calculate map bounds and use 5% of max
side..) And you'd need a GDAL version check to stay backwards compatible
with older GDALs for the next 2-3 years or so. (extended Debian life cycle)

but I wonder, would a simple, but large, box of 4 vertices (corners) which
surrounds a complicated vector map (many islands) still take so long to
build? in that case, reducing the number of sequential vertices wouldn't
help.
???

probably need to run some trials with v.in.region + v.patch to test that.

Hamish
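
The "5% of max side" idea mentioned above could look roughly like the following. This is a plain-Python sketch, not the GDAL segmentize() implementation; `max_dist_from_bounds` and `segmentize` are hypothetical helpers, and the 0.05 fraction is simply the value floated in the message. It also shows that a distance-based split, unlike a vertex-count split, does break up a large 4-corner box.

```python
import math


def max_dist_from_bounds(coords, fraction=0.05):
    """Maximum segment length as a fraction of the longer side of the bounds."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return fraction * max(max(xs) - min(xs), max(ys) - min(ys))


def segmentize(coords, max_dist):
    """Insert vertices so that no segment is longer than max_dist."""
    out = [coords[0]]
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        n = max(1, math.ceil(length / max_dist))  # sub-segments for this edge
        for k in range(1, n + 1):
            t = k / n
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


# A large box with only 4 corners (closed ring), as in the question above.
box = [(0.0, 0.0), (200.0, 0.0), (200.0, 100.0), (0.0, 100.0), (0.0, 0.0)]
limit = max_dist_from_bounds(box)
dense = segmentize(box, limit)
print(len(box), "vertices before,", len(dense), "after; limit =", limit)
```

Whether such a split would actually speed up the build for the 4-corner case is exactly the open question; the v.in.region + v.patch trial suggested above would be the way to find out.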