[GRASS-dev] Some doubts about GRASS topology

Dear all,

in an attempt to better understand the GRASS vector and topology
model, I imported a set of 3 polygons from an ESRI Shapefile (see
attachment). The polygon in the upper left has 4 holes (called
islands for some reason by GRASS), the lower one consists of 3
parts (QGIS calls this a polygon with islands -- good to know we
understand each other in the GIS world!). The third is a simple,
convex shape.

Displaying the imported map shows all geometries exactly as it
should. So far so good.

Now, when I run v.info on the imported map, I get:

Number of lines: 0
Number of boundaries: 9
Number of centroids: 5
Number of areas: 9
Number of islands: 9

This completely baffles me!

The GRASS documentation consistently states that an area
is a boundary + a centroid + any number of "islands".

Now, assuming that the lines around the four "islands" count
as boundaries, I understand why there are 9 boundaries
altogether. 5 centroids also check out, given that there
is no 1:1 equivalent for a shapefile multipart polygon in GRASS.

But how in the (GRASS) world can there be 9 areas if there
are only 5 centroids? And why 9 islands? It's a mystery to
me. After all those days working on v.out.ogr et al., this
sort of thing leaves me thinking I have not understood anything
about the GRASS vector model at all.

Could someone clarify please?

Thanks,

Ben

P.S.: I have also attached the simple Shapefile I used.

------
Files attached to this email may be in ISO 26300 format (OASIS Open Document Format). If you have difficulty opening them, please visit http://iso26300.info for more information.

(attachments)

polytest.jpeg
polytest.zip (2.28 KB)

Benjamin Ducke wrote:

Dear all,

in an attempt to better understand the GRASS vector and topology
model, I imported a set of 3 polygons from an ESRI Shapefile (see
attachment). The polygon in the upper left has 4 holes (called
islands for some reason by GRASS), the lower one consists of 3
parts (QGIS calls this a polygon with islands -- good to know we understand each other in the GIS world!).

Yes, maybe the way the term islands is used in GRASS is a bit misleading. In terms of the simple feature specifications, GRASS islands are (more or less, not sure if 100%) equivalent to holes.

The third is a simple,
convex shape.

Displaying the imported map shows all geometries exactly as it
should. So far so good.

Now, when I run v.info on the imported map, I get:

Number of lines: 0
Number of boundaries: 9
Number of centroids: 5
Number of areas: 9
Number of islands: 9

This completely baffles me!

The GRASS documentation consistently states that an area
is a boundary + a centroid + any number of "islands".
  

That's an error in the documentation. An area is a closed ring of boundaries (possibly just one boundary) + any number of "islands" (holes) within it + *optionally* an attached centroid. An area without a centroid cannot have a category, but as far as topology is concerned, it's a valid area.

Now, assuming that the lines around the four "islands" count
as boundaries, I understand why there are 9 boundaries
altogether. 5 centroids also check out, given that there
is no 1:1 equivalent for a shapefile multipart polygon in GRASS.

But how in the (GRASS) world can there be 9 areas if there
are only 5 centroids?

See above, an area in GRASS topology does not need to have a centroid attached.
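
To make the counts concrete, here is a minimal Python sketch of the example map under the definition above. The `Area` class and its field names are purely illustrative, not GRASS data structures, and the layout (one outer ring plus four hole rings for the first polygon, three rings for the multipart polygon, one for the convex polygon) is only my reading of the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Area:
    """Illustrative stand-in for a topological area (not a GRASS struct)."""
    name: str
    centroid: Optional[int] = None  # id of the attached centroid, None if absent


# One closed boundary ring per area (assumed layout of the example map).
areas = [
    Area("holed polygon, outer ring", centroid=1),
    Area("hole 1"), Area("hole 2"), Area("hole 3"), Area("hole 4"),
    Area("multipart polygon, part 1", centroid=2),
    Area("multipart polygon, part 2", centroid=3),
    Area("multipart polygon, part 3", centroid=4),
    Area("convex polygon", centroid=5),
]

n_areas = len(areas)
n_centroids = sum(a.centroid is not None for a in areas)
n_without_centroid = n_areas - n_centroids

print(n_areas, n_centroids, n_without_centroid)
```

Under these assumptions the four holes are exactly the areas that have no centroid, which is where the difference between the area count and the centroid count comes from.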

And why 9 islands?

Every area is also an island if it shares no boundary with another area. If two areas share a boundary, the two together form one island. In your example, take the area in the upper left with its four islands: the four islands are also areas, just without a centroid attached. When islands are attached during topology building, the internal IDs of all islands falling inside the outer area are added to the topology information of that outer area. Now suppose one of those four inner areas shared a boundary with each of two others, and only the fourth were completely isolated. That thing in the upper left would still consist of five areas (four inside, one outer), but of only three islands: one made up of the three connected areas, one for the remaining isolated inner area, and one for the outer area.
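
The island rule can be read as counting groups of areas connected through shared boundaries. The sketch below does that with a small union-find; the function `count_islands`, the integer area ids, and the pair list are hypothetical and only model connectivity, not how GRASS actually stores or builds islands.

```python
def count_islands(n_areas, shared_boundaries):
    """Count groups of areas that are connected through shared boundaries.

    n_areas: number of areas, identified as 0..n_areas-1 (hypothetical ids).
    shared_boundaries: pairs (a, b) of areas that share a boundary.
    """
    parent = list(range(n_areas))

    def find(x):
        # Follow parent links to the group representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in shared_boundaries:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb  # merge the two groups

    return len({find(i) for i in range(n_areas)})


# The imported example: 9 areas, none of them sharing a boundary.
imported = count_islands(9, [])

# Hypothetical variant of the upper-left polygon only: area 0 is the outer
# area, areas 1-4 lie inside it. Area 2 shares one boundary with area 1 and
# another with area 3; area 4 is isolated.
variant = count_islands(5, [(1, 2), (2, 3)])

print(imported, variant)
```

Note that the inner areas never share a boundary with the outer area in this model, so the outer area always remains an island of its own.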

When building topology, areas and islands are constructed first; at this point islands are not yet attached to areas. Only in the next step are islands attached to areas, i.e. areas get their holes. In the last step, centroids are attached to areas, or more precisely: for each area, a not yet attached centroid is searched for and, if found, attached; if the area already has a centroid attached, the newly found one is a duplicate centroid. Several centroids may also fall inside the current area and only inside this area; the extra ones also become duplicate centroids.
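
The build order can be sketched roughly as follows. This is a toy model, not the GRASS implementation: areas are axis-aligned rectangles `(xmin, ymin, xmax, ymax)`, no boundaries are shared, all helper names are made up, and for simplicity the last step loops over centroids rather than over areas.

```python
def contains_box(outer, inner):
    """True if rectangle `inner` lies strictly inside rectangle `outer`."""
    return (outer[0] < inner[0] and outer[1] < inner[1]
            and inner[2] < outer[2] and inner[3] < outer[3])


def contains_point(box, pt):
    """True if point `pt` lies strictly inside rectangle `box`."""
    return box[0] < pt[0] < box[2] and box[1] < pt[1] < box[3]


def box_size(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def build(boxes, centroids):
    # Step 1 (taken as given): every closed ring is an area and, since
    # nothing shares a boundary here, also an island of its own.
    ids = range(len(boxes))

    # Step 2: attach each island to the smallest area that contains it.
    holes = {i: [] for i in ids}
    for isle in ids:
        outer = [a for a in ids if contains_box(boxes[a], boxes[isle])]
        if outer:
            holes[min(outer, key=lambda a: box_size(boxes[a]))].append(isle)

    # Step 3: attach centroids; a second centroid in an area is a duplicate.
    attached, duplicates = {}, []
    for c, pt in enumerate(centroids):
        inside = [a for a in ids if contains_point(boxes[a], pt)]
        if not inside:
            continue  # a centroid outside every area stays unattached
        area = min(inside, key=lambda a: box_size(boxes[a]))
        if area in attached:
            duplicates.append(c)
        else:
            attached[area] = c
    return holes, attached, duplicates


# Hypothetical layout: one outer square with two square holes.
boxes = [(0, 0, 10, 10), (1, 1, 3, 3), (5, 5, 8, 8)]
centroids = [(4, 4), (9, 1)]  # both lie in the outer area, outside the holes
holes, attached, duplicates = build(boxes, centroids)
print(holes, attached, duplicates)
```

Here both holes end up attached to the outer area, the first centroid is attached to it, and the second one is reported as a duplicate, while the two holes stay valid areas without a centroid.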

AFAICT, GRASS vector topology is largely based on the simple feature specifications, but not strictly: it deviates here and there in its terminology and in the methods used to build topology. The methods are not a problem; they are consistent even though they do not follow the simple feature specifications 100%. The terminology, however, can be confusing, particularly with misleading documentation and sometimes different meanings in closely related applications (QGIS).

Markus M

Well, that clarifies it (finally)! Thanks very much for taking
the time to write up all this detail. It's much appreciated.

Ben

----- Original Message -----
From: "Markus Metz" <markus.metz.giswork@googlemail.com>
To: "Benjamin Ducke" <benjamin.ducke@oxfordarch.co.uk>
Cc: "GRASS developers list" <grass-dev@lists.osgeo.org>
Sent: Thursday, September 24, 2009 9:04:33 AM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [GRASS-dev] Some doubts about GRASS topology


Benjamin Ducke wrote:

Well, that clarifies it (finally)! Thanks very much for taking
the time to write up all this detail. It's much appreciated.

Ben

Hi Ben.
Would you mind documenting this a bit?
It would be good to have in the official grass-doc.
Thanks.
--
Paolo Cavallini: http://www.faunalia.it/pc

Hi Paolo,

I am working on it, but will be busy with other things
for the next two or three weeks. Hopefully towards the
end of the month I will be able to focus on GRASS and
its vector model again.

I will post something on the Wiki as soon as I feel
that I have fully grasped all details. We can then
discuss the text and add it to the official GRASS
docs once its quality is good enough.

Cheers,

Ben

----- Original Message -----
From: "Paolo Cavallini" <cavallini@faunalia.it>
To: "Benjamin Ducke" <benjamin.ducke@oxfordarch.co.uk>
Cc: "GRASS developers list" <grass-dev@lists.osgeo.org>
Sent: Thursday, September 24, 2009 2:11:15 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: [GRASS-dev] Some doubts about GRASS topology
