[GRASS5] GRASS -> SF

Hi all,

I would like to hear your suggestions on how to represent
GRASS vectors as simple features. This may not be precise, since
I have not studied all the OGC documents; I mean how to represent
GRASS vectors in free GIS software that uses OGC specifications,
for example OGR, PostGIS, QGIS.

How should the GRASS features in one vector be divided into appropriate layers?

Note that features in a GRASS vector may have attributes
in different tables or may have no attributes at all. Boundaries
form areas, but it may happen that some boundaries are not closed
(such boundaries would not appear in a polygon layer).
Boundaries may have attributes. All types may be mixed in one vector.

Simply put, a GRASS vector is a jungle, and I am looking for a way
to represent this flexible format in software like OGR without causing
too much confusion among users.

Radim

Radim Blazek wrote:

Hi all,

I would like to hear your suggestions on how to represent GRASS vectors as simple features. This may not be precise, since I have not studied all the OGC documents; I mean how to represent GRASS vectors in free GIS software that uses OGC specifications, for example OGR, PostGIS, QGIS.

How should the GRASS features in one vector be divided into appropriate layers?

Note that features in a GRASS vector may have attributes
in different tables or may have no attributes at all. Boundaries form areas, but it may happen that some boundaries are not closed
(such boundaries would not appear in a polygon layer).
Boundaries may have attributes. All types may be mixed in one vector.

Simply put, a GRASS vector is a jungle, and I am looking for a way to represent this flexible format in software like OGR without causing
too much confusion among users.

Radim,

I skimmed the following document to try to further understand the current
vector data model, but I found it unhelpful in the area of topology.

   http://freegis.org/cgi-bin/viewcvs.cgi/~checkout~/grass51/doc/vector/vector.html#topo

I'm not sure if you are primarily interested in approaches that should
be taken when exporting to a simple features type format, or with how "live
access" should be provided to simple features applications, as you might do
in an OGR driver for GRASS vector format for instance.

As I see it there are a few issues:

1) Q: Should different geometry types within a GRASS layer be separated into
    different layers for simple features purposes?

    A: The simple features data model supports a concept of layers that may
    contain all geometry types. Therefore, I don't think you should split
    things up into different layers by geometry type by default. This would
    be a useful option in an export program though, as many formats (e.g.
    shapefiles) have a restriction allowing only one geometry type per layer.

2) Q: Should only the category attribute be considered part of the feature,
    or also the other fields from tables that can be referenced by the category?

    A: I'm not clear about whether vector layers include persistent references
    to the table(s) the category should be used to reference, or if this linkage
    info is only provided on a command-by-command basis. If there is a persistent
    linkage then I think that the default should be to join all attributes into
    the feature based on the category id(s). On export both modes should be
    supported, with the default being to join if the linkage is known.

3) Q: How should topological relationships be preserved?

    A: I am still unclear how this is handled in GRASS. Are areas represented
    as boundary objects in GRASS? Do these boundaries include the whole edge
    geometry, or just a reference to arcs (GV_LINES) that make up the boundary?
    In general I think you should export GV_BOUNDARY objects as whole polygons,
    even if it means collecting up arcs to form the polygon geometry. Topological
    relationships should be preserved to the extent possible as attributes on
    the features. I am unaware of any really widely accepted convention for this.

    We discussed options for special topology support in OGR some time ago (last
    summer?) but I haven't done anything about it at this time.
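
Point 2 above (joining attributes onto features by category, with a non-joined mode also supported) can be sketched roughly as follows. This is only an illustration with hypothetical in-memory data; `features`, `attribute_table` and `to_simple_features` are invented names, not GRASS or OGR API.

```python
# Hypothetical stand-ins for a GRASS vector and its linked table.
features = [
    {"type": "point", "cat": 1, "geom": (10.0, 20.0)},
    {"type": "line", "cat": 2, "geom": [(0.0, 0.0), (5.0, 5.0)]},
    {"type": "line", "cat": None, "geom": [(1.0, 1.0), (2.0, 3.0)]},  # no category
]
attribute_table = {
    1: {"name": "well", "depth": 12.5},
    2: {"name": "road", "depth": None},
}


def to_simple_features(features, table=None, join=True):
    """Return features with fields; join table columns by category if asked."""
    out = []
    for f in features:
        fields = {"cat": f["cat"]}
        # Join only when a linked table is known and the category is in it.
        if join and table is not None and f["cat"] in table:
            fields.update(table[f["cat"]])
        out.append({"geom": f["geom"], "fields": fields})
    return out


joined = to_simple_features(features, attribute_table)
cat_only = to_simple_features(features, attribute_table, join=False)
```

Features without a category simply keep an empty `cat` field in this sketch; a real driver would have to decide how to expose them.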

I may have missed it, but I would like to see a GRASS Vector Data Model document
prepared. This might possibly be a part of the vector format document, though
ideally as a part of a users manual with references off to a format document.

Best regards,

--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up | Frank Warmerdam, warmerdam@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | Geospatial Programmer for Rent

On Tuesday 17 February 2004 19:31, Frank Warmerdam wrote:

Radim Blazek wrote:
> I would like to hear your suggestions on how to represent
> GRASS vectors as simple features ....

Radim,

I skimmed the following document to try to further understand the current
vector data model, but I found it unhelpful in the area of topology.

http://freegis.org/cgi-bin/viewcvs.cgi/~checkout~/grass51/doc/vector/vector.html#topo

In the past 4 years, you are probably only the second person interested in that,
so I have not spent much time documenting the data model ...

I'm not sure if you are primarily interested in approaches that should
be taken when exporting to a simple features type format, or with how "live
access" should be provided to simple features applications, as you might do
in an OGR driver for GRASS vector format for instance.

I am interested in "live access". Export is handled by v.out.ogr,
which can be changed later without problems, but the layer names used for
live access, and their meaning, should not change frequently.

As I see it there are a few issues:

1) Q: Should different geometry types within a GRASS layer be separated
into different layers for simple features purposes?

    A: The simple features data model supports a concept of layers that may
    contain all geometry types. Therefore, I don't think you should split
    things up into different layers by geometry type by default. This
would be a useful option in an export program though, as many formats (e.g.
shapefiles) have a restriction allowing only one geometry type per layer.

That is true, but I worry that some (many?) applications support only layers
of a certain type (point, line, polygon) and will not work with a mixture
of types. Often this is logical; everything is much simpler if the legend
has just one symbol, etc.
Does anybody know a GIS viewer which can display more types in one layer?

2) Q: Should only the category attribute be considered part of the feature,
    or also the other fields from tables that can be referenced by the
category?

    A: I'm not clear about whether vector layers include persistent
references to the table(s) the category should be used to reference, or if
this linkage info is only provided on a command-by-command basis. If there
is a persistent linkage then I think that the default should be to join all
attributes into the feature based on the category id(s). On export both
modes should be supported, with the default being to join if the linkage
is known.

The link to the table is permanent and is stored in the 'dbln' file in the vector
directory. Tables are considered to be a part of the vector and g.remove,
for example, deletes the linked tables of the vector.
Yes, attributes must be joined with geometry.

3) Q: How should topological relationships be preserved?

    A: I am still unclear how this is handled in GRASS. Are areas
represented as boundary objects in GRASS? Do these boundaries include the
whole edge geometry, or just a reference to arcs (GV_LINES) that make up
the boundary? In general I think you should export GV_BOUNDARY objects as
whole polygons, even if it means collecting up arcs to form the polygon
geometry. Topological relationships should be preserved to the extent
possible as attributes on the features. I am unaware of any really widely
accepted convention for this.

A GV_BOUNDARY contains geometry and is used to build areas.
A GV_LINE cannot form an area.
I agree that whole polygons must be available, but boundaries and centroids
in raw form should be available as well (for import into another topological GIS,
or to display errors in the data, such as a boundary that is not closed).
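
As an illustration of why raw boundaries are useful for showing errors, here is a minimal sketch that finds the loose ends of boundaries that do not close. It assumes boundaries are plain coordinate lists and that nodes match exactly; `dangling_nodes` is a hypothetical helper, not a GRASS function.

```python
from collections import Counter


def dangling_nodes(boundaries):
    """Return end nodes touched by exactly one boundary end (loose ends)."""
    degree = Counter()
    for coords in boundaries:
        degree[coords[0]] += 1   # start node
        degree[coords[-1]] += 1  # end node
    return sorted(node for node, n in degree.items() if n == 1)


closed_square = [
    [(0, 0), (1, 0)],
    [(1, 0), (1, 1)],
    [(1, 1), (0, 1)],
    [(0, 1), (0, 0)],
]
open_square = closed_square[:3]  # fourth side missing, so no area can form

loose_ends = dangling_nodes(open_square)
```

A viewer with access to the raw boundary layer could highlight such nodes, which a polygon-only layer would silently hide.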

Radim

Chris G. Nicholas wrote:
> If anyone expects this to be more than academic, i.e. in
> production within mission-critical government and industry
> applications, GRASS had better support a 'live link' to a real
> database, like PostGIS.

Chris,

I would contend that lots of real production work is done without
databases, but nevertheless you are correct that the ability to
hold data in a spatial database is important, and I am pretty sure
it is already implemented in the new GRASS vector architecture, though
my vague recollection is that there are performance issues with the
way that PostGIS is currently used from GRASS. I'm not sure how true
that is.

Radim Blazek wrote:

In the past 4 years, you are probably only the second person interested in that, so I have not spent much time documenting the data model ...

I sympathize, but I still think it will be valuable if you want people
to be able to take advantage of the new vector capabilities effectively
(and without undue hand holding).

That is true, but I worry that some (many?) applications support only layers
of a certain type (point, line, polygon) and will not work with a mixture
of types. Often this is logical; everything is much simpler if the legend has just one symbol, etc.

Does anybody know a GIS viewer which can display more types in one layer?

I agree that this is a common restriction, but by no means universal.
OpenEV and PCI's viewers support heterogeneous geometries in a single
layer. CAD programs (e.g. MicroStation) generally support diverse geometries
in a layer. I think MapInfo does as well; at least .tab files can mix
geometry types in a layer. Of course, in use, even then people will often
only keep a single type of geometry in a given layer.

I think it would be helpful if the type(s) of geometries present in a layer
were readily available to applications linking to GRASS data. That would
make it easier for them to choose to describe a given layer as multiple
layers of segregated geometry types without having to always assume all types
are present. In OGR this is partially accomplished by data sources setting
the geometry type for a layer if it is restricted.
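
A rough sketch of that idea: report a single geometry type when a layer is restricted to one, and a generic "unknown" otherwise (loosely mirroring how OGR uses wkbUnknown for unrestricted layers). The feature dictionaries and `layer_geometry_type` are hypothetical, for illustration only.

```python
def layer_geometry_type(features):
    """Return the one geometry type present in the layer, else 'unknown'."""
    types = {f["type"] for f in features}
    if len(types) == 1:
        return next(iter(types))
    return "unknown"  # mixed (or empty) layer: the client must expect anything


roads = [{"type": "line"}, {"type": "line"}]
mixed = [{"type": "point"}, {"type": "line"}, {"type": "boundary"}]

roads_type = layer_geometry_type(roads)
mixed_type = layer_geometry_type(mixed)
```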

A GV_BOUNDARY contains geometry and is used to build areas. A GV_LINE cannot form an area. I agree that whole polygons must be available, but boundaries and centroids
in raw form should be available as well (for import into another topological GIS,
or to display errors in the data, such as a boundary that is not closed).

I think I still don't get how the boundaries work. In a well formed polygon
layer, is there one closed boundary per polygon feature? If there are centroids,
how are they associated with their polygon?

Best regards,
--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up | Frank Warmerdam, warmerdam@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush | Geospatial Programmer for Rent

On Wednesday 18 February 2004 18:52, Frank Warmerdam wrote:

Radim Blazek wrote:
> In past 4 years, you are probably the second person interested in that,
> so I don't spend much time documenting data model ...

I sympathize, but I still think it will be valuable if you want people
to be able to take advantage of the new vector capabilities effectively
(and without undue hand holding).

The GRASS data model is briefly described here:
http://www.ing.unitn.it/~grass/conferences/GRASS2002/proceedings/proceedings/pdfs/Blazek_Radim.pdf

I think that for effective use of GRASS vectors it is important to know that:
1) geometry and attributes are stored separately
   (don't read both unless it is necessary, and usually it is not)
2) the format is topological (areas are built from boundaries)

The GRASS model isn't too complex and I am not the only one who can describe it.
I would welcome a document written by a native English speaker or simply
by anyone who can describe it clearly in English. I can assist.

> That is true, but I worry, that some (many?) applications support only
> layers of certain type (point,line,polygon) and they will not work with
> mixture of types. Often it is logical, everything is much simpler if
> legend has just one symbol etc.
>
> Does anybody know a GIS viewer which can display more types in one layer?

I agree that this is a common restriction, but by no means universal.
OpenEV and PCI's viewers support heterogeneous geometries in a single
layer. CAD programs (e.g. MicroStation) generally support diverse
geometries in a layer. I think MapInfo does as well; at least .tab files
can mix geometry types in a layer. Of course, in use, even then people
will often only keep a single type of geometry in a given layer.

I think it would be helpful if the type(s) of geometries present in a layer
were readily available to application linking to GRASS data. That would
make it easier for them to choose to describe a given layer as multiple
layers of segregated geometry types without having to always assume all
types are present. In OGR this is partially accomplished by data sources
setting the geometry type for a layer if it is restricted.

How can the geometry type be set in OGR? I see only SetAttributeFilter
and SetSpatialFilter.

What you suggest may be a solution if an application knows about that possibility
and supports it. But it may happen that it simply refuses to open
anything other than wkbPoint, wkbLineString, wkbPolygon.

Would it be acceptable to duplicate features in more layers (if necessary)?
?_point
?_line
?_all // contains features from both ?_point and ?_line
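
The naming scheme proposed above could look roughly like this. It is a sketch under assumptions: `derived_layers` and the feature dictionaries are hypothetical, and the `_all` layer is only added when more than one type is present ("if necessary").

```python
def derived_layers(name, features):
    """Group features into <name>_<type> layers, plus <name>_all if mixed."""
    layers = {}
    for f in features:
        layers.setdefault(f"{name}_{f['type']}", []).append(f)
    if len(layers) > 1:
        # Duplicate every feature into a combined layer for clients that
        # can handle mixed geometry types.
        layers[f"{name}_all"] = list(features)
    return layers


mixed = [{"type": "point", "cat": 1}, {"type": "line", "cat": 2}]
layers = derived_layers("roads", mixed)
line_only = derived_layers("roads", [{"type": "line", "cat": 2}])
```

The cost of this design is that features appear twice, so a client opening every layer would draw them twice.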

> GV_BOUNDARY contains geometry and it is used to build areas.
> GV_LINE cannot form an area.
> I agree, that whole polygons must be available, but boundaries and
> centroid in raw form should be available as well (import to another
> topological GIS, display errors in data (boundary is not closed)).

I think I still don't get how the boundaries work. In a well formed
polygon layer, is there one closed boundary per polygon feature?

No, a polygon may be formed by many boundaries (several primitives, but connected).
One boundary is shared by adjacent areas.

+--1--+--5--+
|     |     |
2  A  4  B  6
|     |     |
+--3--+--7--+

1,2,3,4,5,6,7 = 7 boundaries (primitives)
A,B = 2 areas

If there are centroids how are they associated with their polygon?

A centroid is assigned to the area it lies within (geometrically).
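
The two-area example above can be sketched in code. This is not how the GRASS library builds topology; it only assumes the list of boundary ids for each area is already known, chains them into a ring, and assigns a centroid by a point-in-polygon test. All coordinates and function names are hypothetical.

```python
# The 7 boundaries of the diagram, on a 2 x 1 grid of unit squares.
boundaries = {
    1: [(0, 1), (1, 1)], 2: [(0, 0), (0, 1)], 3: [(0, 0), (1, 0)],
    4: [(1, 0), (1, 1)],  # shared by A and B
    5: [(1, 1), (2, 1)], 6: [(2, 0), (2, 1)], 7: [(1, 0), (2, 0)],
}
areas = {"A": [1, 2, 3, 4], "B": [4, 5, 6, 7]}


def ring_from_boundaries(ids):
    """Chain boundaries end to end into one closed ring of coordinates."""
    remaining = [list(boundaries[i]) for i in ids]
    ring = remaining.pop(0)
    while remaining:
        for k, line in enumerate(remaining):
            if line[0] == ring[-1]:
                ring += line[1:]        # same direction
            elif line[-1] == ring[-1]:
                ring += line[-2::-1]    # reversed direction
            else:
                continue
            remaining.pop(k)
            break
        else:
            raise ValueError("boundaries do not form a closed ring")
    if ring[0] != ring[-1]:
        raise ValueError("boundaries do not form a closed ring")
    return ring


def contains(ring, point):
    """Ray-casting point-in-polygon test for a closed ring."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def area_of_centroid(point):
    """Return the name of the area the centroid lies within, or None."""
    for name, ids in areas.items():
        if contains(ring_from_boundaries(ids), point):
            return name
    return None
```

Boundary 4 appears in both id lists, which is the sense in which one boundary is shared by adjacent areas.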

Radim

On Thu, 19 Feb 2004, Radim Blazek wrote:

On Wednesday 18 February 2004 18:52, Frank Warmerdam wrote:
> Radim Blazek wrote:
> > In past 4 years, you are probably the second person interested in that,
> > so I don't spend much time documenting data model ...
>
> I sympathize, but I still think it will be valuable if you want people
> to be able to take advantage of the new vector capabilities effectively
> (and without undue hand holding).

Me too. I store many vectors in Postgis nowadays, and it would be
really cool to be able to use v.digit and other Grass tools on them, as
well as directly accessing Grass vectors from Mapserver using Postgis as
the main storage, i.e. getting rid of all these exports/imports.

...

Would it be acceptable to duplicate features in more layers (if necessary)?
?_point
?_line
?_all // contains features from both ?_point and ?_line

> > GV_BOUNDARY contains geometry and it is used to build areas.
> > GV_LINE cannot form an area.
> > I agree, that whole polygons must be available, but boundaries and
> > centroid in raw form should be available as well (import to another
> > topological GIS, display errors in data (boundary is not closed)).
>
> I think I still don't get how the boundaries work. In a well formed
> polygon layer, is the one closed boundary per polygon feature?

No, a polygon may be formed by many boundaries (several primitives, but connected).
One boundary is shared by adjacent areas.

+--1--+--5--+
|     |     |
2  A  4  B  6
|     |     |
+--3--+--7--+

1,2,3,4,5,6,7 = 7 boundaries (primitives)
A,B = 2 areas

> If there are centroids how are they associated with their polygon?

A centroid is assigned to the area it lies within (geometrically).

What about centroid rules? When I imported (+ rebuilt topology) polygons
from e00 to Postgis I had to drop part of the 'polygons' (actually,
I imported them as lines just like the open ended arcs) because inner
rings touched the outer ring at more than one single point.

Following is an illegal inner ring. In Mapserver it is not possible to
fill A with a color without filling B too.

+---------+
|    A    |
+-----+   |
|     |   |
|  B  |   |
+-----+   |
|         |
+---------+

So, would the new Grass vector format allow this topology, and could flood
fills still be handled correctly?

rgds
Morten Hulden

On Saturday 21 February 2004 01:29, Morten Hulden wrote:

What about centroid rules? When I imported (+ rebuilt topology) polygons
from e00 to Postgis I had to drop part of the 'polygons' (actually,
I imported them as lines just like the open ended arcs) because inner
rings touched the outer ring at more than one single point.

Following is an illegal inner ring. In Mapserver it is not possible to
fill A with a color without filling B too.

+---------+
|    A    |
+-----+   |
|  B  |   |
+-----+   |
|         |
+---------+

So, would the new Grass vector format allow this topology, and could flood
fills still be handled correctly?

??? I don't understand why the example above should not be handled correctly
in GRASS, regardless of version. How is GRASS involved if Mapserver doesn't
do what you want? Did you use GRASS to import e00 to PostGIS?

In GRASS, whenever an 'inner' ring touches the boundary of the outside area, even at
one point, it is no longer an 'inner' ring; it is simply another area.
A and B above can never be exported from GRASS as polygon A with inner ring B,
because there are only 2 areas, A and B, and no island.

A problem could appear if both areas A and B had the same category (attributes).
Then it would be logical to export them as a MultiPolygon. In this case,
however, the MultiPolygon would be incorrect (SF specification), because
"The Boundaries of any 2 Polygons that are elements of a MultiPolygon
  may not cross and may touch at only a finite number of points."
But I am sure that GRASS currently does not export any Multi* features.

BTW, I don't see any reason for that restriction defined for multipolygons
in the SF specification. Does anybody have an idea why MultiPolygon parts cannot
touch along a line?
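
To make the rule concrete, here is a small sketch that checks whether two rings touch along a line, that is, share a boundary stretch of non-zero length. It only covers this one aspect of the SF rule (it does not detect crossing boundaries), and the coordinates for A and B are hypothetical values chosen to match the diagram.

```python
def edges(ring):
    """Consecutive coordinate pairs of a closed ring."""
    return list(zip(ring, ring[1:]))


def cross(o, a, b):
    """Z component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def overlap_along_line(e1, e2):
    """True if two segments are collinear and share more than a point."""
    (p, q), (r, s) = e1, e2
    if cross(p, q, r) != 0 or cross(p, q, s) != 0:
        return False  # not collinear
    # Compare the extents along the dominant axis of the first segment.
    axis = 0 if abs(q[0] - p[0]) >= abs(q[1] - p[1]) else 1
    lo1, hi1 = sorted((p[axis], q[axis]))
    lo2, hi2 = sorted((r[axis], s[axis]))
    return min(hi1, hi2) > max(lo1, lo2)


def touch_along_line(ring_a, ring_b):
    return any(overlap_along_line(e1, e2)
               for e1 in edges(ring_a) for e2 in edges(ring_b))


# A is the outer square minus B; B touches the left side of the square.
ring_a = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 3), (2, 3), (2, 1), (0, 1), (0, 0)]
ring_b = [(0, 1), (2, 1), (2, 3), (0, 3), (0, 1)]

# Two squares meeting only at the corner (1, 1).
sq1 = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
sq2 = [(1, 1), (2, 1), (2, 2), (1, 2), (1, 1)]
```

With these inputs, A and B share three edges, so they could not be combined into one valid MultiPolygon, while the two corner-touching squares could.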

Radim

On Sat, 21 Feb 2004, Radim Blazek wrote:

On Saturday 21 February 2004 01:29, Morten Hulden wrote:
> What about centroid rules? When I imported (+ rebuilt topology) polygons
> from e00 to Postgis I had to drop part of the 'polygons' (actually,
> I imported them as lines just like the open ended arcs) because inner
> rings touched the outer ring at more than one single point.
>
> Following is an illegal inner ring. In Mapserver it is not possible to
> fill A with a color without filling B too.
>
> +---------+
> |    A    |
> +-----+   |
> |  B  |   |
> +-----+   |
> |         |
> +---------+
>
> So, would the new Grass vector format allow this topology, and could flood
> fills still be handled correctly?

??? I don't understand why the example above should not be handled correctly
in GRASS, regardless of version. How is GRASS involved if Mapserver doesn't
do what you want? Did you use GRASS to import e00 to PostGIS?

No, I wrote a Perl script that parsed an e00 file and rebuilt the topology
in Postgis (POLYGON and LINESTRING) format. (MULTIPOLYGON and
MULTILINESTRING did not exist in Postgis at that time). Selected
attributes were extracted in a second pass with e00pg
(http://e00pg.sourceforge.net), with a little Perl parsing of the output.

I am talking about the DCW e00 files. I tried to use m.in.e00 in Grass but
got too many 'Unclosed area, free end or edge inside area' and other
errors, too tedious to clean up afterwards, too many countries to handle, etc.

Since the end purpose was to populate Mapserver with vector maps of many
countries I decided to use the perl approach, cleaning and splitting into
polygons and linestrings in the process, and wrapping everything up in a
shell script so several countries could be imported simultaneously. There
was another advantage: re-projection is not necessary until maps are
rendered, and that is handled by Mapserver 'on-the-fly'.

I also rationalize a bit and use SQL-hierarchy, e.g. dn_py and dn_ln
parent tables contain 'drainage network' polygons and lines, respectively,
and are inherited by dn_py_reg and dn_ln_reg, where 'reg' is the three
letter acronym for the country.

So, for me it would be very useful if I could interface the vector data in
Postgis from both Mapserver and Grass. I use Grass a lot for raster map
generating and v.digitizing, but Grass is becoming a little 'isolated' now
from the vector storage. Maybe there is something in 5.7 that I don't know
about, but I haven't been able to get it to compile yet :(

regards
Morten

PS. my perl script is at http://gis.untamo.net/download/e002pgis.pl and my
mapserver is at http://gis.untamo.net/mapserv

On Sat, 21 Feb 2004, Morten Hulden wrote:

So, for me it would be very useful if I could interface the vector data in
Postgis from both Mapserver and Grass. I use Grass a lot for raster map
generating and v.digitizing, but Grass is becoming a little 'isolated' now
from the vector storage. Maybe there is something in 5.7 that I don't know
about, but I haven't been able to get it to compile yet :(

Only yesterday I managed to compile 5.7 so I got a chance to test Postgis
connectivity. (--with-nls causes errors in lib/raster/io.c, PACKAGE not
defined)

Vectors and attributes in Postgis/Postgres work fine, but areas are very,
very slow to render. As I understand it only lines and points are stored
in Postgis, but topology is rebuilt each time the monitor is redrawn.
That is bound to be slow, especially as the whole map is rebuilt, not only
the visible area.

It should be possible to make it faster, e.g. by using the Postgis BBOX
function to build only visible areas. To store full polygons in Postgis of
course would help too, but that may not be easy to achieve. Maybe through
views?
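
The bounding box idea can be illustrated with a client-side sketch. In practice the database would do this filtering on the server; the pure-Python version below only shows the test itself, and `bbox`, `intersects` and `visible` are hypothetical helper names.

```python
def bbox(coords):
    """Bounding box (xmin, ymin, xmax, ymax) of a coordinate list."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def intersects(a, b):
    """True if two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def visible(lines, view):
    """Keep only the lines whose bounding box intersects the view window."""
    return {i: c for i, c in lines.items() if intersects(bbox(c), view)}


lines = {
    1: [(0, 0), (1, 1)],
    2: [(10, 10), (12, 11)],  # far outside the window
    3: [(4, 4), (6, 6)],      # partly inside
}
in_view = visible(lines, view=(0, 0, 5, 5))
```

Only the boundaries that pass this test would then need to be assembled into areas for the current redraw.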

Retrieving full polygons is very fast, so Postgis itself is not the
problem; retrieving thousands of polygons and lines reprojecting them on
the fly takes only a few seconds in Mapserver.

rgds
Morten

On Monday 23 February 2004 19:31, Morten Hulden wrote:

On Sat, 21 Feb 2004, Morten Hulden wrote:
> So, for me it would be very useful if I could interface the vector data
> in Postgis from both Mapserver and Grass. I use Grass a lot for raster map
> generating and v.digitizing, but Grass is becoming a little 'isolated'
> now from the vector storage. Maybe there is something in 5.7 that I don't
> know about, but I haven't been able to get it to compile yet :(

Only yesterday I managed to compile 5.7 so I got a chance to test Postgis
connectivity. (--with-nls causes errors in lib/raster/io.c, PACKAGE not
defined)

Vectors and attributes in Postgis/Postgres work fine, but areas are very,
very slow to render. As I understand it only lines and points are stored
in Postgis, but topology is rebuilt each time the monitor is redrawn.
That is bound to be slow, especially as the whole map is rebuilt, not only
the visible area.

No, topology is built once when the vector is created. I think the slow
part is reading boundaries by id from PostGIS to compose areas. This is fast
on disk (set the offset and read the coordinates) but slow with a database
(create, send and parse the query, then send and receive the data).

It should be possible to make it faster, e.g. by using the Postgis BBOX
function to build only visible areas. To store full polygons in Postgis of
course would help too, but that may not be easy to achieve. Maybe through
views?

Retrieving full polygons is very fast, so Postgis itself is not the
problem; retrieving thousands of polygons and lines reprojecting them on
the fly takes only a few seconds in Mapserver.

I am not saying that PostGIS is the problem; the problem is that the GRASS data
model is so different from SF.

I think the key problem is how to get line coordinates selected by
line id faster. Usually this is not a problem for PostGIS clients
because the selection is done on the server side. If GRASS loops through
areas, it must read the boundaries of each area by boundary id.
'select ... where id = x' may never be fast enough,
so it must select everything and access the results by tuple number
with PQgetvalue(). But nobody seems to be interested in contributing
to vectors and I have other things to do.
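
The two access patterns can be sketched as follows. This uses Python's sqlite3 module purely as a stand-in for a PostgreSQL connection, with a hypothetical `boundary` table; being in-process, it shows the shape of the two approaches rather than the actual network cost.

```python
import sqlite3

# In-memory stand-in for the database; table and column names are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE boundary (id INTEGER PRIMARY KEY, wkt TEXT)")
conn.executemany(
    "INSERT INTO boundary VALUES (?, ?)",
    [(i, f"LINESTRING({i} 0, {i} 1)") for i in range(1, 8)],
)


def read_per_id(ids):
    """One query per boundary: the pattern described above as too slow."""
    return [
        conn.execute("SELECT wkt FROM boundary WHERE id = ?", (i,)).fetchone()[0]
        for i in ids
    ]


def read_bulk():
    """One query for everything, then in-memory lookups by id."""
    return dict(conn.execute("SELECT id, wkt FROM boundary"))


cache = read_bulk()
area_a = [1, 2, 3, 4]  # boundary ids of one area
from_cache = [cache[i] for i in area_a]
```

With a real client/server database the first function pays a round trip for every boundary of every area, while the second pays once and then indexes the result set, which is the role PQgetvalue() would play in C.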

Another interesting thing would be to write areas as polygons to
a PostGIS table when Vect_build is run or the map is updated on level 2.
That way areas digitised in GRASS would be accessible directly
in PostGIS.

Radim

On Mon, Feb 23, 2004 at 07:31:34PM +0100, Morten Hulden wrote:

On Sat, 21 Feb 2004, Morten Hulden wrote:

> So, for me it would be very useful if I could interface the vector data in
> Postgis from both Mapserver and Grass. I use Grass a lot for raster map
> generating and v.digitizing, but Grass is becoming a little 'isolated' now
> from the vector storage. Maybe there is something in 5.7 that I don't know
> about, but I haven't been able to get it to compile yet :(

Only yesterday I managed to compile 5.7 so I got a chance to test Postgis
connectivity. (--with-nls causes errors in lib/raster/io.c, PACKAGE not
defined)

The --with-nls Makefile bugs are fixed now (otherwise, please report).

Vectors and attributes in Postgis/Postgres works fine, but areas are very,
very slow to render.

Radim updated 'd.vect' two days ago to render only areas in the
current region. It should be faster now (up to 3 times for me).
Please try again.

Markus

Hi,

I have somewhat updated the 5.7 vector document, including comments
from the recent discussion.

In grass51/ CVS:
   doc/vector/vector.html
or
  http://freegis.org/cgi-bin/viewcvs.cgi/~checkout~/grass51/doc/vector/vector.html

My suggestion is to expand/update/clean this document.

Later we can move it into the vector library and doxygenize it
for inclusion into the programmer's manual.

Markus