[GRASS5] [bug #4075] (grass) grass and postgis issue

this bug's URL: http://intevation.de/rt/webrt?serial_num=4075
-------------------------------------------------------------------------

Subject: grass and postgis issue

Platform: Mac OSX
grass obtained from: Trento Italy site
grass binary for platform: Compiled from Sources
GRASS Version: cvs 2006

I am working with postgis tables and GRASS and have found what may be a bug.

If I try and read a postgis table (with geometry columns and spatial indexing into GRASS, I get a new
vector but the associated postgis table has no geometry or spatial indexing. For exmaple:

postgis output from Tyler Ms postgis example in "Web Mapping":

project1=# \d countyp020
                                   Table "public.countyp020"
   Column | Type | Modifiers
------------+-----------------------
+----------------------------------------------------------
gid | integer | not null default nextval('countyp020_gid_seq'::regclass)
area | double precision |
perimeter | double precision |
countyp020 | bigint |
state | character varying(2) |
county | character varying(50) |
fips | character varying(5) |
state_fips | character varying(2) |
square_mil | double precision |
the_geom | geometry |
Indexes:
    "countyp020_pkey" PRIMARY KEY, btree (gid)
    "idx_countyp020_geo" gist (the_geom)
    "idx_countyp020_gid" btree (gid)
Check constraints:
    "enforce_dims_the_geom" CHECK (ndims(the_geom) = 2)
    "enforce_geotype_the_geom" CHECK (geometrytype(the_geom) = 'MULTIPOLYGON'::text OR
the_geom IS NULL)
    "enforce_srid_the_geom" CHECK (srid(the_geom) = -1)

However, reading the table back into GRASS with either pgsql2shp, then v.in.ogr on the shapefile, or
directly into GRASS with v.in.ogr on the postgis table, results in a new postgis table that looks like
this:

project1=# \d countyp020_postgis
        Table "public.countyp020_postgis"
   Column | Type | Modifiers
------------+------------------------+-----------
cat | integer |
gid | integer |
area | double precision |
perimeter | double precision |
countyp020 | integer |
state | character varying(255) |
county | character varying(255) |
fips | character varying(255) |
state_fips | character varying(255) |
square_mil | double precision |
Indexes:
    "countyp020_postgis_cat" UNIQUE, btree (cat)

or:

project1=# \d countyp020_grass
        Table "public.countyp020_grass"
   Column | Type | Modifiers
------------+-----------------------+-----------
cat | integer |
area | double precision |
perimeter | double precision |
countyp020 | double precision |
state | character varying(2) |
county | character varying(50) |
fips | character varying(5) |
state_fips | character varying(2) |
square_mil | double precision |
Indexes:
    "countyp020_grass_cat" UNIQUE, btree (cat)

Notice the lack of geometry and idexes in both posgis tables. Also, the difference in type lengths
(character 5 vs 250 for fips).

Additionally, if I add the geometry and idexing in posgis (once it has been read into GRASS and is
therefor connected to the GRASS vector with bd.connect, and then try and copy or rename the vector
with g.copy or g.rename, I get the following error:

g.rename vect=mn_btvegpt_byspp,mn_btvegpt

RENAME [mn_btvegpt_byspp] to [mn_btvegpt]
WARNING: pg driver: column 'the_geometry', type 16651 is not supported
WARNING: pg driver: column 'the_geometry', type 16651 is not supported

-------------------------------------------- Managed by Request Tracker

On Thu, February 9, 2006 20:11, Request Tracker wrote:

this bug's URL: http://intevation.de/rt/webrt?serial_num=4075
-------------------------------------------------------------------------

Subject: grass and postgis issue

Platform: Mac OSX
grass obtained from: Trento Italy site
grass binary for platform: Compiled from Sources
GRASS Version: cvs 2006

I am working with postgis tables and GRASS and have found what may be a
bug.

If I try and read a postgis table (with geometry columns and spatial
indexing into GRASS, I get a new
vector but the associated postgis table has no geometry or spatial
indexing.

GRASS doesn't need the geometry since it is stored in the GRASS format, so
why should it keep this column. What is it you want to do exactly: import
a postgis map into GRASS, or view a postgis map with GRASS ? See
v.external it it is the latter.

As for indexing, this obviously first of all depends on the database
backend that you use for GRASS maps (e.g. the dbf backend wouldn't support
indexing). You seem to use postgresql as backend, so indexing would be
possible. However, most of the indexes are on the geometry column, so they
would not be needed for a map in GRASS format. Why do you want these
indexes ?

Moritz

For exmaple:

postgis output from Tyler Ms postgis example in "Web Mapping":

project1=# \d countyp020
                                   Table "public.countyp020"
   Column | Type | Modifiers
------------+-----------------------
+----------------------------------------------------------
gid | integer | not null default
nextval('countyp020_gid_seq'::regclass)
area | double precision |
perimeter | double precision |
countyp020 | bigint |
state | character varying(2) |
county | character varying(50) |
fips | character varying(5) |
state_fips | character varying(2) |
square_mil | double precision |
the_geom | geometry |
Indexes:
    "countyp020_pkey" PRIMARY KEY, btree (gid)
    "idx_countyp020_geo" gist (the_geom)
    "idx_countyp020_gid" btree (gid)
Check constraints:
    "enforce_dims_the_geom" CHECK (ndims(the_geom) = 2)
    "enforce_geotype_the_geom" CHECK (geometrytype(the_geom) =
'MULTIPOLYGON'::text OR
the_geom IS NULL)
    "enforce_srid_the_geom" CHECK (srid(the_geom) = -1)

However, reading the table back into GRASS with either pgsql2shp, then
v.in.ogr on the shapefile, or
directly into GRASS with v.in.ogr on the postgis table, results in a new
postgis table that looks like
this:

project1=# \d countyp020_postgis
        Table "public.countyp020_postgis"
   Column | Type | Modifiers
------------+------------------------+-----------
cat | integer |
gid | integer |
area | double precision |
perimeter | double precision |
countyp020 | integer |
state | character varying(255) |
county | character varying(255) |
fips | character varying(255) |
state_fips | character varying(255) |
square_mil | double precision |
Indexes:
    "countyp020_postgis_cat" UNIQUE, btree (cat)

or:

project1=# \d countyp020_grass
        Table "public.countyp020_grass"
   Column | Type | Modifiers
------------+-----------------------+-----------
cat | integer |
area | double precision |
perimeter | double precision |
countyp020 | double precision |
state | character varying(2) |
county | character varying(50) |
fips | character varying(5) |
state_fips | character varying(2) |
square_mil | double precision |
Indexes:
    "countyp020_grass_cat" UNIQUE, btree (cat)

Notice the lack of geometry and idexes in both posgis tables. Also, the
difference in type lengths
(character 5 vs 250 for fips).

Additionally, if I add the geometry and idexing in posgis (once it has
been read into GRASS and is
therefor connected to the GRASS vector with bd.connect, and then try and
copy or rename the vector
with g.copy or g.rename, I get the following error:

g.rename vect=mn_btvegpt_byspp,mn_btvegpt

RENAME [mn_btvegpt_byspp] to [mn_btvegpt]
WARNING: pg driver: column 'the_geometry', type 16651 is not supported
WARNING: pg driver: column 'the_geometry', type 16651 is not supported

-------------------------------------------- Managed by Request Tracker

_______________________________________________
grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5

On Feb 9, 2006, at 1:59 PM, Moritz Lennert wrote:

GRASS doesn’t need the geometry since it is stored in the GRASS format, so

why should it keep this column. What is it you want to do exactly: import

a postgis map into GRASS, or view a postgis map with GRASS ? See

v.external it it is the latter.

As for indexing, this obviously first of all depends on the database

backend that you use for GRASS maps (e.g. the dbf backend wouldn’t support

indexing). You seem to use postgresql as backend, so indexing would be

possible. However, most of the indexes are on the geometry column, so they

would not be needed for a map in GRASS format. Why do you want these

indexes ?

Moritz

Thanks for the reply Moritz. This all started out as a “can I speed this querry up” exercise. Perhaps the fundamental issue is my not completely understanding how GRASS makes use of the postgis extension to postgresql. You are correct in that I am using psotgresql for database management. I have an attribute table of little over a million lines of data in a postgres table that is associated with a grass vector. The table contains point data based on simple xy utm coordinates.

This allows me to run a query like:

d.vect vector_name size=2 color=blue where=“species=‘WP’ AND diam>40”

I would guess that it takes d.vect about a minute to draw the results of this query. I have never been able to quite figure out how GRASS works with the postgis extension. All I was trying to do was speed things up a bit, so I added the postgis geometry and spatial indexing to the posgresql table. When I discovered that GRASS then didn’t behave as I expected (g.copy, g.rename), I thought that there were three possible explanations: 1. a bug, 2. I didn’t have postgis configured properly, 3. (most likely) user error, or user not understanding how something was supposed to work.

Any enlightenment in that last area is much appreciated.

Kirk

Hi Kirk,

On Thu, February 9, 2006 21:23, Kirk R. Wythers wrote:

On Feb 9, 2006, at 1:59 PM, Moritz Lennert wrote:

GRASS doesn't need the geometry since it is stored in the GRASS
format, so
why should it keep this column. What is it you want to do exactly:
import
a postgis map into GRASS, or view a postgis map with GRASS ? See
v.external it it is the latter.

As for indexing, this obviously first of all depends on the database
backend that you use for GRASS maps (e.g. the dbf backend wouldn't
support
indexing). You seem to use postgresql as backend, so indexing would be
possible. However, most of the indexes are on the geometry column,
so they
would not be needed for a map in GRASS format. Why do you want these
indexes ?

Moritz

Thanks for the reply Moritz. This all started out as a "can I speed
this querry up" exercise. Perhaps the fundamental issue is my not
completely understanding how GRASS makes use of the postgis extension
to postgresql. You are correct in that I am using psotgresql for
database management. I have an attribute table of little over a
million lines of data in a postgres table that is associated with a
grass vector. The table contains point data based on simple xy utm
coordinates.

This allows me to run a query like:

d.vect vector_name size=2 color=blue where="species='WP' AND diam>40"

I would guess that it takes d.vect about a minute to draw the results
of this query. I have never been able to quite figure out how GRASS
works with the postgis extension. All I was trying to do was speed
things up a bit, so I added the postgis geometry and spatial indexing
to the posgresql table. When I discovered that GRASS then didn't
behave as I expected (g.copy, g.rename), I thought that there were
three possible explanations: 1. a bug, 2. I didn't have postgis
configured properly, 3. (most likely) user error, or user not
understanding how something was supposed to work.

Any enlightenment in that last area is much appreciated.

I will probably be stating the obvious here , but just for the archives:
Your vector data can be stored in different formats: internal GRASS
format, postgis, shape file, etc. GRASS can use these formats in two
different ways: either you import a non-GRASS format file into GRASS
format (v.in.ogr) or you can view it with v.external, which doesn't import
the file, but makes it usable for grass commands.

So, if you import into GRASS vector format, the geometry is stored in the
GRASS format, even if the data is in a postgres table (geometry and data
are handled separately in GRASS. This means that it is no longer a postgis
vector layer, but a GRASS vector layer. Postgis columns in the table or
spatial indices will have no impact as the geometry is not in postgres.
So, adding the geometry column and indices to your postgres data table
will not speed up the process of displaying the map, as d.vect only calls
upon the geometry which is stored in the internal GRASS vector format.

In your case, it would be interesting to see whether the delay of drawing
the results of your d.vect command is due to the geometry, or to the where
clause which doesn't touch upon geometry. The first thing I would try is
to create indices (with the command CREATE INDEX in psql) within your
postgres data table on the columns you query for, i.e. species and diam
(and possibly the combination of both if you will often query for the two
together).

If this doesn't help, then it might be d.vect which is "slow" (a million
points is a large amount of data...) and in order to speed things up it
would be necessary to check whether there is any way to optimise d.vect.
But this is far out of my field of competence.

Hope this helps,

Moritz

Hi Moritz,

On Feb 9, 2006, at 2:50 PM, Moritz Lennert wrote:

So, if you import into GRASS vector format, the geometry is stored in the

GRASS format, even if the data is in a postgres table (geometry and data

are handled separately in GRASS. This means that it is no longer a postgis

vector layer, but a GRASS vector layer. Postgis columns in the table or

spatial indices will have no impact as the geometry is not in postgres.

This may be a silly question… but what then, is the point of building GRASS with postgis support? If postgis is really only an extension that adds geometry column and spatial indexing options to postgresql, and GRASS does not use the posgis geometry, then why bother with postgis?

Perhaps the answer to my question is that I do see a speed up with spatial queries run from GRASS through db.execute (or when I run them on the database with postgreql). You were right btw Moritz, the slow response to the command d.vect seems to be at due to the time it takes d.vect to ‘draw’. The same query run in postgis is quite fast. Also, if I make the query more selective so that fewer data points get returned. d.vect returns faster (I think because there is less to draw). Does that make sense to you?

Kirk

Kirk R. Wythers wrote:

Hi Moritz,

On Feb 9, 2006, at 2:50 PM, Moritz Lennert wrote:

So, if you import into GRASS vector format, the geometry is stored in the

GRASS format, even if the data is in a postgres table (geometry and data

are handled separately in GRASS. This means that it is no longer a postgis

vector layer, but a GRASS vector layer. Postgis columns in the table or

spatial indices will have no impact as the geometry is not in postgres.

This may be a silly question... but what then, is the point of building GRASS with postgis support? If postgis is really only an extension that adds geometry column and spatial indexing options to postgresql, and GRASS does not use the posgis geometry, then why bother with postgis?

What do you mean by building "with postgis support" ? Postgis is supported via ogr. Postgresql support, however, is necessary to be able to use postgresql tables as database backend.

I think you shouldn't forget that GRASS is not a cartography program but a full GIS with its own geometry format and data handling. So, using postgis layers in GRASS doesn't really make sense, unless it is for the occasional situation where you need to access.

One disadvantage of the postgis format is that it obliges you to install a postgresql server...The standard format in GRASS (with its own geometry format plus dbf and possibly sqlite as database backend) does not impose the need for a full-fledged database server.

Perhaps the answer to my question is that I do see a speed up with spatial queries run from GRASS through db.execute (or when I run them on the database with postgreql).

What exactly do you mean by spatial queries ? The query you mentioned before (i.e. d.vect vector_name size=2 color=blue where="species='WP' AND diam>40") is not a "spatial query" in my defintion, but a simple query over the data. A spatial query would imply things like "all points within a certain distance of X", or "all points within area Y", etc.

You were right btw Moritz, the slow response to the command d.vect seems to be at due to the time it takes d.vect to 'draw'.

Does this mean that you created indices on the variables you query for in your postgres data table linked to the GRASS vector layer (such as species and diam in the above query) and this doesn't make a difference ?

The same query run in postgis is quite fast.

Actually you mean postgresql, since the above query doesn't use any postgis features if I am not mistaking.

In addition, you did not display a map, or ?
How does running your query through db.select or v.db.select compare to postgresql ? That shouldn't be any different unless the GRASS postgresql introduces some slowing effects.

Also, if I make the query more selective so that fewer data points get returned. d.vect returns faster (I think because there is less to draw). Does that make sense to you?

Obviously displaying less features will take less time. But it might also be that the query runs quicker. Try to compare d.vect with v.db.select and see.

BTW, I don't think that this should be considered a bug. So the bug report can be closed and we should continue this discussion only on the list.

Moritz

On Feb 10, 2006, at 9:42 AM, Moritz Lennert wrote:

What do you mean by building "with postgis support" ? Postgis is supported via ogr. Postgresql support, however, is necessary to be able to use postgresql tables as database backend.

For some reason I was thinking that postgis was a config option on GRASS (like postgres or mysql). I was wrong. I need to start thinking about postgis as simply the extension (that it is) to postgres.

I think you shouldn't forget that GRASS is not a cartography program but a full GIS with its own geometry format and data handling. So, using postgis layers in GRASS doesn't really make sense, unless it is for the occasional situation where you need to access.

One disadvantage of the postgis format is that it obliges you to install a postgresql server...

That is correct. But once the server is up and running, it seems to be quite trouble free. I have not looked closely at sqlite. I know that others here use sqlite and perhaps that the way to go in the future. The choice of postgres had more to do with wanting more extensive sql options than are available in dbf (additionaly, I already had it compiled and running on my machine). Also I find pgadmin3 a nice gui option for atrribute editing.

The standard format in GRASS (with its own geometry format plus dbf and possibly sqlite as database backend) does not impose the need for a full-fledged database server.

Perhaps the answer to my question is that I do see a speed up with spatial queries run from GRASS through db.execute (or when I run them on the database with postgreql).

What exactly do you mean by spatial queries ?The query you mentioned before (i.e. d.vect vector_name size=2 color=blue where="species='WP' AND diam>40") is not a "spatial query" in my defintion, but a simple query over the data. A spatial query would imply things like "all points within a certain distance of X", or "all points within area Y", etc.

You were right btw Moritz, the slow response to the command d.vect seems to be at due to the time it takes d.vect to 'draw'.

Does this mean that you created indices on the variables you query for in your postgres data table linked to the GRASS vector layer (such as species and diam in the above query)

correct

and this doesn't make a difference ?

not much of a difference. It is noticeable, but more so when the query is run in postgres (without the drawing overhead of d.vect). I would guess that it takes postgres 3 seconds to return the where="species='WP' AND diam>40 query without in indexing (and instantaneously with indexing), whereas it takes d.vect about 40 seconds to display the results of the query on a map without indexing and maybe 35 seconds with indexing.

The same query run in postgis is quite fast.

Actually you mean postgresql, since the above query doesn't use any postgis features if I am not mistaking.

In addition, you did not display a map, or ?
How does running your query through db.select or v.db.select compare to postgresql ? That shouldn't be any different unless the GRASS postgresql introduces some slowing effects.

You're right.. running the query where="species='WP' AND diam>40 from v.db.select seems to be about the same speed as running the query from postgres

Also, if I make the query more selective so that fewer data points get returned. d.vect returns faster (I think because there is less to draw). Does that make sense to you?

Obviously displaying less features will take less time. But it might also be that the query runs quicker. Try to compare d.vect with v.db.select and see.

BTW, I don't think that this should be considered a bug. So the bug report can be closed and we should continue this discussion only on the list.

I agree. Thanks for all comments Moritz. I understand the role of postgis vs. postgresql and how they interacts with GRASS much better.

Moritz

_______________________________________________
grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5