[GRASS5] Feature-Request: Prompt for PostGIS-connection

Stephan_Holl2 · October 1, 2003, 4:15pm

Dear Developers,

during using PostGIS in GRASS5.7 I stumbled over a clear password entry
using module v.format and PostGIS.
My Postgres-User who acts with PostGIS and GRASS has a Password and in
order to make this connection work I have to enter my passwd in
cleartext.

Would that be possible to prompt for a password, if one is needed when
trying to connect and not exporting it via gg.gisnev.
g.gisenv store=mapset set=GV_FORMAT=POSTGIS
g.gisenv store=mapset
set=GV_PGIS_DATABASE=host=192.168.0.90,dbname=postgis,user=grassuser,pa
ssword=xxx

Thanks
Stephan

--
Stephan Holl

GnuPG Key-ID: 11946A09

18:11:23 up 5:55, 2 users, load average: 0.26, 0.12, 0.12

Radim_Blazek · October 2, 2003, 9:25am

On Wednesday 01 October 2003 18:15, Stephan Holl wrote:

Dear Developers,

during using PostGIS in GRASS5.7

You are realy using PostGIS in GRASS, i.e. geometry in PostGIS, not only
attributes in Postgres? Does it work well? Can you tell us for what
kind of tasks? Server on the same computer or over the network.
Last time we tried to use that, we found it so slow, that we had to switch to
native format.

I stumbled over a clear password entry
using module v.format and PostGIS.
My Postgres-User who acts with PostGIS and GRASS has a Password and in
order to make this connection work I have to enter my passwd in
cleartext.

Would that be possible to prompt for a password, if one is needed when
trying to connect and not exporting it via gg.gisnev.
g.gisenv store=mapset set=GV_FORMAT=POSTGIS
g.gisenv store=mapset
set=GV_PGIS_DATABASE=host=192.168.0.90,dbname=postgis,user=grassuser,pa
ssword=xxx

This problem needs systematic solution. Dbmi drivers are using a hack,
if cannot connect to db, dialog is opened, and user asked for password.
Example: db/drivers/postgres/db.c.
Have you seen that? Try to put attributes to Postgres protected by password
and query a vector. That works only for one connection,
the main problem is, I think, where to store password during the session.

Radim

Stephan_Holl2 · October 2, 2003, 9:55am

Hello Radim,

At Thu, 2 Oct 2003 11:25:11 +0200 Radim Blazek wrote:

On Wednesday 01 October 2003 18:15, Stephan Holl wrote:
> Dear Developers,
>
> during using PostGIS in GRASS5.7

You are realy using PostGIS in GRASS, i.e. geometry in PostGIS, not
only attributes in Postgres? Does it work well? Can you tell us for
what kind of tasks? Server on the same computer or over the network.
Last time we tried to use that, we found it so slow, that we had to
switch to native format.

Yes, I tried using PostGIS for storing geometry and attributes in
postgres.
What I did:
v.format -> set the correct values with password for postGIS.

v.convert in=roads out=roads
it works well and importet the spearfish-roads from the old
vector-format.

In postGIS two tables were created: roads_geom and roads_cat
after that
d.vect roads
was really slow, but....
* PostGIS-Server is not on the same maschine G57 is running, but inside
a 10Mbit-LAN
* PostGIS-Server is 500Mhz-server
Maybe this is a bottleneck?

My purpose was to check, if thuban and/or mapserver can read
GRASS-imported PostGIS-data, but I failed, because geometry-type
LINESTRING is not (yet) supported by thuban.
I realised, that the geometry-column in roads_geom is only called
'geom', but imported data via shp2pgsql results in a geomtry-column
named 'the_geom'.
BTW, it would be nice to rename the 'geom'-column into 'the_geom' to use
postGIS-query's from mapserver more intuitively.

also no entry in the table geometry-columns was made... but maybe I had
not granted this table for update/insert by this postgres user to write
an entry...

This problem needs systematic solution. Dbmi drivers are using a hack,
if cannot connect to db, dialog is opened, and user asked for
password. Example: db/drivers/postgres/db.c.
Have you seen that?

No, I have not seen that. I will give it a try.

Try to put attributes to Postgres protected by
password and query a vector. That works only for one connection,
the main problem is, I think, where to store password during the
session.

I am not familiar with programming such things, but what about creating
an encrypted ENV-variable?

cheers
Stephan

--
Stephan Holl

GnuPG Key-ID: 11946A09

11:36:25 up 2:31, 1 user, load average: 0.00, 0.06, 0.10

Gino_Pirelli · October 2, 2003, 11:07am

I just started to check v.transform... but it works differently by shp2psql (I was evalueting better way to import vectr information in PostGIS)

What i sow is that v.convert dosn't manage GEOMETRY_COLUMN update that is in the OGC standard. I'ts the correct way to import vector data?

Thanks for your reply.

BeSoS Luis (www.riade.net - www.acsys.it)

Stephan Holl wrote:

Hello Radim,

At Thu, 2 Oct 2003 11:25:11 +0200 Radim Blazek wrote:

On Wednesday 01 October 2003 18:15, Stephan Holl wrote:

Dear Developers,

during using PostGIS in GRASS5.7

You are realy using PostGIS in GRASS, i.e. geometry in PostGIS, not
only attributes in Postgres? Does it work well? Can you tell us for
what kind of tasks? Server on the same computer or over the network.
Last time we tried to use that, we found it so slow, that we had to
switch to native format.

Yes, I tried using PostGIS for storing geometry and attributes in
postgres.
What I did:
v.format -> set the correct values with password for postGIS.

v.convert in=roads out=roads
it works well and importet the spearfish-roads from the old
vector-format.

In postGIS two tables were created: roads_geom and roads_cat
after that d.vect roads was really slow, but....
* PostGIS-Server is not on the same maschine G57 is running, but inside
a 10Mbit-LAN
* PostGIS-Server is 500Mhz-server
Maybe this is a bottleneck?

My purpose was to check, if thuban and/or mapserver can read
GRASS-imported PostGIS-data, but I failed, because geometry-type
LINESTRING is not (yet) supported by thuban.
I realised, that the geometry-column in roads_geom is only called
'geom', but imported data via shp2pgsql results in a geomtry-column
named 'the_geom'.
BTW, it would be nice to rename the 'geom'-column into 'the_geom' to use
postGIS-query's from mapserver more intuitively.

also no entry in the table geometry-columns was made... but maybe I had
not granted this table for update/insert by this postgres user to write
an entry...

This problem needs systematic solution. Dbmi drivers are using a hack,
if cannot connect to db, dialog is opened, and user asked for
password. Example: db/drivers/postgres/db.c.
Have you seen that?

No, I have not seen that. I will give it a try.

Try to put attributes to Postgres protected by
password and query a vector. That works only for one connection,
the main problem is, I think, where to store password during the
session.

I am not familiar with programming such things, but what about creating
an encrypted ENV-variable?

cheers Stephan

Radim_Blazek · October 3, 2003, 8:36am

On Thursday 02 October 2003 11:55, Stephan Holl wrote:

d.vect roads
was really slow, but....
* PostGIS-Server is not on the same maschine G57 is running, but inside
a 10Mbit-LAN
* PostGIS-Server is 500Mhz-server
Maybe this is a bottleneck?

No the problem is in GRASS,
1) GRASS uses random access (read line by line number), this is
fast for native format but cannot be for RDBMS, as it must send query for
each line and that is slow especially over network (send query and wait
for respons for every line). I don't know how to solve this.
2) Currently, to read one line, GRASS performs more queries:
- select type, NumPoints(), Dimension()
- BEGIN (because binary cursors must be in transaction, IIRC)
- DECLARE gcursor BINARY CURSOR FOR SELECT ASBINARY( "geom FROM table WHERE id = ...")
- FETCH ALL in gcursor
- CLOSE gcursor
- COMMIT
- SELECT field, cat FROM table WHERE id = ...
this could be reduced, probably return to WKT and parse results in GRASS,
so from 7 queries to 2, and PostGIS in GRASS will be faster in this ratio
(7/2) (?). It was my stupid idea that binary cursor is be faster

I am considering to either remove PostGIS support at all or do not officially
support in 5.8 (or how they call it).

My purpose was to check, if thuban and/or mapserver can read
GRASS-imported PostGIS-data, but I failed, because geometry-type
LINESTRING is not (yet) supported by thuban.
I realised, that the geometry-column in roads_geom is only called
'geom', but imported data via shp2pgsql results in a geomtry-column
named 'the_geom'.

That is OK, there is no rule for geometry column name, AFAIK.

BTW, it would be nice to rename the 'geom'-column into 'the_geom' to use
postGIS-query's from mapserver more intuitively.

For me, it would not be nice, 'the_' is not any information, it is only longer.

also no entry in the table geometry-columns was made... but maybe I had
not granted this table for update/insert by this postgres user to write
an entry...

Yes, GRASS should add record to geometry_columns.

Radim

Stephan_Holl2 · October 3, 2003, 9:10am

Hello Radim,

At Fri, 3 Oct 2003 10:36:57 +0200 Radim Blazek wrote:

> was really slow, but....
> * PostGIS-Server is not on the same maschine G57 is running, but
> inside a 10Mbit-LAN
> * PostGIS-Server is 500Mhz-server
> Maybe this is a bottleneck?

No the problem is in GRASS,
1) GRASS uses random access (read line by line number), this is
fast for native format but cannot be for RDBMS, as it must send query
for each line and that is slow especially over network (send query and
wait for respons for every line). I don't know how to solve this.
2) Currently, to read one line, GRASS performs more queries:
- select type, NumPoints(), Dimension()
- BEGIN (because binary cursors must be in transaction, IIRC)
- DECLARE gcursor BINARY CURSOR FOR SELECT ASBINARY( "geom FROM table
WHERE id = ...")- FETCH ALL in gcursor
- CLOSE gcursor
- COMMIT
- SELECT field, cat FROM table WHERE id = ...
this could be reduced, probably return to WKT and parse results in
GRASS, so from 7 queries to 2, and PostGIS in GRASS will be faster in
this ratio(7/2) (?). It was my stupid idea that binary cursor is be
faster

OK, thanks for clarification.

I am considering to either remove PostGIS support at all or do not
officially support in 5.8 (or how they call it).

So there are no plans to reduce queries mentioned above and parse the
results in GRASS?! I would like to see this feature here to speed up
things.
From my point of view it would be good continuing PostGIS-support in the
future.

> My purpose was to check, if thuban and/or mapserver can read
> GRASS-imported PostGIS-data, but I failed, because geometry-type
> LINESTRING is not (yet) supported by thuban.
> I realised, that the geometry-column in roads_geom is only called
> 'geom', but imported data via shp2pgsql results in a geomtry-column
> named 'the_geom'.

That is OK, there is no rule for geometry column name, AFAIK.

> BTW, it would be nice to rename the 'geom'-column into 'the_geom' to
> use postGIS-query's from mapserver more intuitively.

For me, it would not be nice, 'the_' is not any information, it is
only longer.

RIght, no more information provided with this...

> also no entry in the table geometry-columns was made... but maybe I
> had not granted this table for update/insert by this postgres user
> to write an entry...

Yes, GRASS should add record to geometry_columns.

agreed

cheers
Stephan

--
Stephan Holl

GnuPG Key-ID: 11946A09

10:59:20 up 2:33, 1 user, load average: 0.10, 0.08, 0.02

Radim_Blazek · October 3, 2003, 2:42pm

On Friday 03 October 2003 11:10, Stephan Holl wrote:

> I am considering to either remove PostGIS support at all or do not
> officially support in 5.8 (or how they call it).

So there are no plans to reduce queries mentioned above and parse the
results in GRASS?! I would like to see this feature here to speed up
things.

There are plans, but it does not have high priority. It is maybe better
to release stable version earlier but without all features. The first version was
parsing WKT, so it was already in CVS, binary cursor is my "improvement".

Radim

Martin_Pokorny · November 5, 2003, 11:32pm

On Fri, 3 Oct 2003, Radim Blazek wrote:

On Thursday 02 October 2003 11:55, Stephan Holl wrote:

  d.vect roads was really slow, but.... * PostGIS-Server is not on
  the same maschine G57 is running, but inside a 10Mbit-LAN *
  PostGIS-Server is 500Mhz-server Maybe this is a bottleneck?

No the problem is in GRASS, 1) GRASS uses random access (read line
by line number), this is fast for native format but cannot be for
RDBMS, as it must send query for each line and that is slow
especially over network (send query and wait for respons for every
line). I don't know how to solve this. 2) Currently, to read one
line, GRASS performs more queries: - select type, NumPoints(),
Dimension() - BEGIN (because binary cursors must be in transaction,
IIRC) - DECLARE gcursor BINARY CURSOR FOR SELECT ASBINARY( "geom
FROM table WHERE id = ...") - FETCH ALL in gcursor - CLOSE gcursor -
COMMIT - SELECT field, cat FROM table WHERE id = ... this could be
reduced, probably return to WKT and parse results in GRASS, so from
7 queries to 2, and PostGIS in GRASS will be faster in this ratio
(7/2) (?). It was my stupid idea that binary cursor is be faster

I am considering to either remove PostGIS support at all or do not
officially support in 5.8 (or how they call it).

I've been working on this on and off for a few weekends now, and, in a
prototype implementation, I've been able to speed up PostGIS use, at
least in some cases. My benchmark has been to use v.in.ogr to import
layers from a TIGER file directly to a postgres database (both
attributes and geometry). Before my mods, importing the "pip" layer
(points only) took about 24 minutes for 26303 points. I've been able
to get the time down to around 2 1/2 minutes with various
modifications to read_post.c, write_post.c, and open_post.c, together
with the addition of some PL/pgSQL functions, and small mods to
v.in.ogr/main.c.

Other results are below. For reference, importing the CompleteChain
layer alone previously took about 6 hours, now it's 11 minutes. The
other example, I never had the patience to complete a test, but it
would have been at least 6 hours, and now it's 25 minutes.

I plan on doing more testing, but I need some advice, too (I haven't
done much with GRASS programming before). One tactic I use to speed up
reading is to buffer the results of a single query that encompasses
many features at a time. However, I need to vary the size of the
buffer depending on whether access is largely sequential (such as when
the features are first imported) or random (such as when building
topology). My approach appears to require two, somewhat global
modifications: 1) an abstract buffer size (or "access pattern")
setting function for all db types, and 2) use of this function in
several modules, such as v.in.ogr and v.clean. One possibility is to
set the default buffer size assuming a random access pattern (i.e,
small), and explicitly set the larger buffer only in the v.in.*
modules, but there are others. I don't have a good picture of the
access patterns yet, so it's hard for me to decide. Ideas, clues or
hints, anyone?

Cheers,
Martin

(attachments)

results.txt (1.68 KB)

Radim_Blazek · November 7, 2003, 10:52am

On Thursday 06 November 2003 00:32, Martin Pokorny wrote:

>> d.vect roads was really slow, but.... * PostGIS-Server is not on
>> the same maschine G57 is running, but inside a 10Mbit-LAN *
>> PostGIS-Server is 500Mhz-server Maybe this is a bottleneck?
>
> No the problem is in GRASS, 1) GRASS uses random access (read line
> by line number), this is fast for native format but cannot be for
> RDBMS, as it must send query for each line and that is slow
> especially over network (send query and wait for respons for every
> line). I don't know how to solve this. 2) Currently, to read one
> line, GRASS performs more queries: - select type, NumPoints(),
> Dimension() - BEGIN (because binary cursors must be in transaction,
> IIRC) - DECLARE gcursor BINARY CURSOR FOR SELECT ASBINARY( "geom
> FROM table WHERE id = ...") - FETCH ALL in gcursor - CLOSE gcursor -
> COMMIT - SELECT field, cat FROM table WHERE id = ... this could be
> reduced, probably return to WKT and parse results in GRASS, so from
> 7 queries to 2, and PostGIS in GRASS will be faster in this ratio
> (7/2) (?). It was my stupid idea that binary cursor is be faster
>
> I am considering to either remove PostGIS support at all or do not
> officially support in 5.8 (or how they call it).

I've been working on this on and off for a few weekends now, and, in a
prototype implementation, I've been able to speed up PostGIS use, at
least in some cases. My benchmark has been to use v.in.ogr to import
layers from a TIGER file directly to a postgres database (both
attributes and geometry). Before my mods, importing the "pip" layer
(points only) took about 24 minutes for 26303 points. I've been able
to get the time down to around 2 1/2 minutes with various
modifications to read_post.c, write_post.c, and open_post.c, together
with the addition of some PL/pgSQL functions, and small mods to
v.in.ogr/main.c.

Other results are below. For reference, importing the CompleteChain
layer alone previously took about 6 hours, now it's 11 minutes. The
other example, I never had the patience to complete a test, but it
would have been at least 6 hours, and now it's 25 minutes.

I plan on doing more testing, but I need some advice, too (I haven't
done much with GRASS programming before). One tactic I use to speed up
reading is to buffer the results of a single query that encompasses
many features at a time. However, I need to vary the size of the
buffer depending on whether access is largely sequential (such as when
the features are first imported) or random (such as when building
topology). My approach appears to require two, somewhat global
modifications: 1) an abstract buffer size (or "access pattern")
setting function for all db types, and 2) use of this function in
several modules, such as v.in.ogr and v.clean. One possibility is to
set the default buffer size assuming a random access pattern (i.e,
small), and explicitly set the larger buffer only in the v.in.*
modules, but there are others. I don't have a good picture of the
access patterns yet, so it's hard for me to decide. Ideas, clues or
hints, anyone?

It would be nice to compare also PostGIS contra native,
perferably over the network for both. I think that it is better
to do benchmarks without attributes (v.in.ogr -t)

What is your 'buffer'? Is it sequence of queries executed
as one transaction?
I think, that almost everything may be done as one transaction,
e.g. whole v.in.ogr.

PL/pgSQL is something what I want to avoid. PL/pgSQL means that another
step must be done before GRASS can be used with PostGIS.
What I realy worry about is, that once we use PL/pgSQL functions, it must be
compatible for ever.

Speed is not the only problem of PostGIS in GRASS!:
1) Vect_write_line, Vect_rewrite_line, Vect_delete_line are missing
   for level 2.
   BTW!, you can run v.in.ogr (with cleaning) or v.clean on PostGIS vectors?
   It should be impossible!
2) No tests were done for simultaneous editing. There are certainly many
   potential problems in this area.

Are there another reasons to use PostGIS in GRASS except multiuser simultaneous
write access? I don't think so.

Can you put your modifications somewhere on the Web?

As I said, I don't plan to spend any time on this before 5.7.0.

Radim

Martin_Pokorny · November 7, 2003, 5:46pm

On Fri, 7 Nov 2003, Radim Blazek wrote:

I plan on doing more testing, but I need some advice, too (I
haven't done much with GRASS programming before). One tactic I use
to speed up reading is to buffer the results of a single query that
encompasses many features at a time. However, I need to vary the
size of the buffer depending on whether access is largely
sequential (such as when the features are first imported) or random
(such as when building topology). My approach appears to require
two, somewhat global modifications: 1) an abstract buffer size (or
"access pattern") setting function for all db types, and 2) use of
this function in several modules, such as v.in.ogr and v.clean. One
possibility is to set the default buffer size assuming a random
access pattern (i.e, small), and explicitly set the larger buffer
only in the v.in.* modules, but there are others. I don't have a
good picture of the access patterns yet, so it's hard for me to
decide. Ideas, clues or hints, anyone?

It would be nice to compare also PostGIS contra native,
perferably over the network for both. I think that it is better
to do benchmarks without attributes (v.in.ogr -t)

I agree, but my feeling is that native is still significantly faster.
Testing over the network is a problem for me.

What is your 'buffer'? Is it sequence of queries executed
as one transaction?
I think, that almost everything may be done as one transaction,
e.g. whole v.in.ogr.

I figured out yesterday that the buffer issue was a red herring. That
technique really contributes very little to the speedup. Transactions
were already implemented, and I didn't change that.

PL/pgSQL is something what I want to avoid. PL/pgSQL means that
another step must be done before GRASS can be used with PostGIS.
What I realy worry about is, that once we use PL/pgSQL functions, it
must be compatible for ever.

Concerning the additional steps being required, PL/pgSQL is already a
part of PostGIS, and the additional functions could be put into the
pg.postgisdb script. About your second point, I can see your
reservations; I suppose that unless PL/pgSQL proves essential, it
would be best to leave it out of GRASS.

I have not done enough testing yet with all the changes I made to know
for certain which are effective and which aren't. It's possible that
leaving out the PL/pgSQL functions won't have a big effect.

Speed is not the only problem of PostGIS in GRASS!: 1)
Vect_write_line, Vect_rewrite_line, Vect_delete_line are missing for
level 2. BTW!, you can run v.in.ogr (with cleaning) or v.clean on
PostGIS vectors? It should be impossible!

I noticed that. I realize that it would take much more work to make
PostGIS useful in GRASS.

2) No tests were done for simultaneous editing. There are certainly
many potential problems in this area.

I'm well aware of that.

Are there another reasons to use PostGIS in GRASS except multiuser
simultaneous write access? I don't think so.

Not that I have any experience in this area, but I am under the
impression that once GRASS data is in PostGIS, it would become easier
to use from outside of GRASS (say, MapServer).

This exercise has shown me that for GRASS to better exploit PostGIS
(for example, using more efficient queries) would take considerable
effort. I'm not sure whether that would be possible without making
deeper changes in GRASS than just the *_post modules.

Can you put your modifications somewhere on the Web?

I'll let you know when they're ready.

--
Martin

Radim_Blazek · November 10, 2003, 9:38am

On Friday 07 November 2003 18:46, Martin Pokorny wrote:

Transactions were already implemented, and I didn't change that.

And that is the problem I think. Try just to take original code and
remove BEGIN and COMMIT from V1_write_line_post (write_post.c),
insert BEGIN to V1_open_new_post (open_post.c) and
insert COMMIT to V1_close_post (close_post.c).
See also http://article.gmane.org/gmane.comp.gis.grass.devel/2130

Not that I have any experience in this area, but I am under the
impression that once GRASS data is in PostGIS, it would become easier
to use from outside of GRASS (say, MapServer).

Yes, but it works only for points and lines and a view must be created
to get everything (geometry + attributes) in one table.
Any Web GIS application (MapServer) is usually a lot of work and
it is using limited number of layers, to do v.out.ogr for those
few layres is not problem, vector data are not big.
It is bigger problem for rasters, which are huge and cannot be effectively
shared by GRASS and Mapserver. I am not sure if using PROCESSING "BANDS=1,2,3"
and grasslib can solve this partialy.

Radim