[GRASSLIST:914] Handling millions of points in GRASS...

I have a bathymetry project that is attempting to collect and map every sounding ever taken in Puget Sound, Washington. So far, I have several million soundings stored in a MySQL database, but with the new sonar aboard our vessel, the new interest in bathymetric LIDAR, and cooperation with NOAA, that number is expected to grow exponentially in the next few years.

What is the best way to get this data into GRASS? What's the best way to manage it?

I know that RDBMS support is coming soon and that (apparently) PostGIS is already functional on some platforms (Linux, Cygwin?). But what is the word from the trenches? I have been very happy with MySQL and would rather stick with it if its functionality is comparable to PostgreSQL's, but it would be better to move now rather than later if a switch is going to be necessary...

A major task that needs to be handled is selecting points both by spatial location (polygons, not just boxes) and by attributes such as age, survey, and platform. Currently, I have a table of soundings (x, y, z, ID) and a table of headers (ID, date, platform, etc.), and I load them into ArcGIS (ESRI, Inc.) through ODBC, where I can grid the data into rasters for further analysis. Hence, I am currently dependent on Windows.
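Roughly, the kind of selection I need looks like this (a hypothetical sketch only: the tables mirror my schema above, but the polygon, sample points, and platform names are invented):

```python
# Hypothetical sketch: select soundings inside an arbitrary polygon
# (not just a bounding box) AND matching a header attribute.
# Toy data stands in for the MySQL tables (soundings: x, y, z, ID;
# headers: ID, platform, ...).

def point_in_polygon(x, y, poly):
    """Ray-casting test: True if (x, y) is inside the polygon,
    given as a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x where this edge crosses the horizontal ray through (x, y)
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Toy stand-ins for the two tables
soundings = [
    (0.4, 0.4, -12.0, 1),   # x, y, z, header ID
    (2.5, 0.5, -30.0, 2),   # outside the polygon
    (0.2, 0.5, -8.0, 2),    # inside, but wrong platform
]
headers = {1: {"platform": "ship"}, 2: {"platform": "lidar"}}

triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
selected = [
    (x, y, z)
    for (x, y, z, hid) in soundings
    if point_in_polygon(x, y, triangle) and headers[hid]["platform"] == "ship"
]
print(selected)
```

In practice this filtering would of course happen in the database or the GIS, not in a script, but that is the shape of the query.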

Thanks for your thoughts.

--
David Finlayson
School of Oceanography
Box 357940
University of Washington
Seattle, WA 98195-7940
USA

Office: Marine Sciences Building, Room 112
Phone: (206) 616-9407
Web: http://students.washington.edu/dfinlays

David,

In GRASS 5.0 you do not need PostGIS; plain PostgreSQL is enough.
Standard PostgreSQL already has geometry types (point, polygon,
line, ...) and operators for them (inside, overlap, ...). Conversion
between a pair of floats (x, y) and the point type is easy. With the
GRASS module 'd.site.pg' you can load the points selected by an SQL
query and then interpolate them. Areas and lines can be exported from
GRASS to PostgreSQL with the module 'v.to.pg' and then used in SQL queries.

I do not think the GRASS/ODBC/MySQL combination offers those features.

Sandro.

On Wed, 2003-08-06 at 13:23, David Finlayson wrote:

[...]

--
Sandro Klippel <sandro.klippel@talha-mar.com.br>

talha-mar projetos ambientais
Conselheiro d'Avila 190
Porto Alegre,RS 91040-450
+55 (51) 3344-7451

w w w . t a l h a - m a r . c o m . b r

On Wednesday 06 August 2003 18:23, David Finlayson wrote:

[...]

MySQL is not currently supported by many GIS applications. PostGIS/PostgreSQL
is supported by, for example, MapServer, OGR, QGIS, and GRASS.

In GRASS 5.1 it is possible to use data directly from PostGIS, i.e. without conversion.
BUT there are some limitations, and at present it is not very efficient:
it works, but it is slow. More work and testing has to be done on that part of the library.

It is possible to create a new vector as a definition of the PostGIS table(s) where
the geometry (points) and attributes are stored. Such a vector may be used like any
other vector in GRASS, except for editing. So you can, for example, display, query,
extract, or run v.surf.rst on it.

Problems:
- The optional SQL condition is missing from the definition of a PostGIS vector in GRASS
  (of course, it is possible to create a view in Postgres and use it in GRASS,
   but such a view is created for each query, and that is very slow).
- A vector contains the data specified when it was created, and most other modules
  do not support SQL queries; that means that to run v.surf.rst on a different set
  of points you have to create a new vector or use v.extract first.
- Performance.

So, we are waiting for somebody like you, who needs to use it and who will improve
the functionality :-)

To be frank, if you need to work with large data sets, forget about the RDBMS and use
text files :-). Last month I tried to run a query on 30,000 rows in Postgres,
and I had to kill it after 3 hours. OK, maybe my query was not optimized,
but a Perl script reading from a text file did the job in 6 minutes!
If you want RDBMS + GRASS, then
psql | s.in.ascii + s.*
is probably the best you can do.
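A rough sketch of that text-file approach in Python instead of Perl (assuming one "x y z" triple per line, whitespace-separated, which s.in.ascii can read with a space field separator; the sample rows and depth window are invented):

```python
# Stream-filter an ASCII dump of soundings before feeding it to GRASS.
# Assumed line format: "x y z", whitespace-separated.

def filter_soundings(lines, zmin, zmax):
    """Yield 'x y z' lines whose depth z falls in [zmin, zmax],
    skipping blank or malformed rows."""
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        x, y, z = map(float, parts[:3])
        if zmin <= z <= zmax:
            yield f"{x} {y} {z}"

# In a real pipeline this would read sys.stdin, e.g. (hypothetical command):
#   psql -t -A -F ' ' -c "SELECT x, y, z FROM soundings" db | python filter.py
sample = [
    "538200.0 5275100.0 -12.5",
    "538250.0 5275150.0 -161.0",   # outside the depth window below
    "",                            # blank line, skipped
]
for row in filter_soundings(sample, -50.0, 0.0):
    print(row)
```

Because it never holds more than one line in memory, this scales to millions of points.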

Radim

On Wed, Aug 06, 2003 at 09:23:56AM -0700, David Finlayson wrote:

[...]

How to hold the data depends on your usage requirements.
If you run many queries that could benefit from sophisticated indexes,
you should think about using PostGIS; likewise if you need data security.

Also check out this post comparing the upcoming geographical features
of MySQL 4.1 and PostGIS:
http://intevation.de/pipermail/freegis-list/2003-July/001449.html
Basically, MySQL does not offer transactional safety or comparably
good index technology.

Radim and Sandro already pointed out that there are several
ways to get points out of the database into GRASS.

The more complicated approaches (like moving to PostGIS
and optimising the indexing) only make sense if you really
have the usage pattern for it. I agree with Radim that cleverly used
text files might give good results.

I worked with approximately 4 million points of bathymetric data from the
southern Brazilian continental margin. For performance reasons I preferred
working with GMT tools and awk scripts, though the awk scripts were used
only to filter out bad data. 'blockmean' and 'surface' grid the data in a few
seconds. Then I imported the result into GRASS/NVIZ with 'r.in.bin' for 3D
visualization.
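The core idea behind 'blockmean', averaging all z values that fall in each grid cell, can be sketched like this (the cell size and toy points are invented; the real tool additionally handles regions, weights, and much more):

```python
# Minimal sketch of block averaging: bin each (x, y, z) point into a
# square cell and report the mean z per cell.
from collections import defaultdict

def block_mean(points, cell):
    """Return {(col, row): mean z} for points binned on a square grid
    with the given cell size."""
    sums = defaultdict(lambda: [0.0, 0])   # cell -> [sum of z, count]
    for x, y, z in points:
        key = (int(x // cell), int(y // cell))
        sums[key][0] += z
        sums[key][1] += 1
    return {key: s / n for key, (s, n) in sums.items()}

# Two points share cell (0, 0); one falls in cell (1, 0)
points = [(0.2, 0.3, -10.0), (0.8, 0.9, -14.0), (1.5, 0.5, -30.0)]
print(block_mean(points, 1.0))
```

This pre-averaging step is what makes the subsequent 'surface' interpolation fast even with millions of raw soundings.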

You can use 'gmtselect' for spatial selections.

Sandro.

On Thu, 2003-08-07 at 06:00, Bernhard Reiter wrote:

[...]
