[GRASS-user] v.in.ascii with undefined columns?

Dear all

I want to import multiple CSV files with point coordinates that come from Excel/SPSS and have a header in the first row. The column labels are not known in advance, need to be available in GRASS, and might change in future. Running GRASS on the DBF backend, I used to create shapefiles in R and import these with v.in.ogr. However, this does not seem appropriate when running GRASS on SQLite, because the column names get abbreviated. There seem to be different solutions, but I am unsure which is the recommended approach. Hopefully you can give me a hint.

Thanks,
Patrick

##Current ideas:
#Version 0
create shapefile in R (http://gis.stackexchange.com/questions/30785/how-to-stop-writeogr-from-abbreviating-field-names-when-using-esri-shapefile-d)
import shapefile with v.in.ogr

#Version1
query csv and create file “coldef” with Column definition in SQL style (in shell or R?)
import in grass with v.in.ascii format=point skip=1 columns=coldef

#Version2
create sqlite-file in R/OGR with SPATIALITE-extension (see https://stat.ethz.ch/pipermail/r-sig-geo/2014-July/021313.html)
import in grass with v.in.ogr (this post mentions problems http://gis.stackexchange.com/questions/1733/how-to-import-spatialite-data-into-grass)

On Thu, Aug 6, 2015 at 10:04 AM, patrick s. <patrick_gis@gmx.net> wrote:

> Dear all
>
> I want to import multiple csv-files with coordinates (points) that come from
> Excel/SPSS, having first row as header. These have undefined labels of
> columns

Can you please explain a bit more "undefined labels of columns"?
Is it that you don't know the column type? Or the name of them?

If the former, you can use a .csvt file in case you import from CSV.
For an example, see
https://grass.osgeo.org/grass70/manuals/db.in.ogr.html#import-csv-file
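
To make the .csvt idea concrete, here is a minimal Python sketch that guesses a type per column and writes the sidecar file next to the CSV. The function names and the simple int/float/string guessing are assumptions for illustration, not part of GRASS or GDAL; only the .csvt layout (one line of quoted OGR type names such as "Integer", "Real", "String") follows the OGR CSV driver. Guessed types should be checked by hand.

```python
import csv
from pathlib import Path


def guess_csvt_type(values):
    """Return an OGR .csvt type name for one column of raw strings."""
    filled = [v.strip() for v in values if v.strip()]
    if not filled:
        return "String"
    for caster, ogr_type in ((int, "Integer"), (float, "Real")):
        try:
            for v in filled:
                caster(v)
            return ogr_type
        except ValueError:
            continue
    return "String"


def write_csvt(csv_path):
    """Write <name>.csvt beside <name>.csv and return the guessed types."""
    csv_path = Path(csv_path)
    with csv_path.open(newline="") as fh:
        rows = list(csv.reader(fh))
    header, data = rows[0], rows[1:]
    types = [
        guess_csvt_type([row[i] for row in data if i < len(row)])
        for i in range(len(header))
    ]
    # The .csvt file is a single line of quoted, comma-separated type names.
    line = ",".join('"{}"'.format(t) for t in types)
    csv_path.with_suffix(".csvt").write_text(line + "\n")
    return types
```

Calling write_csvt("points.csv") would create points.csvt in the same directory, which the OGR CSV driver then picks up when the CSV is read, e.g. through db.in.ogr.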

> that need to be available in grass and that might change in future.
> Running grass on dbf-backend,

As mentioned earlier: when it comes to real attribute data processing,
better use the SQLite backend:
faster, long column names etc.

> I used to create shapefiles in R and import
> these with v.in.ogr. However, this does not seem to be appropriate, when
> running grass on sqlite (abbreviation of columnnames).

Note that the Shapefile format has already abbreviated them by then!

You may consider the Spatialite format for exchange. I just proposed
it as the best option for the rgrass7 interface (on the grass-stats
mailing list).

> There seem to be
> different solutions, but I am unsure what is the recommended approach. Hopefully
> you can give me a hint.
>
> Thanks,
> Patrick
>
> ##Current ideas:
> #Version 0
> create shapefile in R
> (http://gis.stackexchange.com/questions/30785/how-to-stop-writeogr-from-abbreviating-field-names-when-using-esri-shapefile-d)
> import shapefile with v.in.ogr

--> since this goes "through" DBF, the column names have already been cut by then.

> #Version1
> query csv and create file "coldef" with Column definition in SQL style (in
> shell or R?)

...even in shell, see the db.in.ogr manual page for an example or the
CSV driver page of OGR.

> import in grass with v.in.ascii format=point skip=1 columns=coldef

Yes.
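
A rough Python sketch of how the "coldef" string for this variant could be generated from the CSV header. All function names are made up for illustration, and the type guessing is deliberately simple. The coordinate columns are forced to "double precision" on the assumption that v.in.ascii expects that type for them.

```python
import csv
import re


def sql_type(values):
    """Guess an SQL column type from the raw strings of one column."""
    filled = [v.strip() for v in values if v.strip()]
    if not filled:
        return "varchar(1)"
    try:
        for v in filled:
            int(v)
        return "integer"
    except ValueError:
        pass
    try:
        for v in filled:
            float(v)
        return "double precision"
    except ValueError:
        pass
    return "varchar({})".format(max(len(v) for v in filled))


def sql_name(label):
    """Turn a header label into a plain ASCII column name."""
    name = re.sub(r"[^0-9A-Za-z_]", "_", label.strip())
    return name if name[:1].isalpha() else "c_" + name


def column_definition(csv_path, x_label, y_label, delimiter=","):
    """Return (columns string, x index, y index); indices are 1-based."""
    with open(csv_path, newline="") as fh:
        rows = list(csv.reader(fh, delimiter=delimiter))
    header, data = rows[0], rows[1:]
    parts = []
    for i, label in enumerate(header):
        if label in (x_label, y_label):
            col_type = "double precision"  # assumption: coordinates must be double
        else:
            col_type = sql_type([r[i] for r in data if i < len(r)])
        parts.append("{} {}".format(sql_name(label), col_type))
    x_index = header.index(x_label) + 1
    y_index = header.index(y_label) + 1
    return ", ".join(parts), x_index, y_index
```

The returned values could then be passed on in a GRASS 7 session, along the lines of v.in.ascii input=points.csv output=points format=point separator=comma skip=1 x=<x index> y=<y index> columns="<columns string>".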

> #Version2
> create sqlite-file in R/OGR with SPATIALITE-extension (see
> https://stat.ethz.ch/pipermail/r-sig-geo/2014-July/021313.html)
> import in grass with v.in.ogr

Yes, best solution I think.

> (this post mentions problems
> http://gis.stackexchange.com/questions/1733/how-to-import-spatialite-data-into-grass)

... that posting is from 2010 - a lot has happened since then :-)

Markus

Thanks, Markus. Coming back to your answer(s) after my vacation:

Apparently my question was a bit unclear, so I will clarify the problem even though you have already answered it. Maybe this can help other users:
I often find myself with tables in the form of txt/csv files that have two columns defining coordinates. These files have multiple additional columns whose labels and data types (content) are not known in advance. Such tables can easily be loaded in many programs such as OpenOffice, QGIS (CSV importer) and R, so I was searching for a simple way to import them into GRASS.
However, as I understand now, these other programs include routines to estimate the data type and read the column names, while GRASS explicitly requires the user to define labels and types (BTW, similar to PostgreSQL). Consequently, the fastest (though maybe not the best) approach to load such data is to transform them outside GRASS into a spatial format that GRASS can load directly as points. Shapefile used to be such a format in the past, but it has important drawbacks, e.g. the abbreviation of column labels. The recommended approach is SQLite (Spatialite) instead.

As an alternative, the db.in.ogr command can load CSV files into GRASS, but it lacks an option to create spatial points from the coordinates. Furthermore, it might need guidance on the data types through a .csvt file (see the db.in.ogr manual).

Patrick

On 11.08.2015 23:13, Markus Neteler wrote:
> [...]

Hi Patrick,

another quite nice spatial format that acts as a wrapper around a text/CSV
file is GDAL's virtual vector format (VRT).

There you can easily define column names, column types, etc.

Have a look at the addon v.in.gbif, where I use VRT to import GBIF CSV data.
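
To illustrate the VRT idea, here is a small Python sketch that writes such a wrapper for a CSV with coordinate columns. It is not the code from v.in.gbif. The function name and arguments are hypothetical, and it assumes that the CSV layer name is the file name without its extension and that the coordinates are in EPSG:4326 unless stated otherwise.

```python
import xml.etree.ElementTree as ET


def build_vrt(csv_path, layer, x_col, y_col, fields, srs="EPSG:4326"):
    """Return OGR VRT XML that exposes a CSV file as a point layer.

    fields is a list of (column name, OGR type) pairs, e.g.
    [("name", "String"), ("count", "Integer")].
    """
    root = ET.Element("OGRVRTDataSource")
    lyr = ET.SubElement(root, "OGRVRTLayer", name=layer)
    ET.SubElement(lyr, "SrcDataSource").text = csv_path
    # For CSV sources the layer is assumed to be the file's base name.
    ET.SubElement(lyr, "SrcLayer").text = layer
    ET.SubElement(lyr, "GeometryType").text = "wkbPoint"
    ET.SubElement(lyr, "LayerSRS").text = srs
    ET.SubElement(
        lyr, "GeometryField", encoding="PointFromColumns", x=x_col, y=y_col
    )
    for name, ogr_type in fields:
        # Explicit name and type for every attribute column.
        ET.SubElement(lyr, "Field", name=name, src=name, type=ogr_type)
    return ET.tostring(root, encoding="unicode")
```

The returned string can be saved as e.g. points.vrt and then used as the input of v.in.ogr like any other OGR data source.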

-----
best regards
Helmut

On Thu, 3 Sep 2015 16:33:58 +0200,
"patrick s." <patrick_gis@gmx.net> wrote:

> As Alternative the db.in.ogr-command allows to load csv-files into
> GRASS, but misses the option to create spatial points out of the
> coordinates.

Check out v.in.db for that step.
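
As a sketch of that two-step route, the following Python helper only assembles the two command lines: first the table import, then the conversion to points. The function name and all file, table and column names are placeholders. The option names assume GRASS 7, and the key= option of db.in.ogr is assumed to add a unique integer key column, so please check both manual pages before relying on it.

```python
def csv_table_to_points(csv_path, table, x_col, y_col, key_col, output):
    """Return the argument lists for db.in.ogr followed by v.in.db."""
    return [
        # Step 1: load the CSV as a plain attribute table and ask for an
        # auto-generated key column (assumed behaviour of key=).
        ["db.in.ogr", "input=" + csv_path, "output=" + table,
         "key=" + key_col],
        # Step 2: build a point map from the coordinate columns.
        ["v.in.db", "table=" + table, "x=" + x_col, "y=" + y_col,
         "key=" + key_col, "output=" + output],
    ]
```

Inside a running GRASS session each list could be passed to subprocess.run(cmd, check=True), so that a failing first step stops the second one.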

> Furthermore this might need guidance on the data-type
> through a .csvt-file (see manual db.in.ogr).

If you want to make sure you get the correct datatypes, then this is a
necessary step (or the equivalent columns= parameter of v.in.ascii).
Most tools that try to guess the datatype may do a reasonably good
job, but they are almost never 100% correct.

Once you have a .csvt file.

Actually, very recent GDAL (2.1) allows you to directly ask for
automatic type detection and to specify possible names for the coordinate
columns (see [1]). You can then use an ogr2ogr one-liner to translate
into a format that GRASS can import without losing this info. (Ideally
v.in.ogr/r.in.gdal should allow the specification of GDAL open options,
just as v.out.ogr/r.out.gdal allow the specification of layer creation
options. Probably worth a wish in trac.)
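
A sketch of what such an ogr2ogr call could look like, assembled in Python so that the pieces are easy to adapt. The function name, file names and default name patterns are placeholders. The open options AUTODETECT_TYPE, X_POSSIBLE_NAMES and Y_POSSIBLE_NAMES are those described on the CSV driver page [1], and SPATIALITE=YES assumes a GDAL build with Spatialite support.

```python
def ogr2ogr_csv_to_spatialite(csv_path, sqlite_path,
                              x_names=("x", "lon*"),
                              y_names=("y", "lat*"),
                              srs=None):
    """Return an ogr2ogr argument list that converts a CSV to Spatialite."""
    cmd = [
        "ogr2ogr",
        "-f", "SQLite",
        "-dsco", "SPATIALITE=YES",
        # CSV open options (GDAL >= 2.1): guess column types and build
        # point geometries from the first matching coordinate columns.
        "-oo", "AUTODETECT_TYPE=YES",
        "-oo", "X_POSSIBLE_NAMES=" + ",".join(x_names),
        "-oo", "Y_POSSIBLE_NAMES=" + ",".join(y_names),
    ]
    if srs:
        cmd += ["-a_srs", srs]  # assign a CRS; CSV files carry none
    # ogr2ogr expects the destination first, then the source.
    cmd += [sqlite_path, csv_path]
    return cmd
```

The list can be run with subprocess.run(cmd, check=True); the resulting file should then be importable with v.in.ogr input=points.sqlite output=points.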

Moritz

On Fri, 4 Sep 2015 15:34:01 +0200,
Moritz Lennert <mlennert@club.worldonline.be> wrote:

> [...]

Forgot the link:

[1] http://www.gdal.org/drv_csv.html