[GRASS-user] v.in.ascii - put1 of 4 columns to 3D point file attribute

M_S · April 9, 2008, 12:55pm

Greetings GRASSers:

First I’d like to say what a great book the Open Source GIS: A GRASS GIS Approach (3rd edition) is. It is full of great practical and conceptual examples. Well worth the money to anyone who has not purchased it yet! Also a big thanks to the developers and users of GRASS who make it a powerful and amazing GIS.

I have an ascii table of lidar data. The columns are x,y,z, and a fourth column containing an integer code of some sort. I have looked at the example in the man page about how to select columns for importing via some shell manipulation of the input ascii file, but I do not think it is applicable to what I’m doing (or perhaps the caffeine boost is not in effect yet). I have also tried various combinations of parameters in v.in.ascii.

I would like to import the XYZ to build the geometry for 3D points, and have only “Cat” and “Code” in the attribute table. (“Code” being a column name for the 4th column in the input ASCII). I cant seem to find the combination of column definitions and other parameters to achieve this. One option seems to import them all as “x double, y double, z double, code int”, and then after the import, drop the x,y,z columns. Is it possible to put the first 3 ASCII column values to the geometry, and use only the last column as an attribute?

I’m using GRASS 6.2.3 and sqlite database.

Thanks out to anyone with some feedback.

Mark

Nikos_Alexandris1 · April 9, 2008, 1:01pm

On Wed, 2008-04-09 at 08:55 -0400, M S wrote:

Greetings GRASSers:

Greetings Mark!

I am not proposing any solution - sorry.

First I'd like to say what a great book the Open Source GIS: A GRASS
GIS Approach (3rd edition) is. It is full of great practical and
conceptual examples. Well worth the money to anyone who has not
purchased it yet! Also a big thanks to the developers and users of
GRASS who make it a powerful and amazing GIS.

Just wanted to say that it's nice to know that other people share the
same ideas about GRASS-GIS and GRASS-book.

I have an ascii table of lidar data. The columns are x,y,z, and a
fourth column containing an integer code of some sort. I have looked
at the example in the man page about how to select columns for
importing via some shell manipulation of the input ascii file, but I
do not think it is applicable to what I'm doing (or perhaps the
caffeine boost is not in effect yet). I have also tried various
combinations of parameters in v.in.ascii.

I would like to import the XYZ to build the geometry for 3D points,
and have only "Cat" and "Code" in the attribute table. ("Code" being
a column name for the 4th column in the input ASCII).

Watch out the CAPITAL letters. I think sometimes there are not liked
from SQL(ite).

I cant seem to find the combination of column definitions and other
parameters to achieve this. One option seems to import them all as "x
double, y double, z double, code int", and then after the import, drop
the x,y,z columns. Is it possible to put the first 3 ASCII column
values to the geometry, and use only the last column as an attribute?

I'm using GRASS 6.2.3 and sqlite database.

Thanks out to anyone with some feedback.

Mark

Cheers,

Nikos

_______________________________________________
grass-user mailing list
grass-user@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-user

mlennert · April 9, 2008, 2:21pm

On 09/04/08 14:55, M S wrote:

I would like to import the XYZ to build the geometry for 3D points, and have only "Cat" and "Code" in the attribute table. ("Code" being a column name for the 4th column in the input ASCII). I cant seem to find the combination of column definitions and other parameters to achieve this. One option seems to import them all as "x double, y double, z double, code int", and then after the import, drop the x,y,z columns. Is it possible to put the first 3 ASCII column values to the geometry, and use only the last column as an attribute?

I'm not sure I really understand your problem. Maybe you could send us the layout of your file (head -2) and the v.in.ascii command you tried.

You need to specify the _column number_ of your X,Y and Z columns with the x=, y= and z= parameters, the _column number_ of you cat column with the cat= parameter, and the attribute columns with the columns= parameter.

So if I understand your file correctly it would be something like this:

v.in.ascii -z [...] x=1 y=2 z=3 cat=4 columns="code int"

You might also want to use the -b flag, as topology building can lead to overload if the input data file is too large.

Moritz

M_S · April 9, 2008, 3:05pm

Sorry if the problem was unclear. I would like to:

use the first 3 columns to create the 3D point geometry, but not put XYZ values into the table
use the 4th column to add a “code” attribute to the table, and populate with values from the 4th column of the ascii file
have the “cat” field automagically populated.
the desired end result is a 3D point file, with only “cat” and “code” attributes in the table.

Perhaps what I’m trying to accomplish is not necessary, but I’m trying to keep the DB table small due to the volume of points. For now I’m just working with 10 of the 5 million.

Here’s the file head, there’s no header (although it is X,Y,Z,)
480354.49,1539756.39,16.59,196

the v.in.ascii was:
v.in.ascii -z -n input=57697.txt output=lidar57697 format=point fs=, skip=0 {columns=code int} x=1 y=2 z=3 cat=0

which yeilds:
maximum input row length: 43
maximum number of columns: 4
minimum number of columns: 4
column: 1 type: double
column: 2 type: double
column: 3 type: double
column: 4 type: integer

number of columns defined (1) does not match number of columns (4) in input.

Mark

On Wed, Apr 9, 2008 at 10:21 AM, Moritz Lennert <mlennert@club.worldonline.be> wrote:

On 09/04/08 14:55, M S wrote:

I would like to import the XYZ to build the geometry for 3D points, and have only “Cat” and “Code” in the attribute table. (“Code” being a column name for the 4th column in the input ASCII). I cant seem to find the combination of column definitions and other parameters to achieve this. One option seems to import them all as “x double, y double, z double, code int”, and then after the import, drop the x,y,z columns. Is it possible to put the first 3 ASCII column values to the geometry, and use only the last column as an attribute?

I’m not sure I really understand your problem. Maybe you could send us the layout of your file (head -2) and the v.in.ascii command you tried.

You need to specify the column number of your X,Y and Z columns with the x=, y= and z= parameters, the column number of you cat column with the cat= parameter, and the attribute columns with the columns= parameter.

So if I understand your file correctly it would be something like this:

v.in.ascii -z […] x=1 y=2 z=3 cat=4 columns=“code int”

You might also want to use the -b flag, as topology building can lead to overload if the input data file is too large.

Moritz

Hamish · April 10, 2008, 6:39am

Mark wrote:

I have an ascii table of lidar data. The columns are x,y,z, and a
fourth column containing an integer code of some sort.

how many points are we talking about? millions? it the table becoming too
huge?

wc -l inputfile.txt

I have looked at the example in the man page about how to select
columns for importing via some shell manipulation of the input ascii
file, but I do not think it is applicable to what I'm doing (or perhaps
the caffeine boost is not in effect yet). I have also tried various
combinations of parameters in v.in.ascii.

I would like to import the XYZ to build the geometry for 3D points, and
have only "Cat" and "Code" in the attribute table. ("Code" being a
column name for the 4th column in the input ASCII). I cant seem to
find the combination of column definitions and other parameters to
achieve this. One option seems to import them all as "x double, y
double, z double, code int", and then after the import, drop the x,y,z
columns.

In GRASS 6.3: v.db.dropcol

For GRASS 6.2.3 you can examine that script and do the SQLite trick by
hand:
http://trac.osgeo.org/grass/browser/grass/trunk/scripts/v.db.dropcol

Is it possible to put the first 3 ASCII column values to the geometry,
and use only the last column as an attribute?

another idea is to crop the file to only include the cat and code
columns,
cut -f1,4 -d',' inputfile.txt > outputfile.txt
then use db.in.ogr (GRASS 6.3 only) to import the .csv file into SQLite,
and then use v.db.connect to connect that database to your vector.
(triple-check the cat+codes line up; is there a cat column or is that
just sequential order? if so, use "seq" + "paste" to add that to the
cut'ed code-only file)

yet another idea is to do it like v.in.garmin/v.in.gpsbabel scripts, but
that may be slow for many many points.

I'm using GRASS 6.2.3 and sqlite database.

Hamish

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Hamish · April 10, 2008, 6:50am

Mark:

the v.in.ascii was:
v.in.ascii -z -n input=57697.txt output=lidar57697 format=point fs=,
skip=0 {columns=code int} x=1 y=2 z=3 cat=0

which yeilds:
maximum input row length: 43
maximum number of columns: 4
minimum number of columns: 4
column: 1 type: double
column: 2 type: double
column: 3 type: double
column: 4 type: integer

<ERROR> number of columns defined (1) does not match number of columns
(4) in input.

The columns= option is to rename columns, not to say to include them or
not. So for v.in.ascii you need to name all 4, in order:
columns='x double, y double, z double, code int'

Hamish

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

mlennert · April 10, 2008, 8:56am

On Thu, April 10, 2008 08:50, Hamish wrote:

Mark:

the v.in.ascii was:
v.in.ascii -z -n input=57697.txt output=lidar57697 format=point fs=,
skip=0 {columns=code int} x=1 y=2 z=3 cat=0

which yeilds:
maximum input row length: 43
maximum number of columns: 4
minimum number of columns: 4
column: 1 type: double
column: 2 type: double
column: 3 type: double
column: 4 type: integer

<ERROR> number of columns defined (1) does not match number of columns
(4) in input.

The columns= option is to rename columns, not to say to include them or
not. So for v.in.ascii you need to name all 4, in order:
columns='x double, y double, z double, code int'

Yes, sorry, the info I gave was wrong. But what happens if you use no
column= option at all ?

Moritz

M_S · April 10, 2008, 11:00am

Thanks for the various solutions.

The points are about 5 million per tile, and there are 23 tiles (about 115 million total). I’m not sure if this is considered a large amount of mass input points.

Perhaps the table size is OK. I havent reached a size problem yet, but I was trying to avoided it from the start. Only using the xyz fields seems like a good solution, and bringing in the 4th column attribute after the fact, if it is even necessary.

Thanks much,
Mark

On Thu, Apr 10, 2008 at 2:39 AM, Hamish <hamish_b@yahoo.com> wrote:

Mark wrote:

I have an ascii table of lidar data. The columns are x,y,z, and a
fourth column containing an integer code of some sort.

how many points are we talking about? millions? it the table becoming too
huge?

wc -l inputfile.txt

I have looked at the example in the man page about how to select
columns for importing via some shell manipulation of the input ascii
file, but I do not think it is applicable to what I’m doing (or perhaps
the caffeine boost is not in effect yet). I have also tried various
combinations of parameters in v.in.ascii.

I would like to import the XYZ to build the geometry for 3D points, and
have only “Cat” and “Code” in the attribute table. (“Code” being a
column name for the 4th column in the input ASCII). I cant seem to
find the combination of column definitions and other parameters to
achieve this. One option seems to import them all as “x double, y
double, z double, code int”, and then after the import, drop the x,y,z
columns.

In GRASS 6.3: v.db.dropcol

For GRASS 6.2.3 you can examine that script and do the SQLite trick by
hand:
http://trac.osgeo.org/grass/browser/grass/trunk/scripts/v.db.dropcol

Is it possible to put the first 3 ASCII column values to the geometry,
and use only the last column as an attribute?

another idea is to crop the file to only include the cat and code
columns,
cut -f1,4 -d’,’ inputfile.txt > outputfile.txt
then use db.in.ogr (GRASS 6.3 only) to import the .csv file into SQLite,
and then use v.db.connect to connect that database to your vector.
(triple-check the cat+codes line up; is there a cat column or is that
just sequential order? if so, use “seq” + “paste” to add that to the
cut’ed code-only file)

yet another idea is to do it like v.in.garmin/v.in.gpsbabel scripts, but
that may be slow for many many points.

I’m using GRASS 6.2.3 and sqlite database.

Hamish

Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Hamish · April 14, 2008, 5:33am

Mark wrote:

The points are about 5 million per tile, and there are 23 tiles (about
115 million total). I'm not sure if this is considered a large amount
of mass input points.

Yes, it's a lot. The current vector engine allocates a small amount of
memory per feature for topology. With the DBF as database and default
v.in.ascii settings I've only ever managed to import about 3 million
points before running out of memory. I am interested to hear that you
could load 5 million into a SQLite DB, maybe that is more efficient.

With the v.in.ascii -t and -b flags you should be able to load 25million+
points into a single vector map, but there is a limited number of modules
that will be able to use a vector map without topology. (importantly
v.surf.rst still works)

If you are willing to abandon your "code" data column, you can strip off
that column and import the rest as a 3D vector (-z z=), in which case no
table is created (or just ignore it with -t -z). If you are just
interested in the 3D coordinate, then a DB table is unneeded overhead and
is best skipped.

Question: what do you want to do with the data? Simply create a raster
DEM or do more fancy cleaning with v.lidar.*? If just creating a raster
DEM you might skip v.in.ascii altogether and use the r.in.xyz module.

Hamish

____________________________________________________________________________________
Be a better friend, newshound, and
know-it-all with Yahoo! Mobile. Try it now. http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ

neteler · April 14, 2008, 11:42am

On Mon, Apr 14, 2008 at 7:33 AM, Hamish <hamish_b@yahoo.com> wrote:

Yes, it’s a lot. The current vector engine allocates a small amount of
memory per feature for topology. With the DBF as database and default
v.in.ascii settings I’ve only ever managed to import about 3 million
points before running out of memory. I am interested to hear that you
could load 5 million into a SQLite DB, maybe that is more efficient.

GRASS 6.3.0svn (patUTM):~/data/lidar_PAT_raw/raw > wc -l big.txt
13072022 big.txt

GRASS 6.3.0svn (patUTM):~/data/lidar_PAT_raw/raw > head big.txt
659936.35 5096943.37 368.58
659900.34 5096968.01 1082.09
659900.23 5096966.73 1082.84
664549.79 5097099.35 234.83
664553.50 5097099.56 234.70
664551.90 5097098.79 234.70
664550.46 5097098.11 234.98
664557.22 5097099.75 234.71
664555.68 5097099.02 235.02
664554.24 5097098.33 235.02

GRASS 6.3.0svn (patUTM):~/data/lidar_PAT_raw/raw > date ; v.in.ascii big.txt out=big -z z=3 fs=space ; date
Mon Apr 14 10:17:46 CEST 2008
Scanning input for column types…
Maximum input row length: 29
Maximum number of columns: 3
Minimum number of columns: 3
Importing points…
Building topology …
Registering lines: 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111101160011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110252001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111038800111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111105240011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110660001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111079600111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111109320011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111068001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111120400111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111113400011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111476001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111161200111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111117480011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111884001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111202000111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111121560011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111112292001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111242800111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111125640011111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111112700001111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111283600111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111129720013072022 primitives registered 11111111111111111111111111111111111111111111111111111111111111113072000
Building areas: 100%
0 areas built
0 isles built
Attaching islands:
Attaching centroids: 100%
Topology was built.
Number of nodes : 13070034
Number of primitives: 13072022
Number of points : 13072022
Number of lines : 0
Number of boundaries: 0
Number of centroids : 0
Number of areas : 0
Number of isles : 0
v.in.ascii complete.
Mon Apr 14 11:25:21 CEST 2008

It took (despite the lines print overflow above) a bit more than 1h to import
13 million points into SQLite.

Unfortunately v.univar cannot yet calculate on the x, y, or z geometry
to perform further performance tests.

Markus

PS: please use the new list email address (but redirection still works…)

M_S · April 14, 2008, 12:21pm

On Mon, Apr 14, 2008 at 1:33 AM, Hamish <hamish_b@yahoo.com> wrote:

Mark wrote:

The points are about 5 million per tile, and there are 23 tiles (about
115 million total). I’m not sure if this is considered a large amount
of mass input points.

Yes, it’s a lot. The current vector engine allocates a small amount of
memory per feature for topology. With the DBF as database and default
v.in.ascii settings I’ve only ever managed to import about 3 million
points before running out of memory. I am interested to hear that you
could load 5 million into a SQLite DB, maybe that is more efficient.

The SQlite setup was so easy (thanks to the book), and then able to start working with the tables was a breeze (literally as easy as db.connect driver=sqlite database="$GISDBASE/$LOCATION_NAME/$MAPSET/sqlite.db.) I’m not a SQL expert, but learned enough to do some sweet tasks. SQlite was rather impressive in working with 5.2 million records. Through v.in.ascii, it did load all the data (4 columns) into the sqlite database without problem (about 7 minutes on a 64bit 2.0GHz 2GB ram system - no table and no topology)

With the v.in.ascii -t and -b flags you should be able to load 25million+
points into a single vector map, but there is a limited number of modules
that will be able to use a vector map without topology. (importantly
v.surf.rst still works)

Ayone know if the v.outlier, v.lidar.edgedetect, v. lidar.grow and v.lidar.correction need topology?

If you are willing to abandon your “code” data column, you can strip off
that column and import the rest as a 3D vector (-z z=), in which case no
table is created (or just ignore it with -t -z). If you are just
interested in the 3D coordinate, then a DB table is unneeded overhead and
is best skipped.

This approach turned out to be the best. You (Hamish) were right that the code was an intensity or strength of return signal. I made a nice “intensity” grayscale map, that was great. Since I dont need the intensity, or know how to use it to help separate the DEM from DSM features, the 3D point worked perfect. Thanks!

Question: what do you want to do with the data? Simply create a raster
DEM or do more fancy cleaning with v.lidar.*? If just creating a raster
DEM you might skip v.in.ascii altogether and use the r.in.xyz module.

I will definitely look more closely at the r.in.xyz module. I had used it for some binning and statistical analysis of the point distribution of the lidar. it is pretty tight data.

I want/need to do fancy cleaning with the v.lidar.* tools. However, not having pulse return counts to work with like the example in the book, I am not sure how well it will work, because only having an intensity value, I’m not sure how that relates to first and last returns. What I would like to do with the data is create a bare earth DEM, and a DSM of the elevated surface features. Then perhaps add elevated surface features like solid objects (e.g. buildings, etc) that would block terrain for shallow overland flow or r.lake simulations for flooding.

Since the application is for watershed analysis (feature building, and ultimately running hydrology models in GRASS), so my goal is to create the most realistic hydrologic surface model to properly route the water. In theory, it seems to me that water should not route through solid features like buildings (extrude as walls?), but should be able to (and often does) route through shrubby or forested areas. In these areas, where there is canopy, it seems that those canopy elevations should be replaced with bare earth elevations, but allow water to flow through these areas. This is the first time I have worked with raw LiDAR data to create features or do watershed analysis, so this is a fun learning experience. Typically, I had worked with contour data in the past.

Mark