[GRASS-dev] Re: [GRASS-user] v.db.join script

On 10/12/2010 12:01 PM, Jon Eiriksson wrote:

Hi,

I have a truncation problem with v.db.join. This has been raised before -

([GRASS-user] Re: grass v.db.join

Gary Nobles
Fri, 12 Mar 2010 11:11:26 -0800)

- but I have not seen a solution. I have tried my own data, the
spearfish60 example data, and the example in Neteler and Mitasova's book.
The new data columns are apparently defined as 1 character long, and the
data become truncated accordingly, much against my intention. I use mysql.

I can see why this is happening. But I'm not sure what the correct solution might be. v.db.join is a wrapper around db.describe and v.db.addcol. When I query an sqlite database connection here's what the output looks like:

db.describe -c bike_rides2
ncols: 8
nrows: 17
Column 1: cat:INTEGER:20
Column 2: name:CHARACTER:80
Column 3: number:INTEGER:20
Column 4: comment:CHARACTER:80
Column 5: descriptio:CHARACTER:80
Column 6: source:CHARACTER:80
Column 7: url:CHARACTER:80

The output format is obviously:
"Column num:column name:column type:column size".

One line from the v.db.join script uses grep and cut to get the column names and the next line gets the column types like so:

db.describe -c bike_rides2 | grep '^Column' | cut -d ':' -f 3
INTEGER
CHARACTER
INTEGER
CHARACTER

So the column size is actually ignored.

Next, in the script the above output is used by v.db.addcol to create the new columns in the joined vector. So all new character columns are created as a single char and the actual length is never used.

Questions:
Is the db.describe output the same for all db drivers?
Any suggestions how to fix this as a script?
Or better, just convert to python?

Thanks,
Micha

--
Micha Silver
Arava Development Co. +972-52-3665918
http://www.surfaces.co.il

On 13/10/10 22:23, Micha Silver wrote:

One line from the v.db.join script uses grep and cut to get the column
names and the next line gets the column types like so:

db.describe -c bike_rides2 | grep '^Column' | cut -d ':' -f 3
INTEGER
CHARACTER
INTEGER
CHARACTER

So the column size is actually ignored.

Next, in the script the above output is used by v.db.addcol to create
the new columns in the joined vector. So all new character columns are
created as a single char and the actual length is never used.

Questions:
Is the db.describe output the same for all db drivers?
Any suggestions how to fix this as a script?

Why not use an SQL join, i.e. something like the following ?

1) CREATE TABLE temp AS (SELECT * FROM $maptable JOIN $otable ON $column=$ocolumn)

2) rename table $maptable to something else

3) rename table temp to $maptable

4) if this works, remove the original $maptable

Not tested, but might be a more elegant solution ?

Moritz

Moritz Lennert wrote:

On 13/10/10 22:23, Micha Silver wrote:

One line from the v.db.join script uses grep and cut to get the column
names and the next line gets the column types like so:

db.describe -c bike_rides2 | grep '^Column' | cut -d ':' -f 3
INTEGER
CHARACTER
INTEGER
CHARACTER

So the column size is actually ignored.

Next, in the script the above output is used by v.db.addcol to create
the new columns in the joined vector. So all new character columns are
created as a single char and the actual length is never used.

Questions:
Is the db.describe output the same for all db drivers?
Any suggestions how to fix this as a script?

Why not use an SQL join, i.e. something like the following ?

1) CREATE TABLE temp AS (SELECT * FROM $maptable JOIN $otable ON $column=$ocolumn)

2) rename table $maptable to something else

3) rename table temp to $maptable

4) if this works, remove the original $maptable

Interesting. So your suggestion is to run the above sql commands thru db.execute to create a new attribute table for an existing vector?

Not tested, but might be a more elegant solution ?

Moritz

This mail was received via Mail-SeCure System.

On 19/10/10 15:23, Micha Silver wrote:

Moritz Lennert wrote:

On 13/10/10 22:23, Micha Silver wrote:

One line from the v.db.join script uses grep and cut to get the column
names and the next line gets the column types like so:

db.describe -c bike_rides2 | grep '^Column' | cut -d ':' -f 3
INTEGER
CHARACTER
INTEGER
CHARACTER

So the column size is actually ignored.

Next, in the script the above output is used by v.db.addcol to create
the new columns in the joined vector. So all new character columns are
created as a single char and the actual length is never used.

Questions:
Is the db.describe output the same for all db drivers?
Any suggestions how to fix this as a script?

Why not use an SQL join, i.e. something like the following ?

1) CREATE TABLE temp AS (SELECT * FROM $maptable JOIN $otable ON
$column=$ocolumn)

2) rename table $maptable to something else

3) rename table temp to $maptable

4) if this works, remove the original $maptable

Interesting. So your suggestion is to run the above sql commands thru
db.execute to create a new attribute table for an existing vector?

Yes. You could put them all into a temporary text file then send this file as one transaction to db.execute.

Moritz

Thank you for the response,

I ran the same data with sqlite, and this is what I get with db.describe -c:

[sqlite db]
GRASS 6.4.0 (spearfish60):~ > db.describe -c mysoils
ncols: 5
nrows: 54
Column 1: cat:INTEGER:20
Column 2: label:CHARACTER:13
Column 3: id:CHARACTER:1
Column 4: shortname:CHARACTER:1
Column 5: longname:CHARACTER:1
GRASS 6.4.0 (spearfish60):~ > db.describe -c soils_legend
ncols: 3
nrows: 55
Column 1: id:CHARACTER:1
Column 2: shortname:CHARACTER:1
Column 3: longname:CHARACTER:1
GRASS 6.4.0 (spearfish60):~ >

- and there is no truncation of data after v.db.join,

while

[mysql db]

GRASS 6.4.0 (spearfish60):~ > db.describe -c mysoils
ncols: 5
nrows: 54
Column 1: cat:INTEGER:11
Column 2: label:CHARACTER:13
Column 3: id:CHARACTER:1
Column 4: shortname:CHARACTER:1
Column 5: longname:CHARACTER:1
GRASS 6.4.0 (spearfish60):~ > db.describe -c soils_legend
ncols: 3
nrows: 55
Column 1: id:CHARACTER:255
Column 2: shortname:CHARACTER:255
Column 3: longname:CHARACTER:255
GRASS 6.4.0 (spearfish60):~ >

- truncates the data in joined columns

I do not have a suggestion for an improved script - not up to that yet. The mysql db seems to be more strict as to the column variable length definition. As you point out, the definition is ignored and data carried over differently in sqlite and mysql. Hope that someone can take this up.

Thanks, Jon

On 13.10.2010, at 20:23, Micha Silver wrote:

On 10/12/2010 12:01 PM, Jon Eiriksson wrote:

Hi,

I have a truncation problem with v.db.join. This has been raised before -

([GRASS-user] Re: grass v.db.join

Gary Nobles
Fri, 12 Mar 2010 11:11:26 -0800)

- but I have not seen a solution. I have tried my own data, the
spearfish60 example data, and the example in Neteler and Mitasova's book.
The new data columns are apparently defined as 1 character long, and the
data become truncated accordingly, much against my intention. I use mysql.

I can see why this is happening. But I'm not sure what the correct solution might be. v.db.join is a wrapper around db.describe and v.db.addcol. When I query an sqlite database connection here's what the output looks like:

db.describe -c bike_rides2
ncols: 8
nrows: 17
Column 1: cat:INTEGER:20
Column 2: name:CHARACTER:80
Column 3: number:INTEGER:20
Column 4: comment:CHARACTER:80
Column 5: descriptio:CHARACTER:80
Column 6: source:CHARACTER:80
Column 7: url:CHARACTER:80

The output format is obviously:
"Column num:column name:column type:column size".

One line from the v.db.join script uses grep and cut to get the column names and the next line gets the column types like so:

db.describe -c bike_rides2 | grep '^Column' | cut -d ':' -f 3
INTEGER
CHARACTER
INTEGER
CHARACTER

So the column size is actually ignored.

Next, in the script the above output is used by v.db.addcol to create the new columns in the joined vector. So all new character columns are created as a single char and the actual length is never used.

Questions:
Is the db.describe output the same for all db drivers?
Any suggestions how to fix this as a script?
Or better, just convert to python?

Thanks,
Micha

--
Micha Silver
Arava Development Co. +972-52-3665918
http://www.surfaces.co.il

-------------------------------------------------------------------
Jon Eiriksson, Research Professor,
Earth Science Institute, University of Iceland,
Askja, Sturlugata 7,
IS-101 Reykjavik, Iceland.

Tel +354 525 44 75
Fax +354 525 44 99
Email jeir@hi.is
Web http://www.raunvis.hi.is/~jeir/
-------------------------------------------------------------------