[GRASS-user] Connect additional dbf file based on layer 1 attribute column

Dear list,

I am trying to import the European Vector Database (ESDB) version 2 to GRASS -- original data is ArcInfo shapefile. Maybe most of you already know about this freely downloadable database, but for those who don't: http://eusoils.jrc.ec.europa.eu/Library/ESDAC/Online_Data.cfm

Well, the main ESDB shapefile layer name is called sgdbe4_0, which comes in the usual .shp, .prj, .shx, and .dbf extensions. The sgdbe4_0 layer gets imported OK via v.in.ogr, but most of the soil database is left out of the new GRASS vector, because it is included in additional .dbf files with different names, which are therefore not imported into the new GRASS vector.

The sgdbe4_0.dbf table in the GRASS vector has at least one attribute column in common with the other additional .dbf files that don't get imported. Is it possible to connect those .dbf files to the imported GRASS vector based on an attribute column (not the 'cat' column) that is present both in layer 1 of the vector and in those additional (not connected) .dbf files?

Thanks and regards,

Luigi

[GRASS 6.2 - Cygwin]

On 31/03/08 16:04, Luigi Ponti wrote:

Dear list,

I am trying to import the European Vector Database (ESDB) version 2 to GRASS -- original data is ArcInfo shapefile. Maybe most of you already know about this freely downloadable database, but for those who don't: http://eusoils.jrc.ec.europa.eu/Library/ESDAC/Online_Data.cfm

Well, the main ESDB shapefile layer name is called sgdbe4_0, which comes in the usual .shp, .prj, .shx, and .dbf extensions. The sgdbe4_0 layer gets imported OK via v.in.ogr, but most of the soil database is left out of the new GRASS vector, because it is included in additional .dbf files with different names, which are therefore not imported into the new GRASS vector.

The sgdbe4_0.dbf table in the GRASS vector has at least one attribute column in common with the other additional .dbf files that don't get imported. Is it possible to connect those .dbf files to the imported GRASS vector based on an attribute column (not the 'cat' column) that is present both in layer 1 of the vector and in those additional (not connected) .dbf files?

Several options:

1) Use a real database backend (sqlite, postgresql), import the other dbf files with db.in.ogr, then either
  a) link to the map attribute table with v.db.join
  b) create a view combining the cat column from the map attribute with the other table and use v.db.connect to link to that table

2) If the columns linking your map to the other tables are integer, you can also do
  - v.reclasse with 'column' option
  - v.db.connect the result of the reclass to the relevant dbf table

3 More complicated:
  - in your existing map create a new layer of cats with e.g. v.category map=sgdbe4_0 option=add layer=2
  - find the correspondance of the newly created cats with the cats in layer 1 with v.build sgbde4_0 option=cdump
  - use that information to create a new table which contains the layer 2 cat and the key column needed to link to the other dbf tables
  - then proceed as in 1)

We really could do with an enhancement to v.category to allow the creation of a second layer of categories based on a column value of the attributes linked to the first...

Moritz

Dear Moritz: that’s a lot of options – thanks!!!

Moritz Lennert wrote:

On 31/03/08 16:04, Luigi Ponti wrote:

Dear list,

I am trying to import the European Vector Database (ESDB) version 2 to GRASS
[…]
Is it possible to connect those .dbf files to the imported GRASS vector based on an attribute column (not the ‘cat’ column) that is present both in layer 1 of the vector and in those additional (not connected) .dbf files?

Several options:

  1. Use a real database backend (sqlite, postgresql), import the other dbf files with db.in.ogr, then either
    a) link to the map attribute table with v.db.join
    b) create a view combining the cat column from the map attribute with the other table and use v.db.connect to link to that table

I suspected that joins and views were the way to go, because I had been reading philosophical threads some time ago about whether or not grass was supposed to support those. However, I am not a database expert, so thanks for indicating this path that may be a good primer to get more into real database backends for GRASS.

  1. If the columns linking your map to the other tables are integer, you can also do
  • v.reclasse with ‘column’ option
  • v.db.connect the result of the reclass to the relevant dbf table

Yes, the link columns are integers, luckily. I tried this – easy and elegant. It works fine with DBF. Does the GRASS DBF engine has a size limitation for some operations? I got an error while trying the following:

GRASS 6.2.2 (SoilEurope2):~ >d.vect -c sgdbe4_0_reclass display=attr attrcol=EAWC_TOP
DBMI-DBF driver error:
SQL parser error in statement:
select EAWC_TOP from smu_ptrdb.dbf where cat = 330677
Error in db_open_select_cursor()

ERROR: Cannot select attributes:
select EAWC_TOP from smu_ptrdb.dbf where cat = 330677

Note that cat=330677 is 2572th row of the .dbf table (if I change attribute column, I always get an error on the same cat).

3 More complicated:

  • in your existing map create a new layer of cats with e.g. v.category map=sgdbe4_0 option=add layer=2
  • find the correspondance of the newly created cats with the cats in layer 1 with v.build sgbde4_0 option=cdump
  • use that information to create a new table which contains the layer 2 cat and the key column needed to link to the other dbf tables
  • then proceed as in 1)

This one is a bit out of my range: I need to do some homework. Of the three options, this was the hardest for me to understand.

We really could do with an enhancement to v.category to allow the creation of a second layer of categories based on a column value of the attributes linked to the first…

Moritz

Thanks again for a great GRASS-database seminar.

Kind regards,

Luigi

On 01/04/08 00:06, Luigi Ponti wrote:

Dear Moritz: that's a lot of options -- thanks!!!

Moritz Lennert wrote:

On 31/03/08 16:04, Luigi Ponti wrote:

Dear list,

I am trying to import the European Vector Database (ESDB) version 2 to GRASS
[...]
Is it possible to connect those .dbf files to the imported GRASS vector based on an attribute column (not the 'cat' column) that is present both in layer 1 of the vector and in those additional (not connected) .dbf files?

Several options:

1) Use a real database backend (sqlite, postgresql), import the other dbf files with db.in.ogr, then either
    a) link to the map attribute table with v.db.join
    b) create a view combining the cat column from the map attribute with the other table and use v.db.connect to link to that table

I suspected that joins and views were the way to go, because I had been reading philosophical threads some time ago about whether or not grass was supposed to support those. However, I am not a database expert, so thanks for indicating this path that may be a good primer to get more into real database backends for GRASS.

In the long run, this is probably your best bet since it gives you a lot more flexibility. Sqlite is very lightweight and easy to install, so this might be the best beginner's entry...

2) If the columns linking your map to the other tables are integer, you can also do
    - v.reclasse with 'column' option
    - v.db.connect the result of the reclass to the relevant dbf table

Yes, the link columns are integers, luckily. I tried this -- easy and elegant. It works fine with DBF. Does the GRASS DBF engine has a size limitation for some operations?

I am not aware of any.

I got an error while trying the following:

    GRASS 6.2.2 (SoilEurope2):~ >d.vect -c sgdbe4_0_reclass display=attr
    attrcol=EAWC_TOP
    DBMI-DBF driver error:
    SQL parser error in statement:
    select EAWC_TOP from smu_ptrdb.dbf where cat = 330677
    Error in db_open_select_cursor()

    ERROR: Cannot select attributes:
           select EAWC_TOP from smu_ptrdb.dbf where cat = 330677

Note that cat=330677 is 2572th row of the .dbf table (if I change attribute column, I always get an error on the same cat).

What does v.db.select where=cat=330677 give you ?

Depending on the size of you dbf table you can also open it in OOCalc or OOBase to have a look at the row.

3 More complicated:
    - in your existing map create a new layer of cats with e.g. v.category map=sgdbe4_0 option=add layer=2
    - find the correspondance of the newly created cats with the cats in layer 1 with v.build sgbde4_0 option=cdump
    - use that information to create a new table which contains the layer 2 cat and the key column needed to link to the other dbf tables
    - then proceed as in 1)

This one is a bit out of my range: I need to do some homework. Of the three options, this was the hardest for me to understand.

Yes, layers are complicated, and tools to use them still a bit clunky. The advantage of layers is that you do not have to duplicate your map as you do with v.reclass, nor v.db.connect back and forth between different tables as you do when working in the database backend... although I myself normally use this latter solution.
One thing to watch out for: a table which is v.db.connect'ed to a map is dropped (deleted) when you delete the map !

We really could do with an enhancement to v.category to allow the creation of a second layer of categories based on a column value of the attributes linked to the first...

Moritz

Thanks again for a great GRASS-database seminar.

A few links you might want to read through:

http://grass.itc.it/grass63/manuals/html63_user/sql.html
http://grass.itc.it/grass63/manuals/html63_user/databaseintro.html
http://grass.itc.it/grass63/manuals/html63_user/vectorintro.html
http://grass.gdf-hannover.de/wiki/Vector_Database_Management
http://grass.gdf-hannover.de/wiki/Openoffice.org_with_SQL_Databases

Moritz

Luigi:

Does the GRASS DBF engine has a size limitation for some operations?
I got an error while trying the following:

    GRASS 6.2.2 (SoilEurope2):~ >d.vect -c sgdbe4_0_reclass
display=attr
    attrcol=EAWC_TOP
    DBMI-DBF driver error:
    SQL parser error in statement:
    select EAWC_TOP from smu_ptrdb.dbf where cat = 330677
    Error in db_open_select_cursor()

    ERROR: Cannot select attributes:
           select EAWC_TOP from smu_ptrdb.dbf where cat = 330677

Note that cat=330677 is 2572th row of the .dbf table (if I change
attribute column, I always get an error on the same cat).

select EAWC_TOP from smu_ptrdb where cat = 330677
not
select EAWC_TOP from smu_ptrdb.dbf where cat = 330677

Hamish

____________________________________________________________________________________
You rock. That's why Blockbuster's offering you one month of Blockbuster
Total Access, No Cost.
http://tc.deals.yahoo.com/tc/blockbuster/text5.com

      ____________________________________________________________________________________
Special deal for Yahoo! users & friends - No Cost. Get a month of Blockbuster Total Access now
http://tc.deals.yahoo.com/tc/blockbuster/text3.com

Hamish wrote:

Luigi:
  
Does the GRASS DBF engine has a size limitation for some operations?
I got an error while trying the following:

    GRASS 6.2.2 (SoilEurope2):~ >d.vect -c sgdbe4_0_reclass
display=attr
    attrcol=EAWC_TOP
    DBMI-DBF driver error:
    SQL parser error in statement:
    select EAWC_TOP from smu_ptrdb.dbf where cat = 330677
    Error in db_open_select_cursor()

    ERROR: Cannot select attributes:
           select EAWC_TOP from smu_ptrdb.dbf where cat = 330677

Note that cat=330677 is 2572th row of the .dbf table (if I change 
attribute column, I always get an error on the same cat).
    

select EAWC_TOP from smu_ptrdb where cat = 330677
 not
select EAWC_TOP from smu_ptrdb.dbf where cat = 330677
  

Yes, my fault: I had tried to v.db.connect using a table name with .dbf extension (i.e., smu_ptrdb.dbf). When using smu_ptrdb instead, all went well. Apparently the DBF driver assumes the table has a .dbf extension.
Thanks and regards,

Luigi