[GRASS-user] Join sqlite table by coordinate pair (X,Y)

Hi,

I have a points vector file with the columns X and Y which are
populated with the respective coordinates.
Additionally I have a .csv file for the same points, also with the X
and Y column and a lot more columns
providing additional information. These two files don't share a single
unique key which can be used for joining.

Thus I'd like to use the X-Y pair as a key for importing and then
joining the csv table
to the vector. Is there any possibility to use two columns as a
combined key instead of one
(probably an SQL statement for db.execute)? Or do I need to first
create a new column
and combine x and y into a key for the vector and the csv?

Any suggestions how that can be done in GRASS (6.5, sqlite) in a simple way?

Best regards,
Johannes

Hi,

I have a points vector file with the columns X and Y which are
populated with the respective coordinates.
Additionally I have a .csv file for the same points, also with the X
and Y column and a lot more columns
providing additional information. These two files don't share a single
unique key which can be used for joining.

Thus I'd like to use the X-Y pair as a key for importing and then
joining the csv table
to the vector. Is there any possibility to use two columns as a
combined key instead of one
(probably an SQL statement for db.execute)? Or do I need to first
create a new column
and combine x and y into a key for the vector and the csv?

If the X-Y values are exactly the same then you probably could do it directly in sqlite:
First import the csv table into your sqlite.db

.import file.csv csv_table
.schema csv_table

Now create columns in the vector table for each new column from the csv table

ALTER TABLE vector ADD COLUMN from_csv_col1 text;
ALTER TABLE vector ADD COLUMN from_csv_col2 integer;
…etc…

Now update the values in each new column with the matching values from the csv table:

UPDATE vector SET from_csv_col1=(SELECT c.col1 FROM csv_table AS c
… WHERE c.x_coord=vector.x_coord AND c.y_coord=vector.y_coord);

and so on for all the new columns.
HTH

On 22/10/12 15:40, Micha Silver wrote:

  On 10/22/2012 01:20 PM, Johannes Radinger wrote:

Hi,

I have a points vector file with the columns X and Y which are
populated with the respective coordinates.
Additionally I have a .csv file for the same points, also with the X
and Y column and a lot more columns
providing additional information. These two files don't share a single
unique key which can be used for joining.

Thus I'd like to use the X-Y pair as a key for importing and then
joining the csv table
to the vector. Is there any possibility to use two columns as a
combined key instead of one
(probably an SQL statement for db.execute)? Or do I need to first
create a new column
and combine x and y into a key for the vector and the csv?

If the X-Y values are *exactly* the same then you probably could do it
directly in sqlite:
First import the csv table into your sqlite.db
sqlite>.import file.csv csv_table
sqlite>.schema csv_table

Now create columns in the vector table for each new column from the csv
table
sqlite> ALTER TABLE vector ADD COLUMN from_csv_col1 text;
sqlite> ALTER TABLE vector ADD COLUMN from_csv_col2 integer;
...etc...

Now update the values in each new column with the matching values from
the csv table:
sqlite> UPDATE vector SET from_csv_col1=(SELECT c.col1 FROM csv_table AS c
... WHERE c.x_coord=vector.x_coord AND c.y_coord=vector.y_coord);

and so on for all the new columns.

Or (if you're lazy like me) just create a new table resulting from the join of the two:

CREATE TABLE joined-table AS SELECT t1.*, t2.* FROM existing_table t1 JOIN csv_table t2 ON (t1.x=t2.x AND t1.y=t2.y)

and then link the new table to your vector map.

If that doesn't work (i.e. as Micha hints x and y have to be exactly identical and I'm not even sure all db backends accept identities of double precision values - example from the PostgreSQL docs: "Comparing two floating-point values for equality might not always work as expected.") you can also import your csv file as a vector map with v.in.ascii and then use v.distance to copy the cat values from one to the other so as to have a common key.

Moritz

HTH

Any suggestions how that can be done in GRASS (6.5, sqlite) in a simple way?

Best regards,
Johannes
_______________________________________________
grass-user mailing list
grass-user@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-user

This mail was received via Mail-SeCure System.

_______________________________________________
grass-user mailing list
grass-user@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-user

Hi,

thank you for your enspiring answers. I think I'll stick to the direct
sqlite approach
instead of the v.distance.
There are now two tables in my sqlite database, one is connected to
the GRASS vector
and the other is a table imported from a csv. For both tables I want
to join/merge I created a new column .
This new column contains a text-key (like "123.1232-3424.234"). These
text strings are identical in both
tables that's why I could use them as key.

As both tables have similarly approx. 130 000 rows I am not sure about
the fastest way
to join/merge/append 3 columns from on table to the GRASS vector table
using sqlite?

I tried your approach which took very long time (I stopped the process
after 45 min), thus
I interesting if there is a faster approach than:

sqlite> UPDATE vector SET from_csv_col1=(SELECT c.col1 FROM csv_table AS c WHERE c.xy_key=vector.xy_key);

Is there any way to speed up such a join? As I am newbie to SQL/SQlite, any help
is much appreciated! Maybe this is more a question for a pure
Sqlite/SQL forum...

/johannes

The first thing that comes to mind is to create an index on both of the text-key columns CREATE INDEX idx_table1_key ON table1(key); CREATE INDEX idx_table2_key ON table2(key); What happened to equating the two X and Y coordinates? Also did you try Moritz’s suggestion to create a new table as a JOIN between the two originals?

···

On 25/10/2012 13:00, Johannes Radinger wrote:

Hi,

thank you for your enspiring answers. I think I'll stick to the direct
sqlite approach
instead of the v.distance.
There are now two tables in my sqlite database, one is connected to
the GRASS vector
 and the other is a table imported from a csv. For both tables I want
to join/merge I created a new column .
This new column contains a text-key (like "123.1232-3424.234"). These
text strings are identical in both
tables that's why I could use them as key.

As both tables have similarly approx. 130 000 rows I am not sure about
the fastest way
to join/merge/append 3 columns from on table to the GRASS vector table
using sqlite?

I tried your approach which took very long time (I stopped the process
after 45 min), thus
I interesting if there is a faster approach than:

sqlite> UPDATE vector SET from_csv_col1=(SELECT c.col1 FROM csv_table AS c WHERE c.xy_key=vector.xy_key);

Is there any way to speed up such a join? As I am newbie to SQL/SQlite, any help
is much appreciated! Maybe this is more a question for a pure
Sqlite/SQL forum...

/johannes

This mail was received via Mail-SeCure System.

-- 
Micha Silver
052-3665918

Dear all,
Dear Micha,

The first thing that comes to mind is to create an index on both of the
text-key columns
CREATE INDEX idx_table1_key ON table1(key);
CREATE INDEX idx_table2_key ON table2(key);

Yours suggestion of creating an index substantially improved the speed
of the Sqlite process (From 45 min to <1 min).

Thank you very much.

Best regards,
Johannes