[GRASS-user] subsetting vector db based on selected rows

Hi GRASS list,

I'm trying to work with a database and have what I assume is a fairly simple question. I usually work with rasters and am not sure how to update or subset a vector. Specifically, I have the following vector product:

v.info BTMP # output below is trimmed

+----------------------------------------------------------------------------+
| Name: BTMP |
| Timestamp (first layer): none |
|----------------------------------------------------------------------------|
| Map format: native |
|----------------------------------------------------------------------------|
| Type of map: vector (level: 2) |
| |
| Number of points: 0 Number of centroids: 3378 |
| Number of lines: 0 Number of boundaries: 9859 |
| Number of areas: 3946 Number of islands: 433 |
| |
| Map is 3D: No |
| Number of dblinks: 1 |
+----------------------------------------------------------------------------+

With contents like:

db.select table=BTMP | head
cat|area|termini_id
1|5000479.797168|2
2|27502638.884425|1
3|29502830.803293|2
4|14001343.432071|3
5|3000287.878301|4
6|45004318.174513|4
7|4500431.817451|5
8|1500143.93915|5
9|7000671.716036|6

And I'd like to subset this. I'd like the largest area for each group of "termini_id". I can select the subset I want with the following:

db.select sql="select cat,max(area),termini_id from BTMP GROUP BY termini_id" | head

cat|max(area)|termini_id
2|27502638.884425|1
3|29502830.803293|2
4|14001343.432071|3
6|45004318.174513|4
7|4500431.817451|5
9|7000671.716036|6
11|10751031.563912|7
12|4750455.80731|8
15|80007676.75469|9

But that just prints it to the terminal (our I can output to a file). I'd like to either have a new column with 0's for the non-selected and 1s for the selected rows. I can then display the selected rows using d.vect and the WHERE keyword. Or alternatively a new vector which is a subset of this one, based on the selected rows. How do I do this? I tried to generate the new column:

v.db.addcolumn map=BTMP column="max_basin INT"

db.execute sql="update BTMP set max_basin=1 where max(area) GROUP BY termini_id"

But get errors I'm not sure how to interpret:

DBMI-SQLite driver error:
Error in sqlite3_prepare():
near "GROUP": syntax error

DBMI-SQLite driver error:
Error in sqlite3_prepare():
near "GROUP": syntax error

Thanks for any hints to this specific problem or a general tutorial on working with vectors and DBs.

Thanks,

  -k.

The way to do this is with a subquery in the sql statement: Now you need to grab the cat values from the result of the first query, and use them in v.extract :

···

On 02/28/2017 10:25 PM, Ken Mankoff wrote:

Hi GRASS list,

I'm trying to work with a database and have what I assume is a fairly simple question. I usually work with rasters and am not sure how to update or subset a vector. Specifically, I have the following vector product:

v.info BTMP  # output below is trimmed

 +----------------------------------------------------------------------------+
 | Name:            BTMP                                                      |
 | Timestamp (first layer): none                                              |
 |----------------------------------------------------------------------------|
 | Map format:      native                                                    |
 |----------------------------------------------------------------------------|
 |   Type of map: vector (level: 2)                                           |
 |                                                                            |
 |   Number of points:       0               Number of centroids:  3378       |
 |   Number of lines:        0               Number of boundaries: 9859       |
 |   Number of areas:        3946            Number of islands:    433        |
 |                                                                            |
 |   Map is 3D:              No                                               |
 |   Number of dblinks:      1                                                |
 +----------------------------------------------------------------------------+

With contents like:

db.select table=BTMP | head
cat|area|termini_id
1|5000479.797168|2
2|27502638.884425|1
3|29502830.803293|2
4|14001343.432071|3
5|3000287.878301|4
6|45004318.174513|4
7|4500431.817451|5
8|1500143.93915|5
9|7000671.716036|6

And I'd like to subset this. I'd like the largest area for each group of "termini_id". I can select the subset I want with the following:

db.select sql="select cat,max(area),termini_id from BTMP GROUP BY termini_id" | head
db.select sql="select b.cat,b.area,b.termini_id from BTMP AS b WHERE b.area=(SELECT max(area) FROM BTMP WHERE termini_id=b.termini_id) GROUP BY b.termini_id"
cat|max(area)|termini_id
2|27502638.884425|1
3|29502830.803293|2
4|14001343.432071|3
6|45004318.174513|4
7|4500431.817451|5
9|7000671.716036|6
11|10751031.563912|7
12|4750455.80731|8
15|80007676.75469|9

But that just prints it to the terminal (our I can output to a file). I'd like to either have a new column with 0's for the non-selected and 1s for the selected rows. I can then display the selected rows using d.vect and the WHERE keyword. Or alternatively a new vector which is a subset of this one, based on the selected rows. How do I do this? I tried to generate the new column:

v.extract input=BTMP output=BTMP_max cats=2,3,4,6,7,9,11,12,15

v.db.addcolumn map=BTMP column="max_basin INT"

db.execute sql="update BTMP set max_basin=1 where max(area) GROUP BY termini_id" 

But get errors I'm not sure how to interpret:

DBMI-SQLite driver error:
Error in sqlite3_prepare():
near "GROUP": syntax error

DBMI-SQLite driver error:
Error in sqlite3_prepare():
near "GROUP": syntax error

Thanks for any hints to this specific problem or a general tutorial on working with vectors and DBs.

Thanks,

  -k.
_______________________________________________
grass-user mailing list
[grass-user@lists.osgeo.org](mailto:grass-user@lists.osgeo.org)
[https://lists.osgeo.org/mailman/listinfo/grass-user](https://lists.osgeo.org/mailman/listinfo/grass-user)
-- 
Micha Silver
cell: 052-3665918

Dear Micha,

Thank you for the helpful reply. This solves the issue. FYI because there are 1000s of rows, I have combined the two commands like this:

v.extract input=BTMP output=BTMP_max cats=$(db.select -c sql="select b.cat from BTMP AS b WHERE b.area=(SELECT max(area) FROM BTMP WHERE termini_id=b.termini_id) GROUP BY b.termini_id" | tr '\n' ,) --o

  -k.

Well done!

···

On 03/01/2017 02:08 AM, Ken Mankoff wrote:

Dear Micha,

Thank you for the helpful reply. This solves the issue. FYI because there are 1000s of rows, I have combined the two commands like this:

v.extract input=BTMP output=BTMP_max cats=$(db.select -c sql="select b.cat from BTMP AS b WHERE b.area=(SELECT max(area) FROM BTMP WHERE termini_id=b.termini_id) GROUP BY b.termini_id" | tr '\n' ,) --o

  -k.

-- 
Micha Silver
cell: 052-3665918