[GRASSLIST:3641] v.extract (5.7) doesn't work on extra fields?

Here's what I've done so far (ultimately I want to test dissolving TIGER polygons by ZIP code):

- use native GRASS 5.7 geometry, and store attribs in MySQL (so I can do some table cleanup).
- import TIGER CompleteChain, PIP, Polygon layers, with type boundary and centroid.
- now I have 3 fields - the 3rd is extra polygon attributes, where the ZIP codes are
- use SQL copy to create a new table from field 3 with cat and ZCTA5 columns only
- register new table with GRASS vectors as field 4
- extract with dissolve:

   v.extract -d in=tgr out=tgrzip type=area,centroid field=4 list=1-99999

and I get 0 everything output.

I've tried type=boundary,centroid and boundary,area,centroid.
I've tried where="cat > 0" instead of a list.
I've tried without dissolving, just to see if I could get anything.

If I leave out the field I get a copy of all areas/centroids and field 1 (the boundary attribs), which is not quite as expected (for areas I'd expect field 2, the area/centroid attribs), but it's something. And nothing is dissolved of course since all attribs are unique.

So, ignoring the dissolve option for now (just testing), nothing is getting selected when a field other than 1 is used.

Bug? Am I missing something? The imported TIGER vector seems to be OK - v.info reports boundaries, areas and centroids and vector level 2.

-----
William Kyngesburye <kyngchaos@charter.net>
http://webpages.charter.net/kyngchaos/

"Time is an illusion - lunchtime doubly so."

- Ford Prefect

1) It is not guaranteed that the cat-POLYID pairs in tgr_2 and tgr_3 are
   identical. It seems to be true in this case, but you should not
   rely on that.

2) There are no elements with a category in field 4, so linking a table
   to field 4 has no effect. You could overwrite the link of
   field 3, but that is not safe, as I said above.

3) > v.extract -d in=tgr out=tgrzip type=area,centroid field=4 list=1-99999
   This is wrong as it groups areas by category, not by ZIP.

4) There was a bug in v.extract -d, I fixed that in cvs.

5) ZCTA5 is varchar and v.reclass does not work with a varchar column.

Here is an example of how to do that (Postgres; MySQL will be similar).
I know that it is not a simple solution for a simple task;
support for varchar columns and a dissolve option in v.reclass
should make it easier:

v.in.ogr -o dsn=TGR37013/ layer=CompleteChain,PIP,Polygon output=tgr1 \
            type=boundary,centroid snap=-1

echo "create table tgr1_4 as select distinct ZCTA5 from tgr1_3" | db.execute
echo "create table tgr1_5 as select tgr1_2.cat, tgr1_3.ZCTA5, tgr1_4.oid
      as zcta5id from tgr1_2, tgr1_3, tgr1_4 where tgr1_2.POLYID = tgr1_3.POLYID
      and tgr1_3.ZCTA5 = tgr1_4.ZCTA5" | db.execute

v.db.connect -o map=tgr1 table=tgr1_5 key=cat field=2

v.reclass input=tgr1 output=tgr2 field=2 col=zcta5id type=area

v.extract -d in=tgr2 out=tgr3 type=area list=0-10000000 new=-1
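
To make the two SQL statements above easier to follow, here is a minimal sketch of the same join run against an in-memory SQLite database from Python. It is an illustration only, not part of the GRASS workflow: the sample rows are invented, the function names are hypothetical, and SQLite's implicit rowid is used as a stand-in for the Postgres oid column.

```python
import sqlite3


def build_zip_lookup(conn):
    """Create tgr1_4 (one row per distinct ZIP) and tgr1_5 (cat -> integer ZIP id)."""
    cur = conn.cursor()
    # One row per distinct ZCTA5 value; each row gets an implicit integer rowid.
    cur.execute("create table tgr1_4 as select distinct ZCTA5 from tgr1_3")
    # Join centroid cats to ZIPs through POLYID, then to the integer id of each ZIP.
    # ASSUMPTION: rowid replaces the Postgres oid used in the thread.
    cur.execute(
        "create table tgr1_5 as "
        "select tgr1_2.cat as cat, tgr1_3.ZCTA5 as ZCTA5, tgr1_4.rowid as zcta5id "
        "from tgr1_2, tgr1_3, tgr1_4 "
        "where tgr1_2.POLYID = tgr1_3.POLYID and tgr1_3.ZCTA5 = tgr1_4.ZCTA5"
    )
    conn.commit()


def demo():
    """Run the join on a few made-up rows and return (cat, ZCTA5, zcta5id) tuples."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("create table tgr1_2 (cat integer, POLYID integer)")
    cur.execute("create table tgr1_3 (cat integer, POLYID integer, ZCTA5 varchar(5))")
    cur.executemany("insert into tgr1_2 values (?, ?)", [(1, 101), (2, 102), (3, 103)])
    # The cats in tgr1_3 deliberately do not line up with tgr1_2; only POLYID does.
    cur.executemany(
        "insert into tgr1_3 values (?, ?, ?)",
        [(1, 103, "27909"), (2, 101, "27909"), (3, 102, "27921")],
    )
    build_zip_lookup(conn)
    return cur.execute("select cat, ZCTA5, zcta5id from tgr1_5 order by cat").fetchall()
```

Calling demo() should give cats 1 and 3 the same zcta5id, because both map to ZIP 27909, and cat 2 a different one. That shared integer is what v.reclass can then use as the new category.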

Radim

On Wednesday 09 June 2004 20:08, William K wrote:


On Jun 15, 2004, at 9:48 AM, Radim Blazek wrote:

> 1) It is not guaranteed that the cat-POLYID pairs in tgr_2 and tgr_3 are
>    identical. It seems to be true in this case, but you should not
>    rely on that.

I guess this is a result of there being no guarantee that TIGER records are sorted by CENPOLYID. There is a 1-1 relationship between fields 2 & 3 by CENPOLYID. Maybe this should be taken care of in the conversion, by sorting TIGER records by an appropriate unique ID (TLID or CENPOLYID)?

Maybe I'll do a join by CENPOLYID after conversion, just to get the Polygon attribs into the centroid field.
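
As a rough illustration of why the join matters, the sketch below contrasts pairing two attribute tables by row position with pairing them by a shared key. The rows are invented and the function names are hypothetical. The key column is called POLYID here because that is the name used in the SQL elsewhere in this thread; substitute whatever ID column your tables actually share.

```python
def pair_by_position(centroids, polygons):
    """Pair rows in file order; only correct if both tables are sorted identically."""
    return {c["cat"]: p["ZCTA5"] for c, p in zip(centroids, polygons)}


def pair_by_key(centroids, polygons, key="POLYID"):
    """Pair rows through a shared ID column, independent of row order."""
    by_key = {p[key]: p for p in polygons}
    return {c["cat"]: by_key[c[key]]["ZCTA5"] for c in centroids}


# Made-up sample: the polygon records arrive in a different order than the centroids.
CENTROIDS = [{"cat": 1, "POLYID": 101}, {"cat": 2, "POLYID": 102}]
POLYGONS = [
    {"cat": 1, "POLYID": 102, "ZCTA5": "27921"},
    {"cat": 2, "POLYID": 101, "ZCTA5": "27909"},
]
```

With this sample, pair_by_key(CENTROIDS, POLYGONS) assigns 27909 to cat 1, while pair_by_position(CENTROIDS, POLYGONS) silently assigns 27921 to it.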

> 2) There are no elements with a category in field 4, so linking a table
>    to field 4 has no effect. You could overwrite the link of
>    field 3, but that is not safe, as I said above.

So there are specific associations of field numbers and GRASS geometry? or something like that? like 1=boundaries, 2=centroids/areas? 3 is extra? or what?

I had problems understanding the old vector cat/label thing, I guess I'm still not quite getting the new system. Probably my ArcInfo background getting in the way.

> 3) > v.extract -d in=tgr out=tgrzip type=area,centroid field=4 list=1-99999
>    This is wrong as it groups areas by category, not by ZIP.

so v.extract -d works purely on categories? I guess I was thinking that cats were like a table relate to the fields, and it would use the field columns (minus the cat).

> 4) There was a bug in v.extract -d, I fixed that in cvs.

I'm using CVS snapshots (2004-6-5 at that time). I had heard about the bug fix and wanted to try it out.

> 5) ZCTA5 is varchar and v.reclass does not work with a varchar column.
>
> Here is an example of how to do that (Postgres; MySQL will be similar).
> I know that it is not a simple solution for a simple task;
> support for varchar columns and a dissolve option in v.reclass
> should make it easier:
>
> v.in.ogr -o dsn=TGR37013/ layer=CompleteChain,PIP,Polygon output=tgr1 \
>             type=boundary,centroid snap=-1
>
> echo "create table tgr1_4 as select distinct ZCTA5 from tgr1_3" | db.execute

so far, what I've done. oh, wait, distinct.

> echo "create table tgr1_5 as select tgr1_2.cat, tgr1_3.ZCTA5, tgr1_4.oid
>       as zcta5id from tgr1_2, tgr1_3, tgr1_4 where tgr1_2.POLYID = tgr1_3.POLYID
>       and tgr1_3.ZCTA5 = tgr1_4.ZCTA5" | db.execute

now I need to pause to wrap my head around this - kinda slow on SQL selects still....

....so: _2.cat -> (POLYID) -> _3.ZCTA5 -> (ZCTA5) -> _4.oid (giving each unique ZIP an integer)

> v.db.connect -o map=tgr1 table=tgr1_5 key=cat field=2
>
> v.reclass input=tgr1 output=tgr2 field=2 col=zcta5id type=area
>
> v.extract -d in=tgr2 out=tgr3 type=area list=0-10000000 new=-1

weeeeeeee! at least I can script all this. so, the cat of tgr3 (dissolved) relates back to the record number of tgr1_5, right? I guess I could rejoin the tables from there to create something exportable with ZCTA5 column...

Thanks, I'll give this a try sometime (kinda busy right now ^_^).


-----
William Kyngesburye <kyngchaos@charter.net>
http://webpages.charter.net/kyngchaos/

"Mon Dieu! but they are all alike. Cheating, murdering, lying, fighting, and all for things that the beasts of the jungle would not deign to possess - money to purchase the effeminate pleasures of weaklings. And yet withal bound down by silly customs that make them slaves to their unhappy lot while firm in the belief that they be the lords of creation enjoying the only real pleasures of existence....

- the wisdom of Tarzan

On Tuesday 15 June 2004 18:23, William K wrote:

> On Jun 15, 2004, at 9:48 AM, Radim Blazek wrote:
> > 1) It is not guaranteed that the cat-POLYID pairs in tgr_2 and tgr_3 are
> > identical. It seems to be true in this case, but you should not
> > rely on that.
>
> I guess this is a result of there being no guarantee that TIGER records
> are sorted by CENPOLYID.

Yes.

> There is a 1-1 relationship between fields 2
> & 3 by CENPOLYID. Maybe this should be taken care of in the
> conversion?

Not in v.in.ogr; it knows nothing about the structure of the data it imports.

> - sorting TIGER records by an appropriate unique ID (TLID
> or CENPOLYID).
>
> Maybe I'll do a join by CENPOLYID after conversion, just to get the
> Polygon attribs into the centroid field.
>
> > 2) There are no elements with a category in field 4, so linking a table
> > to field 4 has no effect. You could overwrite the link of
> > field 3, but that is not safe, as I said above.
>
> So there are specific associations of field numbers and GRASS geometry?
>   or something like that? like 1=boundaries, 2=centroids/areas? 3 is
> extra? or what?

Any geometrical element in GRASS may have zero, one, or more categories
attached. Usually, a category is used as a link to a database table.
Categories are stored with elements in pairs with a field. The field number
usually identifies the table, e.g.

element  field  category
0        1      123       -> link to table 1
1        2      456       -> link to table 2

A good way to get an idea of how fields work is to download the data from
http://mpa.itc.it/radim/g51/data.html
display 'multi', query it with d.what.vect, and always read the
field and table. Or you can query your tgr.

You can also think of a field as a layer identifier.
Here it is obvious: you imported 3 layers
(CompleteChain, PIP, Polygon)
and one field was assigned to each layer.
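
The field/category idea can be sketched in a few lines of Python. This is only a toy model of the description above, not the actual GRASS data structure: the element list, the table names in LINKS, and both function names are invented for illustration.

```python
from collections import Counter

# Toy model: each geometric element carries zero or more (field, category) pairs.
ELEMENTS = [
    {"type": "boundary", "cats": [(1, 123)]},
    {"type": "centroid", "cats": [(2, 456)]},
    {"type": "boundary", "cats": []},  # an element may have no category at all
]

# Hypothetical field -> table links, as set up with v.db.connect.
LINKS = {1: "tgr_1", 2: "tgr_2", 3: "tgr_3", 4: "tgr_4"}


def tables_for(element, links):
    """Return (table, category) pairs that an element's categories point to."""
    return [(links[field], cat) for field, cat in element["cats"] if field in links]


def count_by_field(elements):
    """Count how many categories exist per field across all elements."""
    return Counter(field for e in elements for field, _cat in e["cats"])
```

In this toy data, count_by_field(ELEMENTS) has no entries for field 3 or field 4, even though LINKS names a table for each. A table linked to a field that no element uses can never be reached from the geometry, which is why selecting on such a field returns nothing.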

> I had problems understanding the old vector cat/label thing, I guess
> I'm still not quite getting the new system. Probably my ArcInfo
> background getting in the way.
>
> > 3) > v.extract -d in=tgr out=tgrzip type=area,centroid field=4 list=1-99999
> > This is wrong as it groups areas by category, not by ZIP.
>
> so v.extract -d works purely on categories? I guess I was thinking
> that cats were like a table relate to the fields, and it would use the
> field columns (minus the cat).

'list' and 'file' work on categories; 'where' works on attributes.
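
The difference between the two selection modes can be sketched as follows. This is a simplified model with invented data and hypothetical function names, not the v.extract implementation: one function filters on the category values themselves, the other looks each category up in the linked table and filters on a column.

```python
def select_by_list(areas, cats):
    """'list'/'file' style: keep areas whose category is in the given set."""
    return [a["id"] for a in areas if a["cat"] in cats]


def select_by_where(areas, table, predicate):
    """'where' style: keep areas whose linked attribute row satisfies the predicate."""
    return [a["id"] for a in areas if a["cat"] in table and predicate(table[a["cat"]])]


# Made-up areas and attribute table keyed by cat.
AREAS = [{"id": "A", "cat": 1}, {"id": "B", "cat": 2}, {"id": "C", "cat": 3}]
TABLE = {1: {"ZCTA5": "27909"}, 2: {"ZCTA5": ""}, 3: {"ZCTA5": "27921"}}
```

Here select_by_list(AREAS, {1, 2}) never looks at TABLE, while select_by_where(AREAS, TABLE, lambda row: row["ZCTA5"] != "") never looks at the category values except to find the row.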

> > 4) There was a bug in v.extract -d, I fixed that in cvs.
>
> I'm using CVS snapshots (2004-6-5 at that time). I had heard about the
> bug fix and wanted to try it out.

It was fixed today.

> > 5) ZCTA5 is varchar and v.reclass does not work with a varchar column.
> >
> > Here is an example of how to do that (Postgres; MySQL will be similar).
> > I know that it is not a simple solution for a simple task;
> > support for varchar columns and a dissolve option in v.reclass
> > should make it easier:
> >
> > v.in.ogr -o dsn=TGR37013/ layer=CompleteChain,PIP,Polygon output=tgr1 \
> >             type=boundary,centroid snap=-1
> >
> > echo "create table tgr1_4 as select distinct ZCTA5 from tgr1_3" | db.execute
>
> so far, what I've done. oh, wait, distinct.
>
> > echo "create table tgr1_5 as select tgr1_2.cat, tgr1_3.ZCTA5, tgr1_4.oid
> >       as zcta5id from tgr1_2, tgr1_3, tgr1_4 where tgr1_2.POLYID = tgr1_3.POLYID
> >       and tgr1_3.ZCTA5 = tgr1_4.ZCTA5" | db.execute
>
> now I need to pause to wrap my head around this - kinda slow on SQL
> selects still....
>
> ....so: _2.cat -> (POLYID) -> _3.ZCTA5 -> (ZCTA5) -> _4.oid (giving
> each unique ZIP an integer)
>
> > v.db.connect -o map=tgr1 table=tgr1_5 key=cat field=2
> >
> > v.reclass input=tgr1 output=tgr2 field=2 col=zcta5id type=area
> >
> > v.extract -d in=tgr2 out=tgr3 type=area list=0-10000000 new=-1
>
> weeeeeeee! at least I can script all this. so, the cat of tgr3
> (dissolved) relates back to the record number of tgr1_5, right?

Yes, to tgr1_5.zcta5id

> I guess I could rejoin the tables from there to create something
> exportable with ZCTA5 column...
>
> Thanks, I'll give this a try sometime (kinda busy right now ^_^).

Radim

> > > 3) > v.extract -d in=tgr out=tgrzip type=area,centroid field=4 list=1-99999
> > >    This is wrong as it groups areas by category, not by ZIP.
> >
> > so v.extract -d works purely on categories? I guess I was thinking
> > that cats were like a table relate to the fields, and it would use the
> > field columns (minus the cat).
>
> 'list' and 'file' work on categories; 'where' works on attributes.

so if I used: (assuming the tables are sorted correctly so that _2.cat = _3.cat)

field=3 where "ZCTA != ''"

would it also dissolve by ZCTA? or just select? I tried where once with no luck, but I was selecting on the cat column (where "cat > 0"), so of course there was no difference. Or maybe not because ZCTA5 is varchar? oh, that's probably why you did the whole reclass thing in the first place - dissolve only works on the cat.

oh, but that would be nice for dissolve...

> > > 4) There was a bug in v.extract -d, I fixed that in cvs.
> >
> > I'm using CVS snapshots (2004-6-5 at that time). I had heard about the
> > bug fix and wanted to try it out.
>
> It was fixed today.

huh, I thought I saw it mentioned a couple weeks ago. maybe you were just talking about fixing it and I jumped the gun.

well, I have stuff to keep me busy, so I'll just wait for the next snapshot.

-----
William Kyngesburye <williamk@mappingspecialists.com>
Mapping Specialists <http://www.mappingspecialists.com>

Don't Panic

On Tuesday 15 June 2004 21:05, William K wrote:

> so if I used: (assuming the tables are sorted correctly so that _2.cat
> = _3.cat)
>
> field=3 where "ZCTA != ''"
>
> would it also dissolve by ZCTA? or just select? I tried where once
> with no luck, but I was selecting on the cat column (where "cat > 0"),
> so of course there was no difference. Or maybe not because ZCTA5 is
> varchar? oh, that's probably why you did the whole reclass thing in
> the first place - dissolve only works on the cat.
>
> oh, but that would be nice for dissolve...

It will not work, for 2 reasons:
1) There are no geometrical elements with field 3, because
   the layer 'Polygon' is a layer without geometry. You can check this
   by running "v.category map=tgr option=report".
   So first you have to relink field 2 to the table tgr_3:
   v.db.connect -o map=tgr field=2 table=tgr_3 key=cat
2) Dissolving is based on the output categories. If 2 adjacent areas have
   the same output category, the boundary between them is removed. It does
   not make sense to dissolve a boundary between areas with different
   output categories.
   v.extract field=2 where="ZCTA5 != ''"
   extracts all areas, and
   - with 'new=-1' the original category is kept -> nothing is dissolved, as cats are unique;
   - with 'new=1' all areas get cat 1 and the result is one area.
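
The dissolve rule in point 2 can be sketched in Python as below. This is a toy model under stated assumptions, not the v.extract code: areas are plain IDs, adjacency is given as pairs of areas that share a boundary, the function name is hypothetical, and the 'new' argument only mimics the two cases discussed (-1 keeps the original category, any other value is assigned to every area).

```python
def dissolve(area_cats, adjacency, new=-1):
    """Merge adjacent areas that end up with the same output category.

    area_cats: dict mapping area id -> original category
    adjacency: iterable of (a, b) pairs of areas sharing a boundary
    new: -1 keeps each area's original category; otherwise every area gets `new`
    Returns a list of sets of area ids, one set per resulting area.
    """
    out_cat = {a: (c if new == -1 else new) for a, c in area_cats.items()}
    parent = {a: a for a in area_cats}

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in adjacency:
        # A shared boundary is removed only if both sides have the same output cat.
        if out_cat[a] == out_cat[b]:
            parent[find(a)] = find(b)

    groups = {}
    for a in area_cats:
        groups.setdefault(find(a), set()).add(a)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Made-up example: three areas in a row, A-B-C.
ADJ = [("A", "B"), ("B", "C")]
UNIQUE_CATS = {"A": 1, "B": 2, "C": 3}      # as imported: every cat is unique
RECLASSED_CATS = {"A": 7, "B": 7, "C": 8}   # after reclassing by an integer ZIP id
```

With unique categories and new=-1 nothing merges, with new=1 everything merges into one area, and only the reclassed categories give a grouping by ZIP. That is the reason for the v.reclass step earlier in the thread.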

Radim