[GRASS-user] Vector map attributes

Hello

I have a vector map with points representing latrines and each point
has an attribute describing the place name. I ran v.edit to snap
points with a threshold of 5m and v.info reports 7296 features for the
cleaned map. I now wish to count the number of latrines in each place,
so I ran:

echo "SELECT *,count(Area) FROM MT_San_clean GROUP BY Area" | db.select

and get a total of 10537, so the results are bogus.

Running the following command I can see the reason for the bogus results:

v.category in=MT_San_clean opt=print

1331/566/571/691/1332/456/416/410/405/561/390
1294/1249/1265/1229/1225/1273/1233

v.edit has assigned multiple categories to points when they were snapped.

To get a meaningful count of points by Area I can think of two solutions:
1) Use awk to filter out the first category for each feature from the
output of v.category opt=print and feed this list of cats into
db.select,
however I'm not sure of the sql syntax for WHERE "cat=MyListOfCats".
2) Use awk to filter out all, but the first category for each feature
from the output of v.category opt=print and feed this list of cats
into v.category opt=del,
however I'm not sure of the relationship between feature ids and cats,
and v.category wants a list of ids.

Can someone please suggest an easier way or give me a pointer to
implement either of my ideas? As a teaser I can offer to update [1]
with the details :-)

Many thanks,

Craig

[1] http://grass.osgeo.org/wiki/Count_points_in_polygon

On 25/03/09 11:05, Craig Leat wrote:

Hello

I have a vector map with points representing latrines and each point
has an attribute describing the place name. I ran v.edit to snap
points with a threshold of 5m

Because you think these points are actually duplicates, or because you want the information about all latrines within a certain radius to be grouped together?

and v.info reports 7296 features for the
cleaned map. I now wish to count the number of latrines in each place,
so I ran:

echo "SELECT *,count(Area) FROM MT_San_clean GROUP BY Area" | db.select

and get a total of 10537, so the results are bogus.

Running the following command I can see the reason for the bogus results:

v.category in=MT_San_clean opt=print

1331/566/571/691/1332/456/416/410/405/561/390
1294/1249/1265/1229/1225/1273/1233

v.edit has assigned multiple categories to points when they were snapped.

To get a meaningful count of points by Area I can think of two solutions:
1) Use awk to filter out the first category for each feature from the
output of v.category opt=print and feed this list of cats into
db.select,
however I'm not sure of the sql syntax for WHERE "cat=MyListOfCats".

Create a table MyCats containing your cats and then use WHERE "cat in (select * from MyCats)"
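
As a rough illustration of the suggestion above, the following sketch turns a list of category numbers (one per line) into SQL that creates and fills a MyCats table. The function name cats_to_sql and the single integer column named cat are assumptions made for this example; they do not come from the thread. The sketch only prints SQL text. Feeding that text to db.execute is likewise an assumption and is left as a comment.

```shell
# Hypothetical helper: read one category number per line on stdin and
# print SQL statements that create and populate a table called MyCats.
cats_to_sql() {
    # Assumed layout: a single integer column named "cat".
    echo "CREATE TABLE MyCats (cat integer);"
    # Skip anything that is not a plain non-negative integer.
    awk '/^[0-9]+$/ { printf "INSERT INTO MyCats VALUES (%d);\n", $1 }'
}

# Demo with the first category of each sample feature quoted in this thread.
printf '%s\n' 1331 1294 | cats_to_sql

# Assumption: inside a GRASS session the generated SQL could be piped on,
# for example:  cats_to_sql < cats.txt | db.execute
```

Generating the statements as text first makes it easy to inspect them before anything is written to the database.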

2) Use awk to filter out all, but the first category for each feature
from the output of v.category opt=print and feed this list of cats
into v.category opt=del,
however I'm not sure of the relationship between feature ids and cats,
and v.category wants a list of ids.

Use v.edit tool=select cats= to find out the ids.

Can someone please suggest an easier way or give me a pointer to
implement either of my ideas?

How about v.edit tool=catdel cats= ?

Or v.clean tool=rmdupl (although the man page says "pay attention to categories!" - don't know what that means).

Moritz

Moritz Lennert wrote:

On 25/03/09 11:05, Craig Leat wrote:

v.info reports 7296 features for the
cleaned map. I now wish to count the number of latrines in each place,
so I ran:

echo "SELECT *,count(Area) FROM MT_San_clean GROUP BY Area" | db.select

and get a total of 10537, so the results are bogus.

Running the following command I can see the reason for the bogus results:

v.category in=MT_San_clean opt=print

1331/566/571/691/1332/456/416/410/405/561/390
1294/1249/1265/1229/1225/1273/1233

v.edit has assigned multiple categories to points when they were snapped.

For these two examples, maybe you get meaningful results if there is either one point with several categories or several points with one category each, but not both. I'm not sure what v.edit did, but it is possible that you now have several points with identical coordinates and all points have several, identical categories.

Or v.clean tool=rmdupl (although the man page says "pay attention to categories!" - don't know what that means).

I think it means that you don't have control over what is kept and what is deleted. If you have e.g. two points with identical coordinates but different categories assigned, there is no way to influence what point will be deleted, i.e. what category is deleted, thus what associated information in the respective attribute table is no longer accessible because there is no longer a point with the category that was just deleted. Clumsy phrasing, I hope you understand what I mean.

Markus M

Hi

Thanks for the comments.

Moritz:

I have a vector map with points representing latrines and each point
has an attribute describing the place name. I ran v.edit to snap
points with a threshold of 5m

Because you think these points are actually duplicates, or because
you want the information about all latrines within a certain radius to be
grouped together?

Some points are exact duplicates (points were downloaded from the GPSr
more than once), others were captured in the field at different times
and although representing the same structure do not have identical
co-ordinates. I snap the points to try and achieve one point per
structure.

Markus M:

... maybe you get meaningful results if there is either
one point with several categories or several points with one category each,
but not both. I'm not sure what v.edit did, but it is possible that you now
have several points with identical coordinates and all points have several,
identical categories.

When I snap points A and B, a new feature C is created and it inherits
the attributes of A and B, even if they are identical. Since I'm
counting rows of attributes (not features) these extra categories
cause problems and need to be removed. v.clean and v.edit don't help
because they remove features not cats.
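
The mismatch between features and attribute rows described above can be checked directly from the v.category listing. The sketch below is a minimal example, assuming the output format shown earlier in this thread (one line per feature, categories separated by "/"). The function name count_cats is made up for this illustration.

```shell
# Hypothetical helper: count features (non-empty lines) and the total
# number of category values in "v.category opt=print" style output.
count_cats() {
    awk -F/ 'NF > 0 { features++; cats += NF }
             END { printf "%d features, %d categories\n", features, cats }'
}

# Demo on the two sample lines quoted in this thread. In a GRASS session
# the input would instead come from:
#   v.category in=MT_San_clean opt=print
printf '%s\n' \
    '1331/566/571/691/1332/456/416/410/405/561/390' \
    '1294/1249/1265/1229/1225/1273/1233' | count_cats
# prints: 2 features, 18 categories
```

When the second number is larger than the first, at least some features carry more than one category, which is the situation that inflates a per-row count.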

Moritz:

Or v.clean tool=rmdupl (although the man page says "pay attention to
categories!" - don't know what that means).

Cats are not deleted when a feature is removed (at least for the
rmdupl tool). Could be clarified in the manual, I think.

My solution:
1. Use v.category opt=print and parse out one cat per feature
2. Add a new table (MyCats) to the point vector and populate it with
the list of cats.
3. Instead of:
echo "SELECT *,count(Area) FROM MT_San_clean GROUP BY Area" | db.select
cat|Area|Ref|Lat|Lon|Altitude|Y|X|count(Area)
2177|Gcina|695|-29.6|30.588|729|-39940.752|-3275795.484|2177
3325|Inhlazuka|1237|-30.056|30.439|0|-54111.923|-3326403.99|1148

do this (thanks Moritz):
echo "SELECT *,count(Area) FROM MT_San_clean WHERE cat IN (SELECT * FROM MyCats) GROUP BY Area" | db.select
cat|Area|Ref|Lat|Lon|Altitude|Y|X|count(Area)
2170|Gcina|688|-29.596|30.589|763.4|-39861.292|-3275438.023|1340
3325|Inhlazuka|1237|-30.056|30.439|0|-54111.923|-3326403.99|616

Gcina down from 2177 to 1340. Wow, fieldwork was really bad!
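
Step 1 of the solution above (one cat per feature) can be sketched with awk. This is only an illustration, assuming the v.category output format shown earlier in the thread (one line per feature, categories separated by "/"). The function name first_cats is hypothetical, and the thread does not say which awk command was actually used.

```shell
# Hypothetical helper: keep only the first category of each feature.
first_cats() {
    # Lines with an empty first field (for example blank lines) are skipped.
    awk -F/ '$1 != "" { print $1 }'
}

# Demo on the two sample lines quoted in this thread. In a GRASS session
# the input would instead come from:
#   v.category in=MT_San_clean opt=print
printf '%s\n' \
    '1331/566/571/691/1332/456/416/410/405/561/390' \
    '1294/1249/1265/1229/1225/1273/1233' | first_cats
# prints 1331 and 1294, one per line
```

Keeping the first category is an arbitrary choice. Any single category per feature would give the same count, but the attribute values reported for a snapped point depend on which category is kept.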

I'll try to update the FAQ...

Craig