[GRASS5] counting number of vector points in same position?

Hi,

I have to transform a vector points map of disease reports.
Often more than one report falls into the same location as
reports were done/geocoded based on cities. This results into
a vector map with often several vector falling into the
same coodinate pair.

I have to transform this map into a new vector map with
single point per coordinate pair and number of cases as
attribute.

How to do that?

I was thinking of v.distance -a to calculate zero-distances.
This requires to exclude that the points "see" themselves.
But then, I am not sure how to transform this into a map.
Maybe with some SQL fun (distinct() operator or so)?

Suggestions welcome. This functionality appears to me to
be needed for GRASS. I darkly remember that it was asked
time ago, but cannot find the mail(s).

Thanks

Markus

On Wed, 10 Aug 2005, Markus Neteler wrote:

Hi,

I have to transform a vector points map of disease reports.
Often more than one report falls into the same location as
reports were done/geocoded based on cities. This results into
a vector map with often several vector falling into the
same coodinate pair.

I have to transform this map into a new vector map with
single point per coordinate pair and number of cases as
attribute.

How to do that?

I was thinking of v.distance -a to calculate zero-distances.
This requires to exclude that the points "see" themselves.
But then, I am not sure how to transform this into a map.
Maybe with some SQL fun (distinct() operator or so)?

Suggestions welcome. This functionality appears to me to
be needed for GRASS. I darkly remember that it was asked
time ago, but cannot find the mail(s).

If you can move the points into R:

xcoord <- runif(300)
ycoord <- runif(300)
reps <- 1+rbinom(300, 3, 0.3)
table(reps)

reps
  1 2 3 4
96 132 66 6

crds <- cbind(rep(xcoord, reps), rep(ycoord, reps))
crds_s <- crds[sample(nrow(crds)),]

replicates your problem, here 582 points for 300 unique locations, in
random order.

u_crds_s <- unique(crds_s)
cnts <- integer(nrow(u_crds_s))
for (i in 1:nrow(u_crds_s)) cnts[i] <- length(which(apply(crds_s, 1,

+ function(x) identical(x, u_crds_s[i,]))))

table(cnts)

cnts
  1 2 3 4
96 132 66 6

res <- cbind(u_crds_s, cnts)

then export res back to GRASS maybe as a shapefile. If the coordinates
have ID tags, matching would be easier, but this works for non-fuzzy
coordinates. If the coordinates need snapping, all.equal() with a
tolerance could be used instead of identical, but it is much slower.

Hope this helps,

Roger

Thanks

Markus

_______________________________________________
grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5

--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand@nhh.no

Hi, Markus.

If your points have the exact same coordinate, you might v.out.ascii then stg
like 'awk -F"|" {print $1 " " $2} | sort -n | uniq -c' on the output file,
then v.in.ascii back. If you have a tolerance radius for "same location", some
rounding tweaking might be needed.

Daniel.

---------- Original Message -----------
From: Markus Neteler <neteler@itc.it>
To: grass5 developers list <grass5@grass.itc.it>
Sent: Wed, 10 Aug 2005 18:35:51 +0200
Subject: [GRASS5] counting number of vector points in same position?

Hi,

I have to transform a vector points map of disease reports.
Often more than one report falls into the same location as
reports were done/geocoded based on cities. This results into
a vector map with often several vector falling into the
same coodinate pair.

I have to transform this map into a new vector map with
single point per coordinate pair and number of cases as
attribute.

How to do that?

I was thinking of v.distance -a to calculate zero-distances.
This requires to exclude that the points "see" themselves.
But then, I am not sure how to transform this into a map.
Maybe with some SQL fun (distinct() operator or so)?

Suggestions welcome. This functionality appears to me to
be needed for GRASS. I darkly remember that it was asked
time ago, but cannot find the mail(s).

Thanks

Markus

_______________________________________________
grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5

------- End of Original Message -------

If v.neighbors had other statistical functions besides count (e.g., sum,
mean, max, min), you could sum cases for all points within a raster cell.
Then you could transform it back to vector.

Michael

On 8/10/05 9:35 AM, "Markus Neteler" <neteler@itc.it> wrote:

Hi,

I have to transform a vector points map of disease reports.
Often more than one report falls into the same location as
reports were done/geocoded based on cities. This results into
a vector map with often several vector falling into the
same coodinate pair.

I have to transform this map into a new vector map with
single point per coordinate pair and number of cases as
attribute.

How to do that?

I was thinking of v.distance -a to calculate zero-distances.
This requires to exclude that the points "see" themselves.
But then, I am not sure how to transform this into a map.
Maybe with some SQL fun (distinct() operator or so)?

Suggestions welcome. This functionality appears to me to
be needed for GRASS. I darkly remember that it was asked
time ago, but cannot find the mail(s).

Thanks

Markus

On Wed, Aug 10, 2005 at 07:21:12PM +0200, Roger Bivand wrote:

On Wed, 10 Aug 2005, Markus Neteler wrote:

> Hi,
>
> I have to transform a vector points map of disease reports.
> Often more than one report falls into the same location as
> reports were done/geocoded based on cities. This results into
> a vector map with often several vector falling into the
> same coodinate pair.
>
> I have to transform this map into a new vector map with
> single point per coordinate pair and number of cases as
> attribute.
>
> How to do that?
>
> I was thinking of v.distance -a to calculate zero-distances.
> This requires to exclude that the points "see" themselves.
> But then, I am not sure how to transform this into a map.
> Maybe with some SQL fun (distinct() operator or so)?
>
> Suggestions welcome. This functionality appears to me to
> be needed for GRASS. I darkly remember that it was asked
> time ago, but cannot find the mail(s).
>

If you can move the points into R:

> xcoord <- runif(300)
> ycoord <- runif(300)
> reps <- 1+rbinom(300, 3, 0.3)
> table(reps)
reps
  1 2 3 4
96 132 66 6
> crds <- cbind(rep(xcoord, reps), rep(ycoord, reps))
> crds_s <- crds[sample(nrow(crds)),]

replicates your problem, here 582 points for 300 unique locations, in
random order.

> u_crds_s <- unique(crds_s)
> cnts <- integer(nrow(u_crds_s))
> for (i in 1:nrow(u_crds_s)) cnts[i] <- length(which(apply(crds_s, 1,
+ function(x) identical(x, u_crds_s[i,]))))
> table(cnts)
cnts
  1 2 3 4
96 132 66 6
> res <- cbind(u_crds_s, cnts)

then export res back to GRASS maybe as a shapefile. If the coordinates
have ID tags, matching would be easier, but this works for non-fuzzy
coordinates. If the coordinates need snapping, all.equal() with a
tolerance could be used instead of identical, but it is much slower.

Hope this helps,

Thanks, Roger.

I have tried your example successfully. On the (new) destination
machine I still have to install R to perform it there.

Markus

On Wed, Aug 10, 2005 at 01:15:54PM -0500, Daniel Calvelo Aros wrote:

Hi, Markus.

If your points have the exact same coordinate, you might v.out.ascii then stg
like 'awk -F"|" {print $1 " " $2} | sort -n | uniq -c' on the output file,
then v.in.ascii back. If you have a tolerance radius for "same location", some
rounding tweaking might be needed.

Hi Daniel,

like this it is amazingly simple:

v.out.ascii tbe_cases_bite | awk -F"|" '{print $1 " " $2}' | sort -n | uniq -c |\
     awk '{printf "%f|%f|%d\n", $2, $3, $1}' | \
     v.in.ascii out=tbe_cases_bite_numbers cat=0 column="east double, north double, count integer"

Done. Maybe worth a script in GRASS (v.pnt.count or so)? Only an extra test
for 3D points is needed to be added.

BTW: I have fixed v.in.ascii for LatLong vector points where it bailed out
with an error.

Thanks

Markus

On Wed, Aug 10, 2005 at 09:42:56PM -0700, Michael Barton wrote:

If v.neighbors had other statistical functions besides count (e.g., sum,
mean, max, min), you could sum cases for all points within a raster cell.
Then you could transform it back to vector.

Thanks, Michael,

in my case I cannot limit to underlying raster cells.

Inspired by s.cellstats [1] I tried to update it to v.cellstats which is
doing what you mention above (but writing vector).

Do we want v.cellstats? This needs help of someone more
familiar with DBMI programming than me.

Markus

[1] http://grass.itc.it/gdp/html_grass5/html/s.cellstats.html

V.cellstats would be very helpful. As you say, its functionality would be
roughly equivalent to an improved v.neighbors. An improved v.neighbors would
still be nice too, as it does something a little different.

Michael

On 8/11/05 6:56 AM, "Markus Neteler" <neteler@itc.it> wrote:

On Wed, Aug 10, 2005 at 09:42:56PM -0700, Michael Barton wrote:

If v.neighbors had other statistical functions besides count (e.g., sum,
mean, max, min), you could sum cases for all points within a raster cell.
Then you could transform it back to vector.

Thanks, Michael,

in my case I cannot limit to underlying raster cells.

Inspired by s.cellstats [1] I tried to update it to v.cellstats which is
doing what you mention above (but writing vector).

Do we want v.cellstats? This needs help of someone more
familiar with DBMI programming than me.

Markus

[1] http://grass.itc.it/gdp/html_grass5/html/s.cellstats.html

1) v.clean rmdupl will remove duplicates and it should attache
    all cats to one point (-> one-many link between geometry and attributes)
2) v.category layer=2 (-> new layer with 1 cat/point)
3) create and link new table to layer 2
4) v.to.db option=query layer=2 qlayer=1
    (query attributes in layer 1 and uppload to layer 2)

Radim

On 8/10/05, Markus Neteler <neteler@itc.it> wrote:

Hi,

I have to transform a vector points map of disease reports.
Often more than one report falls into the same location as
reports were done/geocoded based on cities. This results into
a vector map with often several vector falling into the
same coodinate pair.

I have to transform this map into a new vector map with
single point per coordinate pair and number of cases as
attribute.

How to do that?

I was thinking of v.distance -a to calculate zero-distances.
This requires to exclude that the points "see" themselves.
But then, I am not sure how to transform this into a map.
Maybe with some SQL fun (distinct() operator or so)?

Suggestions welcome. This functionality appears to me to
be needed for GRASS. I darkly remember that it was asked
time ago, but cannot find the mail(s).

Thanks

Markus

_______________________________________________
grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5

Thanks Radim. This is helpful. This is also the first example I've seen of
the query option for v.to.db.

Could you explain how it works in a bit more detail.

Thanks.
Michael
______________________________
Michael Barton, Professor of Anthropology
School of Human Evolution and Social Change
Arizona State University
Tempe, AZ 85287-2402
USA

voice: 480-965-6262; fax: 480-965-7671
www: http://www.public.asu.edu/~cmbarton

From: Radim Blazek <radim.blazek@gmail.com>
Date: Tue, 16 Aug 2005 10:36:23 +0200
To: grass5 developers list <grass5@grass.itc.it>
Subject: Re: [GRASS5] counting number of vector points in same position?

1) v.clean rmdupl will remove duplicates and it should attache
    all cats to one point (-> one-many link between geometry and attributes)
2) v.category layer=2 (-> new layer with 1 cat/point)
3) create and link new table to layer 2
4) v.to.db option=query layer=2 qlayer=1
    (query attributes in layer 1 and uppload to layer 2)

Radim

On 8/10/05, Markus Neteler <neteler@itc.it> wrote:

Hi,

I have to transform a vector points map of disease reports.
Often more than one report falls into the same location as
reports were done/geocoded based on cities. This results into
a vector map with often several vector falling into the
same coodinate pair.

I have to transform this map into a new vector map with
single point per coordinate pair and number of cases as
attribute.

How to do that?

I was thinking of v.distance -a to calculate zero-distances.
This requires to exclude that the points "see" themselves.
But then, I am not sure how to transform this into a map.
Maybe with some SQL fun (distinct() operator or so)?

Suggestions welcome. This functionality appears to me to
be needed for GRASS. I darkly remember that it was asked
time ago, but cannot find the mail(s).

Thanks

Markus

_______________________________________________
grass5 mailing list
grass5@grass.itc.it
http://grass.itc.it/mailman/listinfo/grass5

Radim,

thanks for your example. I took the occasion to write the scripts:

  v.db.addtable help
    adds a new attribute table to a given layer of a vector map

and
  v.db.droptable help
    removes existing attribute table of a vector map

Trying to make an example out of below sketch, I finally failed:

# Generate 3 points in the same place (Spearfish60):
echo "599505|4921010|1
599505|4921010|2
599505|4921010|3" | v.in.ascii out=triple

# remove duplicates from map:
v.clean triple out=triple_simple tool=rmdupl
# ... remaining: one2many link between geometry and attributes

# verify categories:
v.category triple_simple op=report
LAYER/TABLE 1/triple_simple:
type count min max
point 3 1 3
line 0 0 0
boundary 0 0 0
centroid 0 0 0
area 0 0 0
all 3 1 3

# add new layer with 1 cat/point:
v.category triple_simple out=triple_simple_multi layer=2

# create and link new table to layer 2:
v.db.addtable triple_simple_multi layer=2 columns="cat integer, count integer"
v.info -c triple_simple_multi layer=2

# upload number of identical vector points to attribute table:
v.to.db triple_simple_multi option=query layer=2 qlayer=1 column=count
Reading data trom the map ... 100%
Querying database ... 100%
DBMI-DBF driver error:
SQL parser error in statement:
SELECT (null) FROM triple_simple_multi WHERE cat = 2 OR cat = 1 OR cat = 3
Error in db_open_select_cursor()

ERROR: Cannot open cursor: 'SELECT (null) FROM triple_simple_multi WHERE
        cat = 2 OR cat = 1 OR cat = 3'

Perhaps you can identify the mistake in the procedure.

Thanks

Markus

On Tue, Aug 16, 2005 at 10:36:23AM +0200, Radim Blazek wrote:

1) v.clean rmdupl will remove duplicates and it should attache
    all cats to one point (-> one-many link between geometry and attributes)
2) v.category layer=2 (-> new layer with 1 cat/point)
3) create and link new table to layer 2
4) v.to.db option=query layer=2 qlayer=1
    (query attributes in layer 1 and uppload to layer 2)

Radim

On 8/10/05, Markus Neteler <neteler@itc.it> wrote:
> Hi,
>
> I have to transform a vector points map of disease reports.
> Often more than one report falls into the same location as
> reports were done/geocoded based on cities. This results into
> a vector map with often several vector falling into the
> same coodinate pair.
>
> I have to transform this map into a new vector map with
> single point per coordinate pair and number of cases as
> attribute.
>
> How to do that?
>
> I was thinking of v.distance -a to calculate zero-distances.
> This requires to exclude that the points "see" themselves.
> But then, I am not sure how to transform this into a map.
> Maybe with some SQL fun (distinct() operator or so)?
>
> Suggestions welcome. This functionality appears to me to
> be needed for GRASS. I darkly remember that it was asked
> time ago, but cannot find the mail(s).
>
> Thanks
>
> Markus
>
> _______________________________________________
> grass5 mailing list
> grass5@grass.itc.it
> http://grass.itc.it/mailman/listinfo/grass5
>

--
Markus Neteler <neteler itc it> http://mpa.itc.it
ITC-irst - Centro per la Ricerca Scientifica e Tecnologica
MPBA - Predictive Models for Biol. & Environ. Data Analysis
Via Sommarive, 18 - 38050 Povo (Trento), Italy

On 8/16/05, Markus Neteler <neteler@itc.it> wrote:

# upload number of identical vector points to attribute table:
v.to.db triple_simple_multi option=query layer=2 qlayer=1 column=count
Reading data trom the map ... 100%
Querying database ... 100%
DBMI-DBF driver error:
SQL parser error in statement:
SELECT (null) FROM triple_simple_multi WHERE cat = 2 OR cat = 1 OR cat = 3
Error in db_open_select_cursor()

ERROR: Cannot open cursor: 'SELECT (null) FROM triple_simple_multi WHERE
        cat = 2 OR cat = 1 OR cat = 3'

qcolumn option is missing, it should be 'count(*) in this case, but it can be
expression supported by database driver, for example sum(colum_in_layer_1).
Unfortunately count and sum is not supported by dbf driver, so it is necessary
to use postgres.

Radim