[GRASS-user] Using R to check points in polygons does not work

Dear friends,

I attempted to solve the problem using R, as Hamish suggested.

Because there are some ring direction changes in the shape files, the
conversion to polylist returns "NA" in some fields. When I run inout() on
those list elements the function fails with error NA/NaN/Inf in foreign
function call (arg 4).

Is there another method to test if a point falls in any of the polygons
without using R? or, is there a way to solve this problem?

Best Regards
--
Corrado Topi

Global Climate Change and Biodiversity
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct529@york.ac.uk

Corrado <ct529 <at> york.ac.uk> writes:

Dear friends,

I attempted to solve the problem using R, as Hamish suggested.

Because there are some ring direction changes in the shape files, the
conversion to polylist returns "NA" in some fields. When I run inout() on
those list elements the function fails with error NA/NaN/Inf in foreign
function call (arg 4).

Is there another method to test if a point falls in any of the polygons
without using R? or, is there a way to solve this problem?

Are you in a hurry? You can certainly do what you need in R, in GRASS, or
very possibly in a combination. But you need to try to break you task
down into smaller pieces and make sure at each step that you know what
is going on. What is your platform (OS)?

I suggest that you import your polygon shapefile into GRASS and clean the
topology, that is first establish an appropriate location, then v.in.ogr
(or v.in.ogr location=), then v.clean (see flags and options), and check
that you have areas.

If the ring direction was changed, I think you were using the read.shape()
internal function in the maptools package, followed by Map2polylist - this
is no longer supported code, use rgdal and readOGR() instead. The polylist
structure is also deprecated. Use SpatialPolygons, into which readOGR()
puts data without fuss.

Indeed, if the ring direction changed, your shapefile cannot have been
clean (this happens, even for shapefiles generated by ArcGIS, and there
are lots of dirty shapefiles in the wild, dirty meaning not conforming
to the ESRI specification). Then the point-in-polygon function inout()
from the splancs package should fail, shouldn't it? By the way, inout()
does not work as such with polylist objects.

Access to the GRASS book (Neteler & Mitasova 2008) will help with GRASS 6
vector operations. Then read in the point locations (are there 58000
observations on 100+ variables, or three columns: (x, y, species)? Are
they also in a shapefile?

If you are familiar with databases (or their view of the world), you can
continue within GRASS with the db.* commands as suggested. If your output
tables will be happiest here, that's fine (Neteler & Mitasova 2008 may
help).

If you are going to use R with GRASS, review the GFOSS 2006 course
materials:

http://www.foss4g2006.org/contributionDisplay.py?
contribId=46&sessionId=59&confId=1

(sorry, gmane didn't like the long line)

(slides and scripts towards the foot of the page).

R will also give you statistical graphics, so that you can visualise your
results if need be. Tallies for conservation areas can also be moved back
to GRASS if need be.

Starting R from the GRASS prompt, say (roughly):

library(spgrass6)
polys <- readVECT6("polys") # correct GRASS vector name
pts <- readVECT6("pts") # ditto
res0 <- overlay(pts, polys)

# yes, it's a one-liner!

This returns a vector as long as the number of points, giving either the
number of the polygon, or NA if the point does not fall into any polygon.

Assuming that pts$species are the species, some variant of:

tapply(is.na(res0), pts$species, table)

will give you: "I need to know for each species, how many point fall
inside conservation areas and how many points fall outside conservation
areas".

tapply(pts$species, res0, table) or some such should tabulate the species
observations by conservation area (but you may need to drop the NAs and the
corresponding rows of pts).

If there are 100 columns of presence/absence, variants of the *apply()
family should deliver the goods, very possibly apply() across the species
columns of pts.

The procedure for buffers could be to use v.buffer in GRASS to create the
buffers, then do the overlaying and tallying in R.

Learning enough of GRASS, or GRASS and R to do this will take a little
time, but undoubtedly less than learning C or Python, with certainly
fewer bugs in the code you use, and with full journals (history from
the shell in GRASS and session history() to save in R -
savehistory("now.Rhistory") is a remarkably helpful tool!

Hope this helps,

Roger

Best Regards

Dear friends,

Platform: Ubuntu 7.10, Grass 6.2.3,PostgresSQL 8.2,R 2.6.2

Yes, I am in a hurry. If I do not solve it by tomorrow, then it is back to
ARCGIS I am afraid. It is not my will, of course, but the project's leader is
already not keen on switching to GRASS, so she is pressing for me to solve it
using ARCGIS. :frowning:

Best,

On Tuesday 04 March 2008 18:33:26 Roger Bivand wrote:

Corrado <ct529 <at> york.ac.uk> writes:
> Dear friends,
>
> I attempted to solve the problem using R, as Hamish suggested.
>
> Because there are some ring direction changes in the shape files, the
> conversion to polylist returns "NA" in some fields. When I run inout() on
> those list elements the function fails with error NA/NaN/Inf in foreign
> function call (arg 4).
>
> Is there another method to test if a point falls in any of the polygons
> without using R? or, is there a way to solve this problem?

Are you in a hurry? You can certainly do what you need in R, in GRASS, or
very possibly in a combination. But you need to try to break you task
down into smaller pieces and make sure at each step that you know what
is going on. What is your platform (OS)?

I suggest that you import your polygon shapefile into GRASS and clean the
topology, that is first establish an appropriate location, then v.in.ogr
(or v.in.ogr location=), then v.clean (see flags and options), and check
that you have areas.

If the ring direction was changed, I think you were using the read.shape()
internal function in the maptools package, followed by Map2polylist - this
is no longer supported code, use rgdal and readOGR() instead. The polylist
structure is also deprecated. Use SpatialPolygons, into which readOGR()
puts data without fuss.

Indeed, if the ring direction changed, your shapefile cannot have been
clean (this happens, even for shapefiles generated by ArcGIS, and there
are lots of dirty shapefiles in the wild, dirty meaning not conforming
to the ESRI specification). Then the point-in-polygon function inout()
from the splancs package should fail, shouldn't it? By the way, inout()
does not work as such with polylist objects.

Access to the GRASS book (Neteler & Mitasova 2008) will help with GRASS 6
vector operations. Then read in the point locations (are there 58000
observations on 100+ variables, or three columns: (x, y, species)? Are
they also in a shapefile?

If you are familiar with databases (or their view of the world), you can
continue within GRASS with the db.* commands as suggested. If your output
tables will be happiest here, that's fine (Neteler & Mitasova 2008 may
help).

If you are going to use R with GRASS, review the GFOSS 2006 course
materials:

http://www.foss4g2006.org/contributionDisplay.py?
contribId=46&sessionId=59&confId=1

(sorry, gmane didn't like the long line)

(slides and scripts towards the foot of the page).

R will also give you statistical graphics, so that you can visualise your
results if need be. Tallies for conservation areas can also be moved back
to GRASS if need be.

Starting R from the GRASS prompt, say (roughly):

library(spgrass6)
polys <- readVECT6("polys") # correct GRASS vector name
pts <- readVECT6("pts") # ditto
res0 <- overlay(pts, polys)

# yes, it's a one-liner!

This returns a vector as long as the number of points, giving either the
number of the polygon, or NA if the point does not fall into any polygon.

Assuming that pts$species are the species, some variant of:

tapply(is.na(res0), pts$species, table)

will give you: "I need to know for each species, how many point fall
inside conservation areas and how many points fall outside conservation
areas".

tapply(pts$species, res0, table) or some such should tabulate the species
observations by conservation area (but you may need to drop the NAs and the
corresponding rows of pts).

If there are 100 columns of presence/absence, variants of the *apply()
family should deliver the goods, very possibly apply() across the species
columns of pts.

The procedure for buffers could be to use v.buffer in GRASS to create the
buffers, then do the overlaying and tallying in R.

Learning enough of GRASS, or GRASS and R to do this will take a little
time, but undoubtedly less than learning C or Python, with certainly
fewer bugs in the code you use, and with full journals (history from
the shell in GRASS and session history() to save in R -
savehistory("now.Rhistory") is a remarkably helpful tool!

Hope this helps,

Roger

> Best Regards

_______________________________________________
grass-user mailing list
grass-user@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-user

Best Regards
--
Corrado Topi

Global Climate Change and Biodiversity
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct529@york.ac.uk

Corrado <ct529 <at> york.ac.uk> writes:

Dear friends,

Platform: Ubuntu 7.10, Grass 6.2.3,PostgresSQL 8.2,R 2.6.2

Yes, I am in a hurry. If I do not solve it by tomorrow, then it is back to
ARCGIS I am afraid. It is not my will, of course, but the project's leader is
already not keen on switching to GRASS, so she is pressing for me to solve it
using ARCGIS. :frowning:

However it goes, good luck! In any case, please keep notes, because at speed
it's easy to mix up either species, or conservation areas (or both).

I'm pretty certain overlay() on the points and polygons will give you what you
need, first for the straight polygons, then for their buffers.

Roger

On Tue, March 4, 2008 20:07, Corrado wrote:

Dear friends,

Platform: Ubuntu 7.10, Grass 6.2.3,PostgresSQL 8.2,R 2.6.2

Yes, I am in a hurry. If I do not solve it by tomorrow, then it is back to
ARCGIS I am afraid.

Have you tried any of the proposed v.select and/or v.distance solutions ?

Moritz

On Mar 4, 2008, at 1:10 PM, Moritz Lennert wrote:

On Tue, March 4, 2008 20:07, Corrado wrote:

Dear friends,

Platform: Ubuntu 7.10, Grass 6.2.3,PostgresSQL 8.2,R 2.6.2

Yes, I am in a hurry. If I do not solve it by tomorrow, then it is back to
ARCGIS I am afraid.

Have you tried any of the proposed v.select and/or v.distance solutions ?

+1
The recipe I supplied or Moritz's variation from yesterday both work quite efficiently. Assuming you are able to import your points and polygons...

John

Thanks a lot to all of you!

To keep you up to date:

METHOD1
1) v.in.ogr
dsn=/home/ct529/Documents/Projects/UKBryophytes/Datasets/BRYOPHYTES/BRYOPHYTES/SSSI_vectors/ENGLAND.shp
output=England_SSSI layer=ENGLAND min_area=0.0001 snap=-1
location=England_from_shapefile
2) v.clean input=England_SSSI output=England_SSSI_clean
error=England_SSSI_error tool=bpol
3) v.in.ascii
input=/home/ct529/Documents/Projects/UKBryophytes/Datasets/species_sites_100m.csv
output=species_100m format=point fs=, skip=0 x=4 y=5 z=0 cat=0
4) in postgresql: alter table species_100m add colum sssi_name varchar(120);
5) v.distance from=species_100m to=England_SSSI dmax=0 upload=to_attr
to_column=sssi_name col=sssi_name
6) in postgresql: select species_name,sssi_name,count(species_id) from
species_100m where sssi_name is not null group by species_name,sssi_name
order by species_name;

What do you think? I may now run it on dmax=100 and dmax=500.

METHOD2
Today I am going to try Roger's method, combining GRASS and R ,so that I can
compare.

METHOD3
I would like to develop a comparative method using only grass, and reading the
shape file directly (without going through spgrass6) using readOGR and the
rgdal library. Any clue?

I think when I wrote back to the project leader that the problem was solved,
she as rather impressed by GRASS and by the GRASS community.

Best Regards
--
Corrado Topi

Global Climate Change and Biodiversity
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct529@york.ac.uk

That's great to hear. If there is to be a complaint about open source, it would be that there are many great ways to achieve the same result. Of course, this is actually a compliment.

You will find the environment that you are most comfortable working in and develop your own solutions. If you are comfortable with postgis, you can do all of it there as well. That is not something I have done, but I know it is possible. You might take a look at this post and ferret out the slides from the talk that are available online as well.

http://postgis.refractions.net/pipermail/postgis-users/2007-July/016358.html

Cheers,
John

On Mar 6, 2008, at 2:12 AM, Corrado wrote:

Thanks a lot to all of you!

To keep you up to date:

METHOD1
1) v.in.ogr
dsn=/home/ct529/Documents/Projects/UKBryophytes/Datasets/BRYOPHYTES/BRYOPHYTES/SSSI_vectors/ENGLAND.shp
output=England_SSSI layer=ENGLAND min_area=0.0001 snap=-1
location=England_from_shapefile
2) v.clean input=England_SSSI output=England_SSSI_clean
error=England_SSSI_error tool=bpol
3) v.in.ascii
input=/home/ct529/Documents/Projects/UKBryophytes/Datasets/species_sites_100m.csv
output=species_100m format=point fs=, skip=0 x=4 y=5 z=0 cat=0
4) in postgresql: alter table species_100m add colum sssi_name varchar(120);
5) v.distance from=species_100m to=England_SSSI dmax=0 upload=to_attr
to_column=sssi_name col=sssi_name
6) in postgresql: select species_name,sssi_name,count(species_id) from
species_100m where sssi_name is not null group by species_name,sssi_name
order by species_name;

What do you think? I may now run it on dmax=100 and dmax=500.

METHOD2
Today I am going to try Roger's method, combining GRASS and R ,so that I can
compare.

METHOD3
I would like to develop a comparative method using only grass, and reading the
shape file directly (without going through spgrass6) using readOGR and the
rgdal library. Any clue?

I think when I wrote back to the project leader that the problem was solved,
she as rather impressed by GRASS and by the GRASS community.

Best Regards
--
Corrado Topi

Global Climate Change and Biodiversity
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct529@york.ac.uk

Dear John,

I have used postgis to load a shapefile into the postgres database, by means
of shp2pgsql.

When I then start grass, how could I can connect to the databse, but I do not
understand how to generate a vector map from the postgis table.

Any help?

Best Regards
--
Corrado Topi

Global Climate Change and Biodiversity
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct529@york.ac.uk

I have not found a way to work directly with postgis tables. You can use v.in.ogr like this:
v.in.ogr dsn="PG:dbname=your_postgis_db host=yourhost" layer=table_from_postgis out=new_layer

You then have the postgis table available as a layer in grass, but you will not be able to edit it directly from grass to my knowledge. Someone else might be able to supply the v.out.ogr statement that would update the table in postgis.

Cheers,
John

On Mar 6, 2008, at 8:57 AM, Corrado wrote:

Dear John,

I have used postgis to load a shapefile into the postgres database, by means
of shp2pgsql.

When I then start grass, how could I can connect to the databse, but I do not
understand how to generate a vector map from the postgis table.

Any help?

Best Regards
--
Corrado Topi

Global Climate Change and Biodiversity
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct529@york.ac.uk