[GRASS-user] GADM & Polygon files in general

I sent this originally on 8/5/2010, but it never got posted.

I have been using GRASS for 5 months. I feel pretty good about writing code for processing and manipulating rasters, but I seem to have trouble with vectors. My goal was to import GADM (a well-known administrative boundary dataset) into GRASS. I have it as a shapefile of around a gigabyte. Importing it one country at a time and then merging the results has taken me nearly 3 months, with the computer grinding away night and day for most of that time.

Does anybody already have GADM in GRASS format? Second, is there any way to import polygon files more efficiently or effectively?

Thanks for your help!

Tim


Timothy S. Thomas

Research Fellow, International Food Policy Research Institute (IFPRI)

2033 K St. NW, Washington, DC 20006-1002 (Room 5035)

t.s.thomas@cgiar.org www.ifpri.org

(w) +1-202-862-4605 (f) +1-202-467-4439

skype: timothy.s.thomas

Thomas, Timothy (IFPRI) wrote:

Does anybody already have GADM in GRASS format?

Yes, version 1.

Second, is there any way to import polygon files more efficiently or effectively?

With GRASS 7 on 64-bit Linux with more than 4 GB of RAM, the complete
import, including cleaning, takes less than 4 hours and gives a
topologically clean result.

Alternatively, import only the polygons of your study area, using the
where or spatial options or the -r flag (limit import to the current region).
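
As a rough illustration of the three ways to restrict an import, here is a minimal sketch in the GRASS 6.4-style syntax used elsewhere in this thread. The layer name gadm_v1, the output map names, and the bounding-box values are hypothetical, and the ISO column is assumed to hold country codes. The script only prints the commands unless RUN=1 is set inside a GRASS session.

```shell
#!/bin/sh
# Sketch only, GRASS 6.4-style syntax as used in this thread.
# Hypothetical names: layer "gadm_v1" and the output map names.
# Assumed: the attribute table has an ISO column holding country codes.
DSN=gadm1
LAYER=gadm_v1

# Attribute filter: import only features whose ISO code is USA.
CMD_WHERE="v.in.ogr dsn=$DSN layer=$LAYER output=usa_gadm1 where=\"ISO = 'USA'\""

# Bounding-box filter: xmin,ymin,xmax,ymax in map units (degrees here).
# The box values are illustrative, not a vetted extent.
CMD_BBOX="v.in.ogr dsn=$DSN layer=$LAYER output=namer_gadm1 spatial=-170,15,-50,75"

# Region filter: -r limits the import to the current region (set with g.region).
CMD_REGION="v.in.ogr -r dsn=$DSN layer=$LAYER output=region_gadm1"

# Print each command; set RUN=1 inside a GRASS session to execute them.
for cmd in "$CMD_WHERE" "$CMD_BBOX" "$CMD_REGION"; do
    echo "$cmd"
    if [ "${RUN:-0}" = "1" ]; then
        eval "$cmd"
    fi
done
```

Only one of the three filters is normally needed for a given study area; they are shown together here for comparison.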

Markus M

Thank you for your quick reply!

It leads naturally into a few questions.

1) Do you think whoever has version 1 in GRASS format would be willing
to share?

2) Since I have over 1,000 hours of computer time in converting using
GRASS 6.4 on Linux with 8GB RAM, what in the world happened in GRASS 7
that it can now do it in only 4 hours???

3) Would someone be able to tell me whether the commands I have been
using are the most efficient and correct, or whether that was the cause
of the slow speed in the first place?

For importing a shapefile, I used a command like:

    v.in.ogr dsn=gadm1 layer=usa_gadm1 out=usa_gadm1 -o

And if they do not have great boundaries, a command like

    v.in.ogr dsn=gadm1 layer=usa_gadm1 out=usa_gadm1 snap=0.00002 -o

To combine 2 countries, I would use something like

    v.patch -e in=usa_gadm1,can_gadm1 out=namer_gadm1

And to repair the boundaries, something like

    v.clean namer_gadm1 out=namer_cln_gadm1 tool=snap,break,rmdupl thresh=0.00002

Thanks again for your help!

   Tim


-----Original Message-----
From: Markus Metz [mailto:markus.metz.giswork@googlemail.com]
Sent: Monday, August 09, 2010 10:11 AM
To: Thomas, Timothy (IFPRI)
Cc: grass-user@lists.osgeo.org; Nelson, Gerald (IFPRI)
Subject: Re: [GRASS-user] GADM & Polygon files in general

Thomas, Timothy (IFPRI) wrote:

It leads naturally into a few questions.

2) Since I have over 1,000 hours of computer time in converting using
GRASS 6.4 on Linux with 8GB RAM, what in the world happened in GRASS 7
that it can now do it in only 4 hours???

First, I imported the whole thing at once, not each country
separately. Second, some under-the-hood modifications in GRASS 7 of
the cleaning procedures result in reduced memory consumption and
faster import.

3) Would someone be able to tell me whether the commands I have been
using are the most efficient and correct, or whether that was the cause
of the slow speed in the first place?

For importing a shapefile, I used a command like:

    v.in.ogr dsn=gadm1 layer=usa_gadm1 out=usa_gadm1 -o

Why the -o flag? GADM shapefiles come with a *.prj file, i.e.
projection information is available and should not be ignored.
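
One way to follow up on this point is to compare the two projections before reaching for -o. The sketch below assumes the names used in this thread; gadm_ll is a hypothetical location name. ogrinfo is the GDAL/OGR command-line tool and g.proj is the GRASS projection module. The script only prints the commands unless RUN=1 is set inside a GRASS session.

```shell
#!/bin/sh
# Sketch only: compare the shapefile's projection with the location's
# before deciding whether -o is justified. Names follow the thread.
DSN=gadm1
LAYER=usa_gadm1

# Summary of the OGR layer, including the spatial reference read from the .prj file.
CMD_SRC="ogrinfo -so $DSN $LAYER"

# Projection of the current GRASS location.
CMD_LOC="g.proj -p"

# If the two differ, v.in.ogr can create a new location that matches the data
# instead of overriding the check. "gadm_ll" is a hypothetical location name.
CMD_NEWLOC="v.in.ogr dsn=$DSN layer=$LAYER output=$LAYER location=gadm_ll"

# Print each command; set RUN=1 inside a GRASS session to execute them.
for cmd in "$CMD_SRC" "$CMD_LOC" "$CMD_NEWLOC"; do
    echo "$cmd"
    if [ "${RUN:-0}" = "1" ]; then
        eval "$cmd"
    fi
done
```

If the two projection reports agree apart from naming details, -o is probably harmless; if they really differ, importing into a matching location and reprojecting is the safer route.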

And if they do not have great boundaries, a command like

    v.in.ogr dsn=gadm1 layer=usa_gadm1 out=usa_gadm1 snap=0.00002 -o

Looks OK, but GADM is clean as far as I can tell, so snapping boundaries is not necessary.

To combine 2 countries, I would use something like

    v.patch -e in=usa_gadm1,can_gadm1 out=namer_gadm1

Watch out for category values when patching. I would prefer to import
the whole thing at once, or only the country you are interested in.
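
A possible way to act on the warning about category values is sketched below. It assumes the concern is that both input maps reuse the same category numbers. v.category with option=report prints the category range of a map; the numbers in the script are made up, and ranges_overlap is a hypothetical helper, not a GRASS command.

```shell
#!/bin/sh
# Sketch only. Assumption: the risk when patching is that both input maps
# reuse the same category numbers, so attribute rows could be confused.

# Succeeds (exit status 0) when the integer ranges [$1,$2] and [$3,$4] overlap.
ranges_overlap() {
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Inside GRASS, the min/max categories of each map can be read from:
#   v.category input=usa_gadm1 option=report
#   v.category input=can_gadm1 option=report
# The numbers below are made up for illustration.
USA_MIN=1; USA_MAX=51
CAN_MIN=1; CAN_MAX=13

if ranges_overlap "$USA_MIN" "$USA_MAX" "$CAN_MIN" "$CAN_MAX"; then
    RESULT="overlap"
else
    RESULT="disjoint"
fi
echo "category ranges: $RESULT"
```

When the ranges overlap, it is worth checking how the patched map links features to attribute rows before relying on the result.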

And to repair the boundaries, something like

    v.clean namer_gadm1 out=namer_cln_gadm1 tool=snap,break,rmdupl thresh=0.00002

Looks OK, but for GADM this should not be necessary.

Markus M

Thank you for your reply! I find your answers and queries extremely helpful.

I tried to import GADM all at once, and it crashed badly. Plus, in version 6.4, if it had run, it would have still taken a long time. Australia alone took 30 hours of computer time.

I used -o because GRASS complained even though the shapefile was in the same projection as my location, so I chose to override the check.

I would not have chosen to use any snap or threshold settings, but GRASS 6.4 refused to import the shapes otherwise.

The cause may be that I could only get the ESRI Geodatabase version, which I then converted to a shapefile with ArcView. Perhaps that step caused all of the snap issues.

   Tim


-----Original Message-----
From: Markus Metz [mailto:markus.metz.giswork@googlemail.com]
Sent: Monday, August 09, 2010 3:10 PM
To: Thomas, Timothy (IFPRI)
Cc: grass-user@lists.osgeo.org; Nelson, Gerald (IFPRI)
Subject: Re: [GRASS-user] GADM & Polygon files in general

I am grateful for finding the very detailed GADM administrative boundary
files in GRASS format.

I am trying to do a simple thing: I want to create a vector of level 0
of GADM; i.e., country boundaries. In principle, this should just
involve using a command like

    v.dissolve input=gadm_v1 output=gadm_v1_lev0 col=ISO

When I give that command, I get a segmentation fault using GRASS 6.4 in
Linux, and it suggests using v.build.

When I use v.build, it tells me the COOR files are larger than they
should be.

Does anyone have any suggestions on how to move forward?

Thanks!

   Tim
