[GRASS-dev] cleaning large vector files

#494

I checked my swap file, and under Mac OS there was certainly enough memory
available (4GB RAM and virtually unlimited swap, 64-bit address space). This
did not help the clean action complete successfully. Then, as you
suggested, I recompiled the 6.5dev branch (on openSUSE 11, 64-bit Linux)
and gave it a vast 3GB RAM + a 9GB swap partition. According to the output,
v.clean only tried to allocate a mere 34MB, and the system had about 9GB
available at that time. So I don't think my RAM availability is the
problem...

GRASS 6.5.svn (nl-rdn):~ > v.clean in=top_top10vec_gebouwen out=top_top10vec_gebclean tool=bpol
--------------------------------------------------
Tool: Threshold
Break polygons: 0.000000e+00
--------------------------------------------------
Copying vector lines...
Rebuilding parts of topology...
Building topology for vector map <top_top10vec_gebclean>...
Registering primitives...
6041168 primitives registered
25000849 vertices registered
Number of nodes: 6037532
Number of primitives: 6041168
Number of points: 0
Number of lines: 0
Number of boundaries: 3021055
Number of centroids: 3020113
Number of areas: -
Number of isles: -
--------------------------------------------------
Tool: Break polygons
ERROR: G_realloc: unable to allocate 34800040 bytes at break_polygons.c:188
GRASS 6.5.svn (nl-rdn):~ >

With kind regards,

Wouter Boasson (MSc)
Geo-IT Research and Coordination

RIVM - National Institute for Public Health and the Environment
Expertise Centre for Methodology and Information Services


Wouter Boasson wrote:

#494

I checked my swap file, and under Mac OS there was certainly enough memory
available (4GB RAM and virtually unlimited swap, 64-bit address space). This
did not help the clean action complete successfully. Then, as you
suggested, I recompiled the 6.5dev branch (on openSUSE 11, 64-bit Linux)
and gave it a vast 3GB RAM + a 9GB swap partition. According to the output,
v.clean only tried to allocate a mere 34MB, and the system had about 9GB
available at that time. So I don't think my RAM availability is the
problem...
  

Strange. I recently cleaned a 0.8GB vector successfully, also using break polygons among other cleaning tools. Both RAM and swap were used without problems. I have no idea why the reallocation fails; my only guess is that swap space is not being used, but that doesn't make sense.
Looking at the topology, that is a lot of vertices, boundaries and centroids, but nothing that GRASS can't handle.
The memory reallocation at break_polygons.c line 188 has not given me problems before. There were other problems further down when breaking polygons, but these should be solved in 6.5 and 7.
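
To rule out the swap question, a minimal shell sketch along these lines could be run in the same terminal before starting GRASS. It assumes a Linux system with a POSIX-like shell. The function names bytes_to_mib and report_memory_limits are made up for illustration, and the idea that a per-process limit (ulimit) might be involved is only an assumption, not something confirmed in this thread.

```shell
# Convert a byte count to MiB with one decimal place,
# e.g. to put the size from the G_realloc error into perspective.
bytes_to_mib() {
    LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1048576 }'
}

# Print the limits the current shell imposes on its child processes,
# then, where the tools exist, the system-wide RAM and swap status.
report_memory_limits() {
    echo "virtual memory limit (kB): $(ulimit -v)"
    echo "data segment limit (kB):   $(ulimit -d)"
    if command -v free >/dev/null 2>&1; then
        free -m
    fi
    if [ -r /proc/swaps ]; then
        # Lists the active swap areas on Linux.
        cat /proc/swaps
    fi
}
```

Calling bytes_to_mib 34800040 shows how small the failed request is, and report_memory_limits shows whether both limits read "unlimited" and whether the swap partition is listed as active. If they do and it is, limits and swap are probably not the cause.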

A common pitfall with a new GRASS installation is that libraries from an older GRASS version are picked up (a wild guess):
Is GDAL compiled with GRASS support against an older GRASS version? If so, remove GRASS support from GDAL.
Is the path to old GRASS libraries somewhere in the system library path? If so, remove that path and run ldconfig on Linux.
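
The two questions above can be checked roughly as follows. This is only a sketch, assuming Linux with glibc (ldconfig, ldd), GDAL's command line tools on the PATH, and GRASS libraries following the usual libgrass_* naming. The function names filter_grass_libs and check_grass_libs are hypothetical helpers, not part of GRASS.

```shell
# Filter `ldconfig -p` or `ldd` style output (read from stdin) down to
# entries that look like GRASS libraries (named libgrass_*.so).
filter_grass_libs() {
    grep -i 'libgrass_' || true
}

# Run the actual checks; call this by hand from inside a GRASS session.
# GISBASE is set by GRASS at startup; the v.clean path assumes the
# usual $GISBASE/bin layout. On some systems ldconfig lives in /sbin
# and may need to be called with its full path.
check_grass_libs() {
    echo "GRASS libraries known to the dynamic linker:"
    ldconfig -p | filter_grass_libs
    echo "GRASS libraries v.clean actually resolves to:"
    ldd "$GISBASE/bin/v.clean" | filter_grass_libs
    echo "GDAL/OGR drivers mentioning GRASS:"
    { gdalinfo --formats; ogrinfo --formats; } 2>/dev/null | grep -i grass || true
}
```

If the paths printed for v.clean point to a different GRASS installation than the one just compiled, that would support the stale-library guess.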

As a last resort, I can offer to try it myself. I can give you an FTP address where you can upload that beast, and then I will see if I can reproduce the error. Or you can give me a link to download the vector. If the data are sensitive, you could remove the attribute table first. I would use grass65 on 64-bit Fedora 10 Linux.

Regards,

Markus M
