[GRASS-user] v.clean process killed itself!?

Attempting to clean a very big vector map (after patching and before
dissolving) ended without success. It's about all CORINE tiles to form
the European-wide land cover map.

The process had been running for 2 days and then killed itself for
reasons I don't understand. The machine I currently work with has 4GB of
RAM and 8GB of swap, not to mention plenty of free hard disk space.

# the map is big!
GRASS 6.4.svn (corine):/geo/grassdb/europe/corine/PERMANENT/vector > ls
-lah corine

total 2.7G
drwxr-xr-x 2 nik nik 72 2009-01-06 09:59 .
drwxr-xr-x 715 nik nik 24K 2009-01-06 08:56 ..
-rw-r--r-- 1 nik nik 109M 2009-01-06 10:01 cidx
-rw-r--r-- 1 nik nik 1.9G 2009-01-06 10:01 coor
-rw-r--r-- 1 nik nik 58 2009-01-06 10:01 dbln
-rw-r--r-- 1 nik nik 191 2009-01-06 10:01 head
-rw-r--r-- 1 nik nik 12K 2009-01-06 08:56 hist
-rw-r--r-- 1 nik nik 647M 2009-01-06 09:59 topo

# cleaning...
GRASS 6.4.svn (corine):~ > v.clean corine out=corine_clean
tool=snap,break,rmdupl thresh=.01

WARNING: 'vector/corine' was found in more mapsets (also found in
         <PERMANENT>)
WARNING: Using <corine@nik>
--------------------------------------------------
Tool: Threshold
Snap vertices: 1.000000e-02
Break: 0.000000e+00
Remove duplicates: 0.000000e+00
--------------------------------------------------
WARNING: 'vector/corine' was found in more mapsets (also found in
         <PERMANENT>)
WARNING: Using <corine@nik>
Copying vector lines...
WARNING: Table <corine_clean> already exists in database
         </geo/grassdb/europe/corine/nik/sqlite.db>
WARNING: Unable to copy table <corine_clean>
WARNING: Failed to copy attribute table to output map
Rebuilding parts of topology...
Building topology for vector map <corine_clean>...
Registering primitives...
7491252 primitives registered
122361691 vertices registered
Number of nodes: 5740813
Number of primitives: 7491252
Number of points: 0
Number of lines: 0
Number of boundaries: 5526642
Number of centroids: 1964610
Number of areas: -
Number of isles: -
--------------------------------------------------
Tool: Snap line to vertex in threshold
w
Killed
GRASS 6.4.svn (corine):~ > w
00:02:58 up 2 days, 1:55, 2 users, load average: 16.82, 10.29, 5.17
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
nik tty7 :0 Tue22 2days 1:43
0.64s /usr/bin/gnome-
nik pts/0 :0.0 Tue22 27.00s 0.16s 0.00s w
GRASS 6.4.svn (corine):~ >

- The "w" was typed by me accidentally, but I assume it has little to
do with the process being killed.
- Additionally, I was checking the process's status from time to time
via "top".

What on earth killed this "important" process? How should one go about
cleaning this map?

Thanks, Nikos

On Fri, Jan 9, 2009 at 6:27 AM, Nikos Alexandris
<nikos.alexandris@felis.uni-freiburg.de> wrote:

Attempting to clean a very big vector map (after patching and before
dissolving) ended without success. It's about all CORINE tiles to form
the European-wide land cover map.

The process was running for 2 days now and killed itself for reasons I
don't understand. The machine I currently work with has 4GB of RAM and
8GB of swap memory, not to mention the free hard disk space.

# the map is big!

I see two potential reasons:
- large file support needed (--enable-largefile, did you use it?)
- looking at the number of vertices, perhaps the process runs
  out of memory (check 'dmesg' if you are on Linux). In this case
  adding more swap space may suffice.
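[Editor's note: Markus's second check can be scripted. A minimal sketch, assuming a Linux system and the classic syslog-style format shown later in this thread, that filters kernel messages for OOM-killer activity (in practice you would feed it the output of `dmesg` or the contents of /var/log/messages):

```python
import re

# Matches "oom-killer", "oom_kill_process", "Out of memory", etc.
OOM_PATTERN = re.compile(r"oom.kill|out of memory", re.IGNORECASE)

def find_oom_lines(text):
    """Return the log lines that mention the OOM killer."""
    return [line for line in text.splitlines() if OOM_PATTERN.search(line)]

# Sample excerpt in the format of /var/log/messages:
sample = """\
Jan 9 00:02:40 vertical kernel: [179692.069294] main-menu invoked oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
Jan 9 00:02:40 vertical kernel: [179692.069312] [<ffffffff802af87a>] oom_kill_process+0x9a/0x230
Jan 9 00:27:46 vertical -- MARK --
"""
for line in find_oom_lines(sample):
    print(line)
```

If the filter turns up lines like these, the kernel killed the process to reclaim memory.]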

GRASS 6.4.svn (corine):/geo/grassdb/europe/corine/PERMANENT/vector > ls
-lah corine

total 2.7G
drwxr-xr-x 2 nik nik 72 2009-01-06 09:59 .
drwxr-xr-x 715 nik nik 24K 2009-01-06 08:56 ..
-rw-r--r-- 1 nik nik 109M 2009-01-06 10:01 cidx
-rw-r--r-- 1 nik nik 1.9G 2009-01-06 10:01 coor
-rw-r--r-- 1 nik nik 58 2009-01-06 10:01 dbln
-rw-r--r-- 1 nik nik 191 2009-01-06 10:01 head
-rw-r--r-- 1 nik nik 12K 2009-01-06 08:56 hist
-rw-r--r-- 1 nik nik 647M 2009-01-06 09:59 topo

# cleaning...
GRASS 6.4.svn (corine):~ > v.clean corine out=corine_clean
tool=snap,break,rmdupl thresh=.01

WARNING: 'vector/corine' was found in more mapsets (also found in
        <PERMANENT>)
WARNING: Using <corine@nik>
--------------------------------------------------
Tool: Threshold
Snap vertices: 1.000000e-02
Break: 0.000000e+00
Remove duplicates: 0.000000e+00
--------------------------------------------------
WARNING: 'vector/corine' was found in more mapsets (also found in
        <PERMANENT>)
WARNING: Using <corine@nik>
Copying vector lines...
WARNING: Table <corine_clean> already exists in database
        </geo/grassdb/europe/corine/nik/sqlite.db>
WARNING: Unable to copy table <corine_clean>
WARNING: Failed to copy attribute table to output map
Rebuilding parts of topology...
Building topology for vector map <corine_clean>...
Registering primitives...
7491252 primitives registered
122361691 vertices registered
Number of nodes: 5740813
Number of primitives: 7491252
Number of points: 0
Number of lines: 0
Number of boundaries: 5526642
Number of centroids: 1964610
Number of areas: -
Number of isles: -
--------------------------------------------------
Tool: Snap line to vertex in threshold
w
Killed

...

-Additionally, I was watching from time to time the process's status via
"top".

how much RAM/SWAP was used?

Markus

Hi Markus.
Below more details upon the subject.

On Fri, 2009-01-09 at 09:36 +0100, Markus Neteler wrote:

On Fri, Jan 9, 2009 at 6:27 AM, Nikos Alexandris
<nikos.alexandris@felis.uni-freiburg.de> wrote:
> Attempting to clean a very big vector map (after patching and before
> dissolving) ended without success. It's about all CORINE tiles to form
> the European-wide land cover map.
>
> The process was running for 2 days now and killed itself for reasons I
> don't understand. The machine I currently work with has 4GB of RAM and
> 8GB of swap memory, not to mention the free hard disk space.
>
> # the map is big!

I see two potential reasons:
- large file support needed (--enable-largefile, did you use it?)
- looking at the number of vertices, perhaps the process runs
  out of memory (check 'dmesg' if you are on Linux). In this case
  adding more swap space may suffice.

- Large file support is there. With the configuration I use to compile
the source code [1], LFS is reported as "Yes" [2].

- So, is it good practice to expand the swap space then (to even more
than 8GB)? By how much? Is there a method to practically estimate the
needed space based on a big map's size?

  I believe that the relevant message confirms your suspicion, Markus.
See [3].

[...]

> -Additionally, I was watching from time to time the process's status via
> "top".

how much RAM/SWAP was used?

I have to admit that I was watching the %MEM column and not VIRT
(which, I suppose, includes the swap usage). The %MEM, for as long as I
kept an eye on it, was between 25% and 39%, and at some point it quickly
reached 82-83%.

Anyhow, is this the only way to go about it? To repeat, all I really
want is a clean vector before dissolving.

Markus

Kind regards, Nikos

=============================================================================
[1] CFLAGS="-g -Wall" LDFLAGS="-s" ./configure \
     --enable-64bit \
     --with-libs=/usr/lib64 \
     --with-cxx \
     --with-freetype=yes \
     --with-freetype-includes="/usr/include/freetype2/" \
     --with-postgres=no \
     --with-sqlite=yes \
     --enable-largefile=yes \
     --with-tcltk-includes="/usr/include/tcl8.4/" \
     --with-freetype-includes=/usr/include/freetype2 \
     --with-opengl-libs=/usr/include/GL \
     --with-readline \
     --with-python=yes \
     --with-proj-share=/usr/local/share/proj/ \
     --with-wxwidgets \
     --with-cairo \
     --with-ffmpeg=yes --with-ffmpeg-includes="/usr/include/ffmpeg/"
-----------------------------------------------------------------------------
[2] GRASS is now configured for: x86_64-unknown-linux-gnu

  Source directory: /usr/local/src/grass6_devel
  Build directory: /usr/local/src/grass6_devel
  Installation directory: ${prefix}/grass-6.4.svn
  Startup script in directory: ${exec_prefix}/bin
  C compiler: gcc -g -Wall
  C++ compiler: c++ -g -O2
  Building shared libraries: yes
  64bit support: yes
  OpenGL platform: X11
  MacOSX application: no

  NVIZ: yes

  BLAS support: no
  C++ support: yes
  Cairo support: yes
  DWG support: no
  FFMPEG support: yes
  FFTW support: yes
  FreeType support: yes
  GDAL support: yes
  GLw support: no
  JPEG support: yes
  LAPACK support: no
  Large File support (LFS): yes
  Motif support: no
  MySQL support: no
  NLS support: no
  ODBC support: no
  OGR support: yes
  OpenGL support: yes
  PNG support: yes
  PostgreSQL support: no
  Python support: yes
  Readline support: yes
  SQLite support: yes
  Tcl/Tk support: yes
  wxWidgets support: yes
  TIFF support: yes
  X11 support: yes
-------------------------------------------------------------------------
[3] Taken from /var/log/messages

Jan 8 23:47:46 vertical -- MARK --
Jan 9 00:02:40 vertical kernel: [179692.069123] type=1503
audit(1231455736.104:5): operation="capable" name="sys_admin" pid=6911
profile="/usr/sbin/cupsd"
Jan 9 00:02:40 vertical kernel: [179692.069135] type=1503
audit(1231455736.104:6): operation="capable" name="sys_resource"
pid=6911 profile="/usr/sbin/cupsd"
Jan 9 00:02:40 vertical kernel: [179692.069141] type=1503
audit(1231455736.104:7): operation="capable" name="sys_rawio" pid=6911
profile="/usr/sbin/cupsd"
Jan 9 00:02:40 vertical kernel: [179692.069294] main-menu invoked
oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0
Jan 9 00:02:40 vertical kernel: [179692.069299] Pid: 6911, comm:
main-menu Tainted: P 2.6.27-9-generic #1
Jan 9 00:02:40 vertical kernel: [179692.069301]
Jan 9 00:02:40 vertical kernel: [179692.069302] Call Trace:
Jan 9 00:02:40 vertical kernel: [179692.069312] [<ffffffff802af87a>]
oom_kill_process+0x9a/0x230
Jan 9 00:02:40 vertical kernel: [179692.069316] [<ffffffff802afd5f>] ?
select_bad_process+0xef/0x130
Jan 9 00:02:40 vertical kernel: [179692.069319] [<ffffffff802aff35>]
out_of_memory+0x195/0x270
Jan 9 00:02:40 vertical kernel: [179692.069323] [<ffffffff802b2d09>]
__alloc_pages_internal+0x4d9/0x520
Jan 9 00:02:40 vertical kernel: [179692.069328] [<ffffffff802d58cd>]
alloc_pages_current+0xad/0x110
Jan 9 00:02:40 vertical kernel: [179692.069334] [<ffffffff802ac617>]
__page_cache_alloc+0x67/0x80
Jan 9 00:02:40 vertical kernel: [179692.069338] [<ffffffff802b646c>]
__do_page_cache_readahead+0xec/0x220
Jan 9 00:02:40 vertical kernel: [179692.069341] [<ffffffff802ac710>] ?
sync_page+0x0/0x70
Jan 9 00:02:40 vertical kernel: [179692.069346] [<ffffffff80267090>] ?
wake_bit_function+0x0/0x50
Jan 9 00:02:40 vertical kernel: [179692.069349] [<ffffffff802b6603>]
do_page_cache_readahead+0x63/0x90
Jan 9 00:02:40 vertical kernel: [179692.069353] [<ffffffff802add4a>]
filemap_fault+0x34a/0x430
Jan 9 00:02:40 vertical kernel: [179692.069357] [<ffffffff802c2174>]
__do_fault+0x64/0x440
Jan 9 00:02:40 vertical kernel: [179692.069360] [<ffffffff802c310e>]
handle_mm_fault+0x1ee/0x470
Jan 9 00:02:40 vertical kernel: [179692.069366] [<ffffffff805053cf>]
do_page_fault+0x34f/0x750
Jan 9 00:02:40 vertical kernel: [179692.069370] [<ffffffff802cda0e>] ?
free_pages_and_swap_cache+0x8e/0xb0
Jan 9 00:02:40 vertical kernel: [179692.069374] [<ffffffff80234059>] ?
__phys_addr+0x9/0x50
Jan 9 00:02:40 vertical kernel: [179692.069380] [<ffffffff803a7c78>] ?
__up_write+0x68/0x140
Jan 9 00:02:40 vertical kernel: [179692.069384] [<ffffffff80502a7a>]
error_exit+0x0/0x70
Jan 9 00:02:40 vertical kernel: [179692.069386]
Jan 9 00:02:40 vertical kernel: [179692.069387] Mem-Info:
Jan 9 00:02:40 vertical kernel: [179692.069389] Node 0 DMA per-cpu:
Jan 9 00:02:40 vertical kernel: [179692.069392] CPU 0: hi: 0,
btch: 1 usd: 0
Jan 9 00:02:40 vertical kernel: [179692.069394] CPU 1: hi: 0,
btch: 1 usd: 0
Jan 9 00:02:40 vertical kernel: [179692.069396] Node 0 DMA32 per-cpu:
Jan 9 00:02:40 vertical kernel: [179692.069399] CPU 0: hi: 186,
btch: 31 usd: 174
Jan 9 00:02:40 vertical kernel: [179692.069401] CPU 1: hi: 186,
btch: 31 usd: 185
Jan 9 00:02:40 vertical kernel: [179692.069403] Node 0 Normal per-cpu:
Jan 9 00:02:40 vertical kernel: [179692.069405] CPU 0: hi: 186,
btch: 31 usd: 172
Jan 9 00:02:40 vertical kernel: [179692.069407] CPU 1: hi: 186,
btch: 31 usd: 175
Jan 9 00:02:40 vertical kernel: [179692.069411] Active:583277
inactive:376079 dirty:0 writeback:0 unstable:0
Jan 9 00:02:40 vertical kernel: [179692.069412] free:5383 slab:17065
mapped:182 pagetables:10765 bounce:0
Jan 9 00:02:40 vertical kernel: [179692.069415] Node 0 DMA free:9576kB
min:16kB low:20kB high:24kB active:0kB inactive:0kB present:8448kB
pages_scanned:0 all_unreclaimable? yes
Jan 9 00:02:40 vertical kernel: [179692.069419] lowmem_reserve: 0
2968 3976 3976
Jan 9 00:02:40 vertical kernel: [179692.069424] Node 0 DMA32
free:9980kB min:6016kB low:7520kB high:9024kB active:1868404kB
inactive:1040896kB present:3039764kB pages_scanned:6881497
all_unreclaimable? yes
Jan 9 00:02:40 vertical kernel: [179692.069429] lowmem_reserve: 0 0
1008 1008
Jan 9 00:02:40 vertical kernel: [179692.069433] Node 0 Normal
free:1976kB min:2040kB low:2548kB high:3060kB active:464704kB
inactive:463532kB present:1032192kB pages_scanned:2199846
all_unreclaimable? yes
Jan 9 00:02:40 vertical kernel: [179692.069438] lowmem_reserve: 0 0 0
0
Jan 9 00:02:40 vertical kernel: [179692.069442] Node 0 DMA: 2*4kB 4*8kB
6*16kB 3*32kB 4*64kB 3*128kB 4*256kB 1*512kB 3*1024kB 2*2048kB 0*4096kB
= 9576kB
Jan 9 00:02:40 vertical kernel: [179692.069453] Node 0 DMA32: 993*4kB
1*8kB 1*16kB 3*32kB 4*64kB 2*128kB 3*256kB 1*512kB 0*1024kB 0*2048kB
1*4096kB = 9980kB
Jan 9 00:02:40 vertical kernel: [179692.069464] Node 0 Normal: 34*4kB
2*8kB 4*16kB 1*32kB 1*64kB 1*128kB 2*256kB 0*512kB 1*1024kB 0*2048kB
0*4096kB = 1976kB
Jan 9 00:02:40 vertical kernel: [179692.069475] 1245 total pagecache
pages
Jan 9 00:02:40 vertical kernel: [179692.069476] 0 pages in swap cache
Jan 9 00:02:40 vertical kernel: [179692.069479] Swap cache stats: add
3126186, delete 3126186, find 729339/846430
Jan 9 00:02:40 vertical kernel: [179692.069481] Free swap = 0kB
Jan 9 00:02:40 vertical kernel: [179692.069483] Total swap = 7813468kB
Jan 9 00:02:40 vertical kernel: [179692.088001] 1048576 pages RAM
Jan 9 00:02:40 vertical kernel: [179692.088003] 40728 pages reserved
Jan 9 00:02:40 vertical kernel: [179692.088005] 837 pages shared
Jan 9 00:02:40 vertical kernel: [179692.088006] 1000697 pages
non-shared
Jan 9 00:27:46 vertical -- MARK --

Nikos Alexandris wrote:

Attempting to clean a very big vector map (after patching and before
dissolving) ended without success. It's about all CORINE tiles to form
the European-wide land cover map.

The process was running for 2 days now and killed itself for reasons I
don't understand. The machine I currently work with has 4GB of RAM and
8GB of swap memory, not to mention the free hard disk space.

# the map is big!
GRASS 6.4.svn (corine):/geo/grassdb/europe/corine/PERMANENT/vector > ls -lah corine

-rw-r--r-- 1 nik nik 1.9G 2009-01-06 10:01 coor

Ouch.

-rw-r--r-- 1 nik nik 647M 2009-01-06 09:59 topo

Ouch.

For vectors, you should assume that not only will you need to store
the entire map in memory, but the in-memory version may be
significantly larger than the underlying files due to the need to
store additional information related to the processing.

# cleaning...
GRASS 6.4.svn (corine):~ > v.clean corine out=corine_clean tool=snap,break,rmdupl thresh=.01
Killed

-The "w" was pressed by me accidentally. But I assume that it has little
to do with the process being killed.

Correct.

-Additionally, I was watching from time to time the process's status via
"top".

What on earth killed this "important" process?

It was most likely terminated due to excessive resource usage, either
for exceeding its own specified limits (see "ulimit -a") or for
depleting system-wide resources to the extent that the kernel killed
it to protect overall system integrity.

How should one go about cleaning this map?

On a 64-bit system with a lot of RAM. A 32-bit system limits each
process to a 4GiB address space, some of which is reserved.

--
Glynn Clements <glynn@gclements.plus.com>

On Fri, 2009-01-09 at 15:57 +0000, Glynn Clements wrote:

Nikos Alexandris wrote:

> Attempting to clean a very big vector map (after patching and before
> dissolving) ended without success. It's about all CORINE tiles to form
> the European-wide land cover map.
>
> The process was running for 2 days now and killed itself for reasons I
> don't understand. The machine I currently work with has 4GB of RAM and
> 8GB of swap memory, not to mention the free hard disk space.
>
> # the map is big!
> GRASS 6.4.svn (corine):/geo/grassdb/europe/corine/PERMANENT/vector > ls -lah corine

> -rw-r--r-- 1 nik nik 1.9G 2009-01-06 10:01 coor

Ouch.

> -rw-r--r-- 1 nik nik 647M 2009-01-06 09:59 topo

Ouch.

For vectors, you should assume that not only will you need to store
the entire map in memory, but the in-memory version may be
significantly larger than the underlying files due to the need to
store additional information related to the processing.

Two "Ouch"es = not good :-)

Any smart workaround for working with the European-wide CORINE?

> # cleaning...
> GRASS 6.4.svn (corine):~ > v.clean corine out=corine_clean tool=snap,break,rmdupl thresh=.01
> Killed

[...]

> What on earth killed this "important" process?

It was most likely terminated due to excessive resource usage, either
for exceeding its own specified limits (see "ulimit -a") or for
depleting system-wide resources to the extent that the kernel killed
it to protect overall system integrity.

> How should on go about and clean this map?

On a 64-bit system with a lot of RAM. A 32-bit system limits each
process to a 4GiB address space, some of which is reserved.

:-)

My system is 64-bit (Core Duo, 2.53 GHz) and I run Ubuntu Intrepid 64-bit.
Unfortunately I can't install more RAM.

Will more swap space help?

Thanks Glynn.

Nikos Alexandris wrote:

> > How should on go about and clean this map?
>
> On a 64-bit system with a lot of RAM. A 32-bit system limits each
> process to a 4GiB address space, some of which is reserved.

:-)

My system is 64-bit, CoreDuo 2,53GHz, and I run Ubuntu Intrepid 64-bit.
Unfortunately I can't install more RAM.

Will more swap space help?

I doubt it.

If the process died because it tried to allocate more (virtual) memory
but failed, you would have gotten an error message from G_malloc().

The "Killed" message indicates that it was terminated (by SIGKILL) due
to either exceeding an explicit resource limit (you could try
increasing those limits, probably by editing /etc/security/limits.conf
then logging in again) or by consuming too much physical RAM (in this
case, you should see an "OOM" message from the kernel in the logs).

If it died because it exceeded the virtual memory limit (ulimit -v) or
the data segment limit (ulimit -d), then more swap would allow it to
run (but if the process's resident set exceeds physical RAM and starts
using swap, it will run too slowly to be of any use).

So long as you already have some swap, adding more won't reduce
physical RAM usage. So if it died due to exceeding the RSS limit
(ulimit -m) or the kernel's OOM (out-of-memory) killer, more swap
won't help.
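[Editor's note: the limits Glynn names can be inspected programmatically as well as with `ulimit -a`. A small illustrative sketch using Python's standard `resource` module (Unix only); the mapping of ulimit flags to RLIMIT constants follows the conventional correspondence:

```python
import resource

# Mapping of the ulimit flags mentioned above to resource constants:
#   ulimit -v  ->  RLIMIT_AS    (virtual memory / address space)
#   ulimit -d  ->  RLIMIT_DATA  (data segment)
#   ulimit -m  ->  RLIMIT_RSS   (resident set size)
LIMITS = {
    "virtual memory (ulimit -v)": resource.RLIMIT_AS,
    "data segment (ulimit -d)": resource.RLIMIT_DATA,
    "resident set size (ulimit -m)": resource.RLIMIT_RSS,
}

def describe(limit):
    """Format the soft/hard values of one resource limit."""
    soft, hard = resource.getrlimit(limit)
    fmt = lambda v: "unlimited" if v == resource.RLIM_INFINITY else str(v)
    return f"soft={fmt(soft)} hard={fmt(hard)}"

for name, limit in LIMITS.items():
    print(f"{name}: {describe(limit)}")
```

If any of these show a finite soft limit smaller than the job's expected footprint, raising it (e.g. via /etc/security/limits.conf) is worth trying before blaming the OOM killer.]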

--
Glynn Clements <glynn@gclements.plus.com>

On Fri, Jan 9, 2009 at 10:07 PM, Glynn Clements
<glynn@gclements.plus.com> wrote:

Nikos Alexandris wrote:

> > How should on go about and clean this map?
>
> On a 64-bit system with a lot of RAM. A 32-bit system limits each
> process to a 4GiB address space, some of which is reserved.

:-)

My system is 64-bit, CoreDuo 2,53GHz, and I run Ubuntu Intrepid 64-bit.
Unfortunately I can't install more RAM.

Will more swap space help?

I doubt it.

I wonder if we have a memory leak in the vector library.
Compare
http://trac.osgeo.org/grass/ticket/14

If it is there but small, normal usage won't trigger it in a significant
way. But this huge map would.

Markus

On Fri, 2009-01-09 at 21:07 +0000, Glynn Clements wrote:

Nikos Alexandris wrote:

> > > How should on go about and clean this map?

This question remains. Of course, only with respect to a clean European
CORINE map.

Fortunately, for now I need to sample out some rectangles. I guess I'll
have to perform v.clean + v.dissolve after v.overlay.

[...]

> My system is 64-bit, CoreDuo 2,53GHz, and I run Ubuntu Intrepid 64-bit.
> Unfortunately I can't install more RAM.
> Will more swap space help?

I doubt it.

If the process died because it tried to allocate more (virtual) memory
but failed, you would have gotten an error message from G_malloc().

The "Killed" message indicates that it was terminated (by SIGKILL) due
to either exceeding an explicit resource limit (you could try
increasing those limits, probably by editing /etc/security/limits.conf
then logging in again) or by consuming too much physical RAM (in this
case, you should see an "OOM" message from the kernel in the logs).

If it died because it exceeded the virtual memory limit (ulimit -v) or
the data segment limit (ulimit -d), then more swap would allow it to
run (but if the process's resident set exceeds physical RAM and starts
using swap, it will run too slowly to be of any use).

So long as you already have some swap, adding more won't reduce
physical RAM usage. So if it died due to exceeding the RSS limit
(ulimit -m) or the kernel's OOM (out-of-memory) killer, more swap
won't help.

Glynn, at the end of a previous post of mine [1] I pasted the part
of /var/log/messages which I believe corresponds to the "Kill" of
v.clean. I read "oom" in two lines:

#1: Jan 9 00:02:40 vertical kernel: [179692.069294] main-menu invoked
oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0

#2: Jan 9 00:02:40 vertical kernel: [179692.069312]
[<ffffffff802af87a>] oom_kill_process+0x9a/0x230

So RAM probably was (and is?) the problem.
---

[1] http://lists.osgeo.org/pipermail/grass-user/2009-January/048231.html

Nikos Alexandris wrote:

> So long as you already have some swap, adding more won't reduce
> physical RAM usage. So if it died due to exceeding the RSS limit
> (ulimit -m) or the kernel's OOM (out-of-memory) killer, more swap
> won't help.

Glynn, in the end of a previous post of mine [1] I have pasted the part
of /var/log/messages which I believe corresponds to the "Kill" of
v.clean. I read "oom" in two lines:

#1: Jan 9 00:02:40 vertical kernel: [179692.069294] main-menu invoked oom-killer: gfp_mask=0x1201d2, order=0, oomkilladj=0

#2: Jan 9 00:02:40 vertical kernel: [179692.069312] [<ffffffff802af87a>] oom_kill_process+0x9a/0x230

So RAM was(is?) the problem probably.

It certainly looks like it.

The above is definitely the OOM-killer in action, and the timing
(relative to the "w" command) suggests that it's likely to be related.

--
Glynn Clements <glynn@gclements.plus.com>

Markus Neteler wrote:

>> > > How should on go about and clean this map?
>> >
>> > On a 64-bit system with a lot of RAM. A 32-bit system limits each
>> > process to a 4GiB address space, some of which is reserved.
>>
>> :-)
>>
>> My system is 64-bit, CoreDuo 2,53GHz, and I run Ubuntu Intrepid 64-bit.
>> Unfortunately I can't install more RAM.
>>
>> Will more swap space help?
>
> I doubt it.

I wonder if we have a memory leak in the vector library.
Compare
http://trac.osgeo.org/grass/ticket/14

If it is there but small, normal usage won't trigger it in a significant
way. But this huge map would.

Possibly. But with a map this large, you don't need a leak. The raw
data will barely fit into memory, and any per-vertex, per-edge etc
data could easily push it over the limit.

AFAICT from the output and the code, it's dying in Vect_snap_lines().

Looking into it more, I don't think that it's a leak; I just think
that it's trying to store an "expanded" (i.e. bloated) version of a
2.7GiB map in RAM on a system which only has 4GiB.

E.g. for each line vertex, it stores a bounding rectangle (actually a
cube, 6 doubles, 48 bytes). If there are 122 million vertices and only
~2 million are centroids, that could be 120 million line segments,
which would be ~5.4GiB.

Then there's the vertices themselves, and it's storing a significant
fraction of those at 2*8+4 = 20 bytes each, which could consume
anything up to 2.4GiB (the extra 4 bytes per vertex accounts for the
difference to the size of the "coor" file).

Add onto that the additional data used for e.g. the current boundary
(which could be most of the map if it's a long, detailed stretch of
intricate coastline), new vertices created during snapping, other
housekeeping data etc and it could easily exceed RAM.
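[Editor's note: Glynn's arithmetic can be checked directly. A quick sketch using the counts from the v.clean output earlier in the thread, treating non-centroid primitives as a rough upper bound on line segments:

```python
# Back-of-envelope check of the figures above (GiB = 2**30 bytes).
GIB = 2 ** 30

vertices = 122_361_691       # "122361691 vertices registered"
centroids = 1_964_610        # "Number of centroids: 1964610"
segments = vertices - centroids  # roughly ~120 million line segments

bbox_bytes = 48              # bounding rectangle: 6 doubles per segment
coord_bytes = 20             # 2 doubles + 4 bytes per stored vertex

bbox_total = segments * bbox_bytes / GIB
coord_total = vertices * coord_bytes / GIB

print(f"bounding boxes: ~{bbox_total:.1f} GiB")          # ~5.4 GiB
print(f"vertex storage: up to ~{coord_total:.1f} GiB")   # ~2.3 GiB (about 2.4 GB)
```

These two items alone already approach twice the 4 GB of RAM on the machine in question, before any housekeeping data is counted.]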

--
Glynn Clements <glynn@gclements.plus.com>

On Sat, 2009-01-10 at 08:07 +0000, Glynn Clements wrote:

Markus Neteler wrote:
> >> > On a 64-bit system with a lot of RAM. A 32-bit system limits each
> >> > process to a 4GiB address space, some of which is reserved.

> >> My system is 64-bit, CoreDuo 2,53GHz, and I run Ubuntu Intrepid 64-bit.
> >> Unfortunately I can't install more RAM.
> >>
> >> Will more swap space help?
> >
> > I doubt it.
>
> I wonder if we have a memory leak in the vector library.
> Compare
> http://trac.osgeo.org/grass/ticket/14
>
> If it is there bit small normal usage won't trigger it in a significant
> way. But this huge map would do.

Possibly. But with a map this large, you don't need a leak. The raw
data will barely fit into memory, and any per-vertex, per-edge etc
data could easily push it over the limit.

AFAICT from the output and the code, it's dying in Vect_snap_lines().

Looking into it more, I don't think that it's a leak; I just think
that it's trying to store an "expanded" (i.e. bloated) version of a
2.7GiB map in RAM on a system which only has 4GiB.

E.g. for each line vertex, it stores a bounding rectangle (actually a
cube, 6 doubles, 48 bytes). If there are 122 million vertices and only
~2 million are centroids, that could be 120 million line segments,
which would be ~5.4GiB.

Then there's the vertices themselves, and it's storing a significant
fraction of those at 2*8+4 = 20 bytes each, which could consume
anything up to 2.4GiB (the extra 4 bytes per vertex accounts for the
difference to the size of the "coor" file).

Add onto that the additional data used for e.g. the current boundary
(which could be most of the map if it's a long, detailed stretch of
intricate coastline), new vertices created during snapping, other
housekeeping data etc and it could easily exceed RAM.

In other words, one needs a powerful *workhorse* to work with very big
maps. May I add to the osgeo-wiki [1] something like:

Very big vector maps (raster?) (>2GB?) require a maximal amount of RAM
(>=6GB?).

---
[1] http://wiki.osgeo.org/wiki/GIS_workstation_setup_tips

On Sat, Jan 10, 2009 at 9:07 AM, Glynn Clements
<glynn@gclements.plus.com> wrote:

Markus Neteler wrote:
Possibly. But with a map this large, you don't need a leak. The raw
data will barely fit into memory, and any per-vertex, per-edge etc
data could easily push it over the limit.

AFAICT from the output and the code, it's dying in Vect_snap_lines().

Looking into it more, I don't think that it's a leak; I just think
that it's trying to store an "expanded" (i.e. bloated) version of a
2.7GiB map in RAM on a system which only has 4GiB.

E.g. for each line vertex, it stores a bounding rectangle (actually a
cube, 6 doubles, 48 bytes). If there are 122 million vertices and only
~2 million are centroids, that could be 120 million line segments,
which would be ~5.4GiB.

Related:
Remove bounding box from support structures (?)
http://trac.osgeo.org/grass/browser/grass/trunk/doc/vector/TODO#L89

Then there's the vertices themselves, and it's storing a significant
fraction of those at 2*8+4 = 20 bytes each, which could consume
anything up to 2.4GiB (the extra 4 bytes per vertex accounts for the
difference to the size of the "coor" file).

Would it be possible to develop a (rough) formula to estimate
the memory needed? With Thomas Huld we did so for the new
r.sun, and it's quite useful for picking the right computer before
launching a multi-day job (in case you have a choice, of
course).
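[Editor's note: no official formula exists (Glynn notes below that it varies with the data), but a rough estimator can be derived from the per-item figures he gives in this thread: 48 bytes per segment bounding box plus 20 bytes per vertex, padded for housekeeping. Purely illustrative; the padding factor is an assumption, not a GRASS constant:

```python
def estimate_snap_memory_gib(n_vertices, n_centroids=0,
                             bbox_bytes=48, vertex_bytes=20, overhead=1.5):
    """Very rough peak-memory estimate for v.clean tool=snap, in GiB.

    Based only on the per-segment bounding boxes and per-vertex
    coordinates discussed in this thread; `overhead` pads for
    housekeeping data, new snap vertices, etc. Illustrative, not
    an official GRASS formula.
    """
    segments = max(n_vertices - n_centroids, 0)
    raw = segments * bbox_bytes + n_vertices * vertex_bytes
    return overhead * raw / 2 ** 30

# The CORINE map from this thread:
print(f"~{estimate_snap_memory_gib(122_361_691, 1_964_610):.0f} GiB")
```

For the CORINE map this comes out at roughly 11 GiB, well beyond the 4 GB machine used here, which is consistent with the OOM kill.]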

Add onto that the additional data used for e.g. the current boundary
(which could be most of the map if it's a long, detailed stretch of
intricate coastline), new vertices created during snapping, other
housekeeping data etc and it could easily exceed RAM.

Is this along the lines of the suggestion to break long lines?

http://trac.osgeo.org/grass/browser/grass/trunk/doc/vector/TODO#L242
242 v.in.ogr
243 --------
244 It would be useful to split long boundaries to smaller
245 pieces. Otherwise cleaning process can become very slow because
246 bounding box of long boundaries can overlap large part of the map (for
247 example outline around all areas) and cleaning process is checking
248 intersection with all boundaries falling in the bounding box.

I wonder how hard that is to implement (since we have the
v.split algorithm).

Markus

Nikos Alexandris wrote:

In other words one needs a powerful *workhorse* to work with very big
maps.

It depends upon the task. For vector maps, it looks that way. Not all
modules will have such requirements, but being unable to perform
snapping could be a problem.

Most (but not all) raster modules can handle arbitrarily large maps,
with memory consumption proportional to the size of a row rather than
the size of the entire map.

May I add in the osgeo-wiki [1] something like:

Very big vector maps (raster?) (>2GB?) require maximal amount of RAM
(>=6GB?).

It isn't quite that simple. E.g. maps with many small areas will have
relatively more centroids, which affect the size of the coordinate
file but don't affect the memory usage of "v.clean tool=snap".

--
Glynn Clements <glynn@gclements.plus.com>

Markus Neteler wrote:

Would it be possible to develop a (rough) formula to estimate
the memory need? With Thomas Huld we did so for the new
r.sun and it's quite useful to pick the right computer before
launching a multiple day job (in case you have a choice of
course).

The problem here is that it may vary with the data. I still don't
understand the algorithm all that well.

> Add onto that the additional data used for e.g. the current boundary
> (which could be most of the map if it's a long, detailed stretch of
> intricate coastline), new vertices created during snapping, other
> housekeeping data etc and it could easily exceed RAM.

Is this along the lines of the suggestion to break long lines?

http://trac.osgeo.org/grass/browser/grass/trunk/doc/vector/TODO#L242
242 v.in.ogr
243 --------
244 It would be useful to split long boundaries to smaller
245 pieces. Otherwise cleaning process can become very slow because
246 bounding box of long boundaries can overlap large part of the map (for
247 example outline around all areas) and cleaning process is checking
248 intersection with all boundaries falling in the bounding box.

I wonder how hard that is to implement (since we have the
v.split algorithm).

This would make matters worse for snapping, as it will increase the
number of vertices.

--
Glynn Clements <glynn@gclements.plus.com>

On Sun, 2009-01-11 at 12:32 +0000, Glynn Clements wrote:

Nikos Alexandris wrote:

> In other words one needs a powerful *workhorse* to work with very big
> maps.

It depends upon the task. For vector maps, it looks that way. Not all
modules will have such requirements, but being unable to perform
snapping could be a problem.

Most (but not all) raster modules can handle arbitrarily large maps,
with memory consumption proportional to the size of a row rather than
the size of the entire map.

> May I add in the osgeo-wiki [1] something like:
>
> Very big vector maps (raster?) (>2GB?) require maximal amount of RAM
> (>=6GB?).

It isn't quite that simple. E.g. maps with many small areas will have
relatively more centroids, which affect the size of the coordinate
file but don't affect the memory usage of "v.clean tool=snap".

So it's a (somewhat) complicated issue. Thank you for all the details.
Kind regards, Nikos