#2045: r.to.vect: use less memory
------------------------------+---------------------------------------------
Reporter: mlennert | Owner: grass-dev@…
Type: enhancement | Status: new
Priority: normal | Milestone: 7.0.0
Component: Raster | Version: svn-trunk
Keywords: r.to.vect memory | Platform: Unspecified
Cpu: Unspecified |
------------------------------+---------------------------------------------
Trying to convert a raster map to vector areas on a machine with 10 GB of RAM, the process was killed after having used up all memory and swap.

The CELL raster in question is an output of i.segment; it is quite large and has many small segments, i.e. many different raster values:
{{{
Rows: 53216
Columns: 49184
Total Cells: 2617375744
}}}
Although many of these cells are null:
{{{
total null cells: 2061717280
non-null cells: 555658464
}}}
It would be nice if r.to.vect could handle such large files without using
up so much memory.
Comment (by mmetz):

Replying to [ticket:2045 mlennert]:
> Trying to convert a raster map to vector areas on a machine with 10 GB of RAM, the process was killed after having used up all memory and swap.

There was a memory leak in r.to.vect, fixed in trunk r57281.
Comment (by mlennert):
Replying to [comment:1 mmetz]:
> There was a memory leak in r.to.vect, fixed in trunk r57281.
(and r57281)
Thanks! I now get through the "Extracting areas..." part without the process being killed. I still see a continuous increase in memory usage, though, up to 88.5% at the end of the extraction stage. Is this normal?

Now it's busy writing areas...
Comment (by mlennert):
Replying to [comment:2 mlennert]:
> Now it's busy writing areas...

And memory usage continued to increase. The process was killed during the
"Registering primitives..." step.
Comment (by mmetz):
Replying to [comment:3 mlennert]:
> And memory usage continued to increase. The process was killed during the "Registering primitives..." step.

You can try to set the environment variable GRASS_VECTOR_LOWMEM before
running r.to.vect. GRASS_VECTOR_LOWMEM reduces the amount of memory used
by vector topology structures (the spatial index is built on disk).
IOW, r.to.vect might use quite a bit of memory, which is difficult to change, and the vector topology structures also need memory (in g7 much less than in g6).
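For reference, a minimal sketch of how this can be tried from a GRASS Python session; the map names below are made up for illustration, and GRASS_VECTOR_LOWMEM only needs to be present in the environment before r.to.vect starts:
{{{#!python
import os
import grass.script as gs

# Ask the vector library to keep the spatial index on disk instead of in RAM
# (see the comment above); the value itself does not matter.
os.environ["GRASS_VECTOR_LOWMEM"] = "1"

# Hypothetical map names, just for illustration.
gs.run_command("r.to.vect",
               input="segments",        # large CELL raster, e.g. i.segment output
               output="segments_areas",
               type="area")
}}}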
Comment (by mlennert):
Replying to [comment:4 mmetz]:
> IOW, r.to.vect might use quite a bit of memory, which is difficult to change, and the vector topology structures also need memory (in g7 much less than in g6).

So you think the continuous increase up to 88.5% during the area extraction is normal?
I'll try the GRASS_VECTOR_LOWMEM option on Monday.
Just brainstorming:
I'm not sure I completely understand the module's workings, but IIUC, it systematically goes through rows and columns and checks whether pixel values change. But does that mean it reads all areas into memory before writing them? Would it be feasible / of interest to write each area immediately after its boundaries have been identified, and then free the memory again?
I could even imagine a solution where the module works with segments of the map, writes out the areas identified in a segment, then goes on to the next segment, and at the end uses a v.dissolve equivalent to merge all neighboring areas with identical values. Unless the v.dissolve step takes a lot of memory, I could imagine that at least the first part could work with low memory consumption, couldn't it?
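To make the idea a bit more concrete, here is a rough, untested sketch of how such a split-and-dissolve workflow could look from the user side with existing modules (g.region, r.to.vect, v.patch, v.dissolve). All map names and the number of tiles are made up, and category/attribute handling across the patched tiles would need checking:
{{{#!python
import grass.script as gs

INPUT = "segments"      # hypothetical CELL raster to vectorize
NTILES = 4              # split the current region into NTILES x NTILES windows

reg = gs.region()
n, s = float(reg["n"]), float(reg["s"])
e, w = float(reg["e"]), float(reg["w"])
dy = (n - s) / NTILES
dx = (e - w) / NTILES

tiles = []
for i in range(NTILES):
    for j in range(NTILES):
        out = "tile_%d_%d" % (i, j)
        # Restrict the computational region to one window, aligned to the
        # raster cells, and vectorize only that window.
        gs.run_command("g.region",
                       n=n - i * dy, s=n - (i + 1) * dy,
                       w=w + j * dx, e=w + (j + 1) * dx,
                       align=INPUT)
        gs.run_command("r.to.vect", input=INPUT, output=out, type="area")
        tiles.append(out)

# Restore the full region, merge the per-tile results and remove the
# artificial boundaries between areas that carry the same raster value.
# (Categories/attributes of the patched tiles may need extra care.)
gs.run_command("g.region", raster=INPUT)
gs.run_command("v.patch", flags="e", input=tiles, output="tiles_patched")
gs.run_command("v.dissolve", input="tiles_patched",
               output="areas_dissolved", column="value")
}}}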
Comment (by martin):
Hah, I tried to vectorize all of NLCD2011, which is 16832104560 cells in size, in one go and found this to be a highly impractical approach. The process (GRASS 7.1 trunk of late 2014) ran out of memory on a machine with more than 100 GB of main memory, and GRASS_VECTOR_LOWMEM didn't make any substantial difference.[[BR]]
Finally I decided to cut the job into many small tiles of approx. 33M cells each (which corresponds to approx. 2x2 degrees in North American AEA), and this turned out to work pretty well, not only in the vectorization but also in the subsequent vector post-processing stage.
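For the record, a sketch of roughly what such a tiling step can look like with r.tile; the tile size and map names are only illustrative, and the per-tile results can then be patched and dissolved as in the sketch in the earlier comment:
{{{#!python
import grass.script as gs

INPUT = "nlcd2011"   # hypothetical name of the imported NLCD raster
TILE = 6000          # tile edge length in cells, i.e. roughly 36M cells per tile

# Cut the raster into physical tiles (the output names get a row/column
# suffix appended to the given prefix).
gs.run_command("r.tile", input=INPUT, output="nlcd_tile",
               width=TILE, height=TILE)

# Vectorize each tile with the region matched to that tile.
for tile in gs.list_strings(type="raster", pattern="nlcd_tile*"):
    out = tile.split("@")[0].replace("-", "_") + "_areas"
    gs.run_command("g.region", raster=tile)
    gs.run_command("r.to.vect", input=tile, output=out, type="area")
}}}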