So I checked with current CVS and the same problem still applies to r.to.vect.
"r.to.vect -z feature=point input=dem_5 output=dem_5_pt" eats up all 1GB RAM +
1GB SWAP at about 5 000 000 points.
Andrew Danner's fix for v.in.ascii mentioned above is great stuff, but the
r.to.vect problem remains (my bug report was only about r.to.vect; a few
days later Hamish changed the subject, as the v.in.ascii issue popped up
during the discussion).
Is it possible that r.to.vect suffers from a similar problem as v.in.ascii
did, so a similar fix would do? Andrew?
Maciek
-------------------------------------------- Managed by Request Tracker
My initial guess is that r.to.vect suffers from a similar bug/feature
to the one that plagued v.in.ascii a while ago: the building of vector
topology. As it stands now, r.to.vect calls Vect_build after processing
all the features, and this is the same function call that ate all the
memory in v.in.ascii. The solution for v.in.ascii was to add another
flag, "-b", to skip topology building in points mode. The rest of the
r.to.vect code looks pretty clean and I don't immediately expect memory
leaks. Radim has said many times that there are no leaks in the
Vect_build code, but that the memory requirements for topology
building are high. Without looking into the Vect_build code, I tend to
believe Radim, so if you want to extract 5 000 000 points from a raster
you will need to skip the topology. Note that many vector modules are not
able to use vector layers without topology (v.surf.rst being the primary
exception), so the "-b" flag is more of a workaround than a long-term
solution.
I haven't had a chance to look into the Vect_build code and see if
there is a way to reduce memory usage. Is there any white paper or
technical specs on how the new vector library is organized and what the
vector topology looks like?
-Andy
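For a sense of scale, the figures in the report above can be turned into a rough per-point cost. The sketch below is Python and rests on two assumptions that are not in the report: that "1GB RAM + 1GB SWAP" means about 2 GiB was consumed by r.to.vect alone, and that memory use grows linearly with the number of points. The function names are made up for illustration.

```python
# Rough per-point memory estimate from the figures reported in this thread.
# Assumption: "1GB RAM + 1GB SWAP" is read as 2 GiB used by r.to.vect alone.

GIB = 1024 ** 3


def bytes_per_point(total_bytes: int, n_points: int) -> float:
    """Average memory consumed per point at the moment memory ran out."""
    return total_bytes / n_points


def projected_gib(n_points: int, per_point: float) -> float:
    """Memory needed for n_points if usage scales linearly (an assumption)."""
    return n_points * per_point / GIB


per_point = bytes_per_point(2 * GIB, 5_000_000)
print(f"about {per_point:.0f} bytes per point")
print(f"20 000 000 points: about {projected_gib(20_000_000, per_point):.1f} GiB")
```

Under those assumptions the report works out to roughly 430 bytes per point, so a raster with four times as many cells would need about 8 GiB to get through the same step.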
On Wed, Jul 05, 2006 at 03:09:02PM -0400, Andrew Danner wrote:
> ...
> I haven't had a chance to look into the Vect_build code and see if
> there is a way to reduce memory usage. Is there any white paper or
> technical specs on how the new vector library is organized and what the
> vector topology looks like?
there is a document here (part of the programmer's manual):
That is what I could extract from Radim and sketch up.
It's generated from the (partially) doxygenized source code.
Markus
On Wed, 2006-07-05 at 20:39 +0200, Maciek Sieczka via RT wrote:
> Markus wrote:
>
> > New patch submitted, see
> >
> > https://intevation.de/rt/webrt?serial_num=3354&display=History
> >
> > Does it solve the problem?
--
Markus Neteler <neteler itc it> http://mpa.itc.it/markus/
ITC-irst - Centro per la Ricerca Scientifica e Tecnologica
MPBA - Predictive Models for Biol. & Environ. Data Analysis
Via Sommarive, 18 - 38050 Povo (Trento), Italy
I first thought that your v.in.ascii fix enabled v.in.ascii to load
huge datasets *without* skipping the topology. After your clarification
I now see that was my mistake. Sorry for the confusion.
Although I realize how important it is for many of us to be able to
load huge point datasets in any way possible for now, such as this
no-topology hack, I hope that one day there will be a real solution that
lets GRASS process such big datasets in the normal, topological
way. The no-topology hack will probably not be suitable for
anything besides points, and we probably can't expect every single
vector module to be extended to support both non-topological and
topological vectors, also because there are GIS operations which
simply require a topological data model. The limit of a few 10^6
features is a serious limitation of the current GRASS vector engine.
I wouldn't consider the bug solved, even regarding v.in.ascii alone.
But I really do appreciate all your effort towards making as much
as possible of v.in.ascii for the moment. Thank you.
Maciek
> So I checked with current CVS and the same problem still applies to
> r.to.vect.
> "r.to.vect -z feature=point input=dem_5 output=dem_5_pt" eats up all
> 1GB RAM + 1GB SWAP at about 5 000 000 points.
> The above mentioned Andrew Danner's fix for v.in.ascii is great stuff
> but r.to.vect problem remains (in my bug report I was complaining
> about only r.to.vect, few days later Hamish changed the subject, as
> v.in.ascii issue popped up during discussion).
> Is it possible that r.to.vect suffers from a similar problem as
> v.in.ascii did, so a similar fix would do? Andrew?
does it happen during the "building lines" (or "registering lines"?)
step?
(watch the memory use using 'top' in another xterm, use "M" to sort by
memory use)
if so, it's the same problem as v.in.ascii building topology.
I added a -b flag to r.to.vect (in CVS) to skip building topology for
this reason. Only tested with raster cells->vector points in mind.
(r.in.xyz -> r.to.vect -> v.surf.rst)
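A script that drives r.to.vect could decide per map whether to pass -b. The following Python sketch only assembles the argument list and does not run GRASS; the helper name r_to_vect_args and the cell-count threshold are hypothetical, while -z, -b, feature=, input= and output= are taken from the messages in this thread.

```python
# Sketch: assemble an r.to.vect command line, adding -b for large point sets.
# The helper name and the default threshold are hypothetical. The flags and
# options themselves are the ones quoted in this thread.

def r_to_vect_args(raster, vector, n_cells, skip_topology_above=1_000_000):
    """Return the argument list for a raster-cells-to-points conversion."""
    args = ["r.to.vect", "-z"]          # -z as used in the original report
    if n_cells > skip_topology_above:
        args.append("-b")               # -b: do not build topology
    args += ["feature=point", f"input={raster}", f"output={vector}"]
    return args


args = r_to_vect_args("dem_5", "dem_5_pt", n_cells=5_000_000)
print(" ".join(args))
```

The threshold is deliberately a parameter: the point count at which topology building stops fitting in memory depends on the machine, and the output of a -b run is only usable by modules that accept vectors without topology.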
On Fri, 7 Jul 2006 01:19:42 +1200
Hamish <hamish_nospam@yahoo.com> wrote:
> does it happen during the "building lines" (or "registering lines"?)
> step?
Yes.
> if so, it's the same problem as v.in.ascii building topology.
> I added a -b flag to r.to.vect (in CVS) to skip building topology for
> this reason. Only tested with raster cells->vector points in mind.
> (r.in.xyz -> r.to.vect -> v.surf.rst)
I'm not sure if this is the right way to go. If we proceed this way then
v.proj, v.in.*, v.perturb, v.to.points and others would require the
same. Do we want that? Double standards will be confusing, especially for
newcomers. Shouldn't the vector engine be fixed instead, so that it does
not use all the memory? Every no-topology hack will reduce the chance of
a real solution.
(On the other hand, sure, I will bless your "r.to.vect -b", having no
other solution handy. But really, this is not a sustainable approach.)
Maciek
In principle I agree, in practice I am willing to compromise.
The -b flag is a temporary work-around until we have a better solution.
A pure solution is nice, but may take time and we have deadlines to
meet.
Or stated another way, I know enough of the vector code to add a -b flag
but not enough to rewrite the engine to fix the underlying problem. So I
add a -b flag and agree that a better solution is needed.
I was never very clear on this, but have an idea that topology is
meaningless for data which is only points (no tree; only bounding box
matters?). If so, the (correct) solution becomes much easier.
... nor me.
*If* topology is meaningless for point data, then we could add a test
in Vect_build() to
- check if only points are present in the map,
- if so, skip the topology creation.
A similar test would be needed in the Vect_open() routine. Here the
question is whether we can check beforehand that the map contains only
points and then ignore the topology (skip Vect_open_topo() in Vectlib?)
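The two-step test proposed above might look roughly like this. It is a Python sketch of the control flow only and does not use the GRASS vector library: the feature type names, only_points() and build() are all hypothetical, and whether skipping is acceptable still hinges on the open question of whether topology carries any meaning for point data.

```python
# Sketch of the proposed test: build topology only if the map contains
# something other than points. All names are hypothetical stand-ins and none
# of this calls the real vector library.

POINT, LINE, BOUNDARY, CENTROID = "point", "line", "boundary", "centroid"


def only_points(feature_types):
    """True if every feature is a point (an empty map also counts)."""
    return all(t == POINT for t in feature_types)


def build(feature_types, build_topology):
    """Call build_topology() unless the map is points-only."""
    if only_points(feature_types):
        return "skipped"
    build_topology()
    return "built"


calls = []
print(build([POINT] * 5, lambda: calls.append("topo")))      # points only
print(build([POINT, LINE], lambda: calls.append("topo")))    # mixed map
```

The check itself needs one pass over the feature types, and all() stops at the first non-point, so it costs little next to the topology build it may avoid. The same predicate would have to be evaluated when the map is opened, so that readers know not to expect topology.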
This is not generally a good solution - I will get back to this when I
have more time.
It is good to read Radim's document about what to do with the vector
format before engaging further in this discussion - Maciek, please read
it - that will give you a better idea of what is going on.
Helena
This is certainly a good idea. May I suggest that someone picks all the
pieces from the various (Radim et al.) emails and creates a Wiki page out
of that?
Markus - the better way would be to add the document that he has written
about the next steps for vector support as a reference in
http://mpa.itc.it/markus/grass61progman/Vector_Library.html
He has identified scalability as a main issue for vector support and
suggests some solutions (I believe it is the building of the spatial
index, which is needed for topology building but potentially for other
things too, that needs to be modified - but I really don't want to go
into this without reading it again).
As for the emails - most of it is just repeating the same thing over and
over (I am starting to be like Radim), although I have posted Radim's
suggestion on how to modify v.info and v.to.rast that has not been
implemented yet and that might be useful (maybe add it to Radim's
document).
--
Helena Mitasova
Department of Marine, Earth and Atmospheric Sciences
North Carolina State University
1125 Jordan Hall
NCSU Box 8208
Raleigh, NC 27695-8208 http://skagit.meas.ncsu.edu/~helena/
email: hmitaso@unity.ncsu.edu
ph: 919-513-1327 (no voicemail)
fax 919 515-7802
On Fri, Jul 07, 2006 at 11:54:34AM -0400, Helena Mitasova wrote:
> As for the emails - most of it is just repeating the same thing over and
> over (I am starting to be like Radim), although I have posted Radim's
> suggestion on how to modify v.info and v.to.rast that has not been
> implemented yet and that might be useful (maybe add it to Radim's
> document)
Agreed - add it to Radim's document. At least a document.
Currently the info is scattered around and hard to find.
> This is certainly a good idea. May I suggest that someone picks all
> the pieces from the various (Radim et al.) emails and creates a Wiki
> page out of that?
It would be good to keep Radim's comments quoted, versus merging Radim's
comments with my half-guesses etc.