[GRASS-dev] [INFO] Line simplification algorithm

FWIW, I seem to remember a thread on your list about the line
simplification algorithm, and the need of implementation of
Douglas/Peuker kind of algorithm.

ISTR too that when I used GRASS GPL (back in 2003), v.prune(1) was doing
its job the worst way possible (making local simplification, with a
sliding pair, i.e. cumulating errors by comparing a vertex against the
previous one and discarding the previous if it was in the threshold, and
then comparing with next, leading to the disappearance of the curves).

Looking at what I have in KerGIS (that is the original program with Dave
Gerdes implementation) I'm puzzled... since this _is_ a Douglas/Peuker
kind of algorithm with some slight modifications that make sense.

So if you want it, just go back to the origin (adapting to your new
vector engine).

One question remains: why and who discard a working program to replace
it with a disaster?

To be honest, working with the original sources, I know who I can trust
(developer or programmer name)---and I know I have (had) only
cosmetic work to fix the program---and who I can highly distrust---and
who has made such a poor job that is eating my time not in proportion
to his contribution (I will have spent 90% of the time fixing 10%
of the sources and know I will have to redo later everything from
scratch). Blunders may occur---been there, done that. But there
are very suspicious blunders...

YMMV,
--
Thierry Laronde (Alceste) <tlaronde +AT+ polynum +dot+ com>
                 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C

On Mon, Nov 20, 2006 at 04:59:18PM +0100, tlaronde@polynum.com wrote:

Looking at what I have in KerGIS (that is the original program with Dave
Gerdes implementation) I'm puzzled... since this _is_ a Douglas/Peuker
kind of algorithm with some slight modifications that make sense.

To be technically more accurate, I have to explain what I refer to as a
"kind of Douglas/Peuker algorithm".

Strictly speaking it is _not_ a Douglas/Peuker algorithm, since this
very algorithm starts with the both ends of the line and works by
dichotomy. It is very efficient but costly since its big-Oh an
arrangement of the number of vertices.

Dave Gerdes's algorithm is O(n) and is accurate for what it was
designed for: pruning extra vertices from data coming from human digitizing.
It is what we can call a "tangential" algorithm, mimicking what a
careful and slow human digitizing could do.

It is a "kind" of same algorithm from a high level overview: trying to
rectify a curve by seing if the vertices are in a given width corridor,
with non-local (neighbouring) considerations (it does not consider
adjacent pairs, or triplets).

The original algorithm will discard less points than a Douglas/Peuker,
but shall be efficient for digitized data. Less efficient for vectorized
(raster to vector) due to the noise induced by pixelization.
--
Thierry Laronde (Alceste) <tlaronde +AT+ polynum +dot+ com>
                 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C

tlaronde polynum.com wrote:

FWIW, I seem to remember a thread on your list about the line
simplification algorithm, and the need of implementation of
Douglas/Peuker kind of algorithm.

ISTR too that when I used GRASS GPL (back in 2003), v.prune(1) was
doing its job the worst way possible (making local simplification,
with a sliding pair, i.e. cumulating errors by comparing a vertex
against the previous one and discarding the previous if it was in the
threshold, and then comparing with next, leading to the disappearance
of the curves).

"v.prune" in GRASS 5 is essentially unchanged since moving into CVS in
1999:

http://freegis.org/cgi-bin/viewcvs.cgi/grass/src/mapdev/v.prune/v.prune.c.diff?r1=1.1&r2=1.3

(and the diff between GRASS 4.3 and 5.4 versions attached; not much
there either)

In GRASS 6 there is

vector/v.clean/prune.c by Radim Blazek
  which calls
lib/vector/Vlib/line.c
  Vect_line_prune() # removes duplicate verticies
  Vect_line_prune_thresh() # this calls dig_prune()

which leads to:

lib/vector/diglib/prune.c by Michel Wurtz for GRASS 4.2.1
  great header comments, Dave Gerdes is a "probable" in the copyright,
   but.."by Michel Wurtz" ??
  "This is a complete rewriting of the previous dig_prune subroutine."

there was a thread a couple of months ago about GRASS 6's v.clean pruning
problems and the meaning of thresh:
  http://thread.gmane.org/gmane.comp.gis.grass.user/15726

more on GRASS 4.3's version:

grass43/src/mapdev/v.prune/CHANGES:
-----------------------------------------------------------------
So for the benefit of the grass community, there is a new
dig_prune function, commented (in bad english), that should
replace src421/src/mapdev/diglib/prune.c (and not only for
linux version, the code is no linux dependant)

Also edited: src421/src/mapdev/v.prune/v.prune.c, replacing
the comparaison "==" by "<" in line 114 (test for Threshold != 0)
A null value just eliminates duplicate points.

The code is shorter, but the source bigger (many comments).

Michel Wurtz mw@engees.u-strasbg.fr Feb 18, 1998
-------------------------------------------------------------------

and these files-

$ grass43/src$ find | grep prune
./general/g.help/help/Commands.def/vprune.def
./general/g.help/help/17.manual/Help.pages/v.prune
./mapdev/libes/old_dev/prune.c # /* @(#)prune.c 2.1 6/26/87 */
                                     +/* added Feb 26, 1987 -dpg */
                                     +dig_prune() and dlg_prune()

./mapdev/diglib/prune.c # /* @(#)prune.c 3.0 2/19/98 */
                                      /* by Michel Wurtz for GRASS 4.2.1
./mapdev/diglib/prune.c.veryold # /* @(#)prune.c 2.1 6/26/87 */
                                        (without Feb '87 updates)

./mapdev/v.prune
./mapdev/v.prune/CHANGES # cropped section above
./mapdev/v.prune/Gmakefile
./mapdev/v.prune/prune.c # by Dave Gerdes 6/89
./mapdev/v.prune/v.prune.c # see attached diff to grass5
/*
** - Written by Dave Gerdes 6/89
** US Army Construction Engineering Research Lab
** - New parser installed, 12/90 David Stigberg
** - corrected line 115 (and rewritten ../diglib/prune.c) 2/98 Michel Wurtz
** - added v.support check line 233 Markus Neteler 7/98
*/

./mapdev/OLDlib/prune.c # /* @(#)prune.c 2.1 6/26/87 */
                      # bugfix better than ./mapdev/diglib/prune.c.veryold
./mapdev/v.out.moss/prune_points.c
./xgrass/xclip/v.prune
./xgrass/xhelp/17.manual/Help.pages/v.prune
./xgrass/xhelp/Commands.def/vprune.def
./tcltkgrass/module/v.prune

which version have you used?

Perhaps some insights in there, please let us know!

Looking at what I have in KerGIS (that is the original program with
Dave Gerdes implementation) I'm puzzled... since this _is_ a
Douglas/Peuker kind of algorithm with some slight modifications that
make sense.

So if you want it, just go back to the origin (adapting to your new
vector engine).

I'm not sure how far we are from there actually?

One question remains: why and who discard a working program to replace
it with a disaster?

maybe something in the above,

To be honest, working with the original sources, I know who I can
trust (developer or programmer name)---and I know I have (had) only
cosmetic work to fix the program---and who I can highly distrust---and
who has made such a poor job that is eating my time not in proportion
to his contribution (I will have spent 90% of the time fixing 10%
of the sources and know I will have to redo later everything from
scratch). Blunders may occur---been there, done that. But there
are very suspicious blunders...

You don't have to name the guilty if you don't want to :), but who do
you consider to be really good?

Hamish

(attachments)

vprune_43_54.diff (6.6 KB)

On Tue, Nov 21, 2006 at 03:31:52PM +1300, Hamish wrote:

"v.prune" in GRASS 5 is essentially unchanged since moving into CVS in
1999:

http://freegis.org/cgi-bin/viewcvs.cgi/grass/src/mapdev/v.prune/v.prune.c.diff?r1=1.1&r2=1.3

I'm talking about the main pruning routing dig_prune() to be found in
VECTORLIB.

Since dig_prune dated back 1987, and was unchanged during 10 years it is
unlikely that it was not correctly doing its work.

BTW, since I'm using memories from 3 years ago (I haven't used GRASS
GPL since 5.0) am I confusing v.prune with another module (v.clean)?
But I don't think so.
I think I used v.rm.dangles (or something like that), v.prune (and drop
it since the curves disappeared) and v.spag (that was not doing as much
work as it shall do).

I attach the original version of dig_prune.

grass43/src/mapdev/v.prune/CHANGES:
-----------------------------------------------------------------
So for the benefit of the grass community, there is a new
dig_prune function, commented (in bad english), that should
replace src421/src/mapdev/diglib/prune.c (and not only for
linux version, the code is no linux dependant)

Also edited: src421/src/mapdev/v.prune/v.prune.c, replacing
the comparaison "==" by "<" in line 114 (test for Threshold != 0)
A null value just eliminates duplicate points.

The code is shorter, but the source bigger (many comments).

Michel Wurtz mw@engees.u-strasbg.fr Feb 18, 1998
-------------------------------------------------------------------

That's puzzled me, since the 4.1 version is short, commented and make
sense. Does something happen between 4.1 and 4.2, that is without a
track in a CVS (not here?). Did someone mess with the code previously to
this change?
The "not only for linux version" is strange: there is nothing system
dependant in the original.

You don't have to name the guilty if you don't want to :), but who do
you consider to be really good?

OK :slight_smile: That's quite simple: you can trust the CERL (Michael Shapiro,
James Westervelt, Dave Gerdes). There has been programs in alpha coming
from outside the CERL. In this case...

BTW, some alpha programs for vector (namely v.spag and v.cutter) were
buggy. But this was not due to the incompetence of the programmer (well:
developer). This is an historic consequence: Dave Gerdes was in a hurry
just before the team was disbanded, and the alpha remained alpha.

FWIW, I have fixed v.spag(1). There were two main problems:

1) To be sure that the program ends, you must ensure that a line is
whether unchanged but never touched again, or split but resulting in
another line (not the same or you will enter an infinite loop). There
were cases where lines were cut in two new lines, a null length one, and
the remaining, that is the same as the previous;

2) Nodes created were not correctly snapped to existing node, leading to
the creation of several nodes with the exact same coordinates =>
visually, lines were connected, but not to the same node so without
snapping nodes areas could not be created.

There were, too, missing handling of colinear cases.

v.cutter(1) is an ambitious complex monster. I have chosen to replace it
with an extended version of v.spag(1) , since v.spag is trying to cut a
map agains itself when you think at it that is it's a particular case of
v.cutter.

Cheers,
--
Thierry Laronde (Alceste) <tlaronde +AT+ polynum +dot+ com>
                 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C

(attachments)

prune.c (4.24 KB)

tlaronde@polynum.com wrote:

On Tue, Nov 21, 2006 at 03:31:52PM +1300, Hamish wrote:
>
> "v.prune" in GRASS 5 is essentially unchanged since moving into CVS
> in 1999:
> http://freegis.org/cgi-bin/viewcvs.cgi/grass/src/mapdev/v.prune/v.prune.c.diff?r1=1.1&r2=1.3

I'm talking about the main pruning routing dig_prune() to be found in
VECTORLIB.

Since dig_prune dated back 1987, and was unchanged during 10 years it
is unlikely that it was not correctly doing its work.

Other bugs have survived that long... unlikely but still possible.

BTW, since I'm using memories from 3 years ago (I haven't used GRASS
GPL since 5.0) am I confusing v.prune with another module (v.clean)?
But I don't think so.
I think I used v.rm.dangles (or something like that), v.prune (and
drop it since the curves disappeared) and v.spag (that was not doing
as much work as it shall do).

v.clean in GRASS 6 collected v.prune, v.spag, v.rm.dangles, etc in a
single module.
e.g. v.prune -> "v.clean tool=prune"
     v.spag -> "v.clean tool=break"

I attach the original version of dig_prune.

/* $Id: prune.c,v 1.7 2006/10/31 14:23:31 tlaronde Exp $
* Copyright 2004 Thierry LARONDE <tlaronde@polynum.com>
* All rights reserved.
*
* See the COPYRIGHTS at the root of the source directory.
*
*!!!THIS SOFTWARE IS PROVIDED ``AS IS'' WITHOUT ANY WARRANTIES!!!
* USE IT AT YOUR OWN RISK
********************************************************************/
/* @(#)prune.c 2.1 6/26/87 */

is this in fact the "original version" ?

which is that? from GRASS 4.1 source code? which file?

the earliest I have is 4.3, do you have a link to the 4.1 source code?

The history page on the website says:
"26. October 1999 GRASS 5.0 released under GNU GPL (Baylor University
and Markus Neteler)"

but grass4.3 source code states GPL in the COPYRIGHT file.
I seem to remember Markus saying the full GPL code audit happened for
GRASS 5, but 4.3 source has a "src.nonGPL/" dir already.

What about 4.2 (Baylor)? Is that the first version that mentions the
GPL? The 4.1.5 Linux port?

?

grass43/src/mapdev$ find | grep prune.c
./libes/old_dev/prune.c
./diglib/prune.c
./diglib/prune.c.veryold
./v.prune/prune.c
./v.prune/v.prune.c
./OLDlib/prune.c

grass43/src/mapdev$ \
for FILE in `find | grep prune.c` ; do
   diff /tmp/Thierry_prune.c $FILE | wc -l
done

361 libes/old_dev/prune.c
389 diglib/prune.c
327 diglib/prune.c.veryold
250 v.prune/prune.c
803 v.prune/v.prune.c
319 OLDlib/prune.c

v.prune/prune.c is totally different but the smallest because both files
are small.

I think both versions would need to be run through indent before that
test will give good results.
(eg src/mapdev/OLDlib/prune.c seems similar to your attachment)

OLDlib/prune.c and diglib/prune.c.veryold are almost the same,
(prune.c.veryold has a bugfix that OLDlib doesn't)

# versioning not much help:
grass43$ grep -rI "2.1 6/26/87 " * | grep prune
src/mapdev/libes/old_dev/prune.c:/* @(#)prune.c 2.1 6/26/87 */
src/mapdev/diglib/prune.c.veryold:/* @(#)prune.c 2.1 6/26/87 */
src/mapdev/OLDlib/prune.c:/* @(#)prune.c 2.1 6/26/87 */
src/vector/libes/old_dev/prune.c:/* @(#)prune.c 2.1 6/26/87 */
src/vector/diglib/prune.c.veryold:/* @(#)prune.c 2.1 6/26/87 */
src/vector/OLDlib/prune.c:/* @(#)prune.c 2.1 6/26/87 */

> grass43/src/mapdev/v.prune/CHANGES:
> -----------------------------------------------------------------
> So for the benefit of the grass community, there is a new
> dig_prune function, commented (in bad english), that should
> replace src421/src/mapdev/diglib/prune.c (and not only for
> linux version, the code is no linux dependant)
>
> Also edited: src421/src/mapdev/v.prune/v.prune.c, replacing
> the comparaison "==" by "<" in line 114 (test for Threshold != 0)
> A null value just eliminates duplicate points.
>
> The code is shorter, but the source bigger (many comments).
>
> Michel Wurtz mw@engees.u-strasbg.fr Feb 18, 1998
> -------------------------------------------------------------------

That's puzzled me, since the 4.1 version is short, commented and make
sense. Does something happen between 4.1 and 4.2, that is without a
track in a CVS (not here?). Did someone mess with the code previously
to this change?

this is before any CVS control I think, it would depend on discipline of
devels to bump the version number. As we see with Michel Wurtz's changes
above, that didn't always happen.

the modern grass CVS started in Dec 1999 (AFAIK), just after the GPL
audit (AFAIK) and the birth of the GRASS 5 development.
In the 4.3 source there is a "src/mapdev/v.clean/tags" file, so some
earlier version control existed?

The "not only for linux version" is strange: there is nothing system
dependant in the original.

I think that statement is just a no-op.

FWIW, I have fixed v.spag(1). There were two main problems:

(note due to topological nature of GRASS 6 [it is valid for 3D roads
with a bridge not to cross] this takes more thought?)

In GRASS 6 (parts of) v.spag is in lib/vector/Vlib/intersect.c
Vect_segment_intersection()

1) To be sure that the program ends, you must ensure that a line is
whether unchanged but never touched again, or split but resulting in
another line (not the same or you will enter an infinite loop). There
were cases where lines were cut in two new lines, a null length one,
and the remaining, that is the same as the previous;

2) Nodes created were not correctly snapped to existing node, leading
to the creation of several nodes with the exact same coordinates =>
visually, lines were connected, but not to the same node so without
snapping nodes areas could not be created.

There were, too, missing handling of colinear cases.

diff vs GRASS 4.1? or maybe better simple ASCII vector test case?
(Please :slight_smile: there have been many fixes since then of course:
http://freegis.org/cgi-bin/viewcvs.cgi/grass6/lib/vector/Vlib/intersect.c

I seem to remember someone seeing #2 in GRASS 6 recently. v.segment?

v.cutter(1) is an ambitious complex monster. I have chosen to replace
it with an extended version of v.spag(1) , since v.spag is trying to
cut a map agains itself when you think at it that is it's a particular
case of v.cutter.

replaced in GRASS 6 by v.overlay and v.select. Radim did a very nice job
on these I think.

all for now,
Hamish

Hello Hamish,

On Thu, Nov 23, 2006 at 01:11:14PM +1300, Hamish wrote:

v.clean in GRASS 6 collected v.prune, v.spag, v.rm.dangles, etc in a
single module.
e.g. v.prune -> "v.clean tool=prune"
     v.spag -> "v.clean tool=break"

Well, in this case this is not v.clean since I only used 5.0. Well and I
remember that I was not totally satisfied reading a source from somebody
at [smallest, smallest DBA systems].

> I attach the original version of dig_prune.

/* $Id: prune.c,v 1.7 2006/10/31 14:23:31 tlaronde Exp $
* Copyright 2004 Thierry LARONDE <tlaronde@polynum.com>
* All rights reserved.
*
* See the COPYRIGHTS at the root of the source directory.
*
*!!!THIS SOFTWARE IS PROVIDED ``AS IS'' WITHOUT ANY WARRANTIES!!!
* USE IT AT YOUR OWN RISK
********************************************************************/
/* @(#)prune.c 2.1 6/26/87 */

is this in fact the "original version" ?

Yes, don't be misleaded by the header (here the header of the internal
private copy, replaced by the KerGIS Public Licence when appearing in
the public copy). It's just a boilerplate added to the file without
removing any original copyright information.

Here, this was nothing more than this:
/* @(#)prune.c 2.1 6/26/87 */

and it was in src/mapdev/diglib/prune.c in the original (I get the cpio
files form a link from grass.itc.it web site 3 years ago; I don't know
if the link is still there).

And I'm based on official 4.1, the latest CERL original sources I found.

So it seems there has been a whole story before GRASS get into a CVS
(there is a gap between orphaned CERL and reorganized development, I
think 4.3).

OLDlib/prune.c and diglib/prune.c.veryold are almost the same,
(prune.c.veryold has a bugfix that OLDlib doesn't)

I think that prune.c.veryold (for you) is the likely candidate, since
there are some bugfixes from OLDlib/prune.c to diglib/prune.c (a
comment, and the particular case for "all but last point in thresh").

this is before any CVS control I think, it would depend on discipline of
devels to bump the version number. As we see with Michel Wurtz's changes
above, that didn't always happen.

BTW, I was not criticizing Michel Wurtz's algorithm (I haven't look at
it precisely). I seemed to recall that the version I used was using a
blind orthogonal distance algorithm that was producing non sense. And
since there was discussion about the simplification algorithms I
connected the two and was surprised to find that the original algorithm
(that does not handle the particular case of x = constant, but this
should only overflow and lead to not pruning) was already a sensible
rectification approach.

It will be hard to track back, and perhaps the simplest is to see what
works and what doesn't. If v.prune(1) or another program does not
exhibit the behavior I seem to remember, well the discussion has some
historic interest but that's all :wink:

> FWIW, I have fixed v.spag(1). There were two main problems:

(note due to topological nature of GRASS 6 [it is valid for 3D roads
with a bridge not to cross] this takes more thought?)

KerGIS is still 2D and the intersection is for 2D at the present. In
fact 3D is just an extension (principles will remain the same). The
handling of grouping (still categories in GRASS GPL terminology I think)
has to be done (for use in the generalized v.cutter(1)).

>
> 2) Nodes created were not correctly snapped to existing node, leading
> to the creation of several nodes with the exact same coordinates =>
> visually, lines were connected, but not to the same node so without
> snapping nodes areas could not be created.
>
> There were, too, missing handling of colinear cases.

diff vs GRASS 4.1? or maybe better simple ASCII vector test case?
(Please :slight_smile: there have been many fixes since then of course:
http://freegis.org/cgi-bin/viewcvs.cgi/grass6/lib/vector/Vlib/intersect.c

I can give them of course but it will be slightly complicated by the
fact that I have fixed some problems in the VECTORLIB too. There were 3
distinct values doing snapping, and there was one value used to
compute the angle (which was a fault, leading to inconsistent
"enumeration" of lines connected, leading to areas not created when
they were indeed correct---you saw the problem in v.digit, when looking
at connected lines from a node [debug mode, 'd'] lines being highlighted
counterclockwise and suddenly a leap being made to a not consecutive
line).

I think that the simplest, if someone is interested, would be to compare
directly with the KerGIS sources.

I seem to remember someone seeing #2 in GRASS 6 recently. v.segment?

> v.cutter(1) is an ambitious complex monster. I have chosen to replace
> it with an extended version of v.spag(1) , since v.spag is trying to
> cut a map agains itself when you think at it that is it's a particular
> case of v.cutter.

replaced in GRASS 6 by v.overlay and v.select. Radim did a very nice job
on these I think.

Probably. Making the comparison between still a 4.1 format, and the new
vector engine of GRASS GPL for 6.x will be slightly complicated.

Cheers,
--
Thierry Laronde (Alceste) <tlaronde +AT+ polynum +dot+ com>
                 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C

tlaronde@polynum.com wrote on 11/23/2006 02:19 AM:
...

And I'm based on official 4.1, the latest CERL original sources I found.
So it seems there has been a whole story before GRASS get into a CVS
(there is a gap between orphaned CERL and reorganized development, I
think 4.3).
  
Right. CVS was started 31 Dec 1999 only.
BTW: Out of interest for you, but I received a copy of GRASS 3 (!)
recently and
added it to the server.

You will be interested in Radim's work to compare the various versions
of the
pre-CVS times:
http://mpa.itc.it/radim/g50history/

He made a file-by-file comparison, starting with 4.1.5.
Please report you findings if these pages are helpful for your study.

thanks,
Markus

Markus Neteler wrote on 11/23/2006 09:05 AM:

tlaronde@polynum.com wrote on 11/23/2006 02:19 AM:
...
  

And I'm based on official 4.1, the latest CERL original sources I found.
So it seems there has been a whole story before GRASS get into a CVS
(there is a gap between orphaned CERL and reorganized development, I
think 4.3).
  
Right. CVS was started 31 Dec 1999 only.
BTW: Out of interest for you, but I received a copy of GRASS 3 (!)
recently and
added it to the server.

You will be interested in Radim's work to compare the various versions
of the
pre-CVS times:
http://mpa.itc.it/radim/g50history/

He made a file-by-file comparison, starting with 4.1.5.
Please report you findings if these pages are helpful for your study.

The links to the old source code versions are here:
http://grass.itc.it/download/software_old.php

Markus