[GRASS5] dgtable vector library

Greg_Sepesi · September 9, 2003, 1:31am

Hello,

I've posted the dgtable vector library and demo app to

http://www.eduneer.com/gad/dg_src.tar.gz (2.0 Mbytes)

I'll keep it there a few weeks.

The README is listed below.

- - -

README
08 sep 2003
Greg Sepesi (sepesi@eduneer.com)

Definition
----------
dgtable: a directed graph structure based upon lookup tables

Directory Contents
------------------
dg: test data
dg/demo: test code for exercising dgtable's graphics, search, and
comp. geometry algorithms
dg/dgbezier: least squared error Bezier curve fitting (using Graphics
Gems, ISBN:0122861663)
dg/dgrtree: R-Tree code (using Toni Gutman, Michael Stonebrakier, and
Daniel Green's work)
dg/dgtable: dgtable structure and code

Demonstration
-------------
To run the dgtable demonstration, go to the dg/demo directory and
execute

make
./demo

Expected screen output:

building directed graph from '../hypso.ascii' ...
concatenating paths into a second directed graph ...
   1148 paths in 620 connected components before concatenation.
   729 paths in 620 connected components after.
writing and reading second directed graph ...
   4197896 bytes out; 4197896 bytes in
performing some computational geometry on points in path #5 (name:
'140.00') ...
   140 points in path; 11 points in its 2D convex hull.
   160 triangles in its delaunay triangulation.
fitting Bezier curves through concatenated paths ...
   85638 points in 729 paths before curve fitting.
   66594 points in 729 paths after.

Expected file output (see dg/demo/demo.c for context of data dumps):

trash.convexhull.txt
trash.curve3.svg
trash.delaunay.txt
trash.path2.dg
trash.path2.txt
trash.path.txt

Design Intent: a simpler dglib
------------------------------
Looking for an open-source, stand-alone directed graph library for a
Palm OS application, the GRASS 5.1 dglib vector library seemed ideal
even though there was work to be done on a few of dglib's modes.
Unfortunately after a couple days and several emails, I was not able to
understand enough of dglib to work on it. Two questions remained.

Why use a tree for vertex storage?
Why use two modes for RAM storage (i.e., FLAT and TREE)?

I then started writing dgtable using lookup tables instead of balanced
trees to store vertices, and using just one mode for RAM storage.

Implementation Results: simpler in some cases, but not in others
----------------------------------------------------------------
The interaction of multiple modes is difficult to document for code
designer(s) and unfortunately also difficult to learn for code
maintainers. However, as far as I can tell, in addition to dglib's two
modes for storing vertices (i.e., FLAT and TREE), it has two modes for
accessing vertices (i.e., TREE and CACHE) and three modes (referred to
as graph versions) for connecting them.

dgtable benefits from avoiding multiple modes. dgtable has no need for
a cache mode because it already efficiently accesses vertices in lookup
tables. dgtable also has only one storage mode and it is analogous to
dglib's FLAT mode.

On the other hand, the lookup tables of dgtable are not as dynamic as
the trees of dglib and this could be a big expense if an application's
directed graph frequently changes. I suppose a good compromise would be
to change table size in blocks, but I haven't implemented that yet.

Design
------
Referring to dg/dgtable/dgtable.h, a dgtable directed graph consists of

   - a table of paths (e.g., connected points along a road)
   - a table of vertices (i.e., unique path points)
   - a table of intersections (i.e., non-unique path points)
   - an R-Tree table that provides a spatial link to each 3 point
subsection of each path

Presently, the dgtable library applies this data structure in the
following directed graph algorithms (see dg/dgtable/dgsearch.c):

   - depth first search (uni- or bi-directional paths)
   - breadth first search (uni- or bi-directional paths)
   - minimum spanning tree (uni- or bi-directional paths)
   - shortest path search (uni- or bi-directional paths),

the following computational geometry algorithms (see
dg/dgtable/dggeo.c):

- 2D convex hull
- 2D Delaunay triangulation,

and the following graphics algorithm (see dg/dgbezier):

- least squared error Bezier curve fit.

- - -

I'm using dgtable in a couple projects. If there's interest in using it
in GRASS ver >= 5.1, I'll add the appropriate formatting and headers.

Greg

neteler · September 9, 2003, 9:51pm

Hi Greg,

On Mon, Sep 08, 2003 at 09:31:06PM -0400, Greg Sepesi wrote:

Hello,

I've posted the dgtable vector library and demo app to

http://www.eduneer.com/gad/dg_src.tar.gz (2.0 Mbytes)

I'll keep it there a few weeks.

Great - thanks for posting it!

The README is listed below.

[...]

I'm using dgtable in a couple projects. If there's interest in using it
in GRASS ver >= 5.1, I'll add the appropriate formatting and headers.

It would be really nice to compare functionality in GRASS ver >= 5.1.
If possible, yes, please add the appropriate formatting and headers.
At time the other DGLib seems to be a bit abandoned... (Roberto's
email address bounces).

We have some test cases (aka vector data) ready to try with your
implementation.

Best regards

Markus

Radim_Blazek · September 10, 2003, 10:11am

On Tuesday 09 September 2003 03:31, Greg Sepesi wrote:

Design Intent: a simpler dglib
------------------------------
Looking for an open-source, stand-alone directed graph library for a
Palm OS application, the GRASS 5.1 dglib vector library seemed ideal
even though there was work to be done on a few of dglib's modes.
Unfortunately after a couple days and several emails, I was not able to
understand enough of dglib to work on it.

I had the same problem, it is too complicated, this is realy the problem.

Two questions remained.

Why use a tree for vertex storage?
Why use two modes for RAM storage (i.e., FLAT and TREE)?

Roberto definitely had reasons for that, once he explained it to
me but I partially did not understand and partially forgot.
If I recall well FLAT was used to store graph on the disk while TREE
for computatio. Read FLAT and convert to TREE is faster than build
new TREE? Is it possible? (Not used in GRASS.)

   - depth first search (uni- or bi-directional paths)
   - breadth first search (uni- or bi-directional paths)
   - minimum spanning tree (uni- or bi-directional paths)
   - shortest path search (uni- or bi-directional paths),

Very important for most v.net* is 'short path cache (SPC)'. It means that
when it is necessary to calculate SP from one node to all/many others,
whole process is not repeated for each destination node and current
state of graph is stored until starting point is changed.
Performance with SPC is many times higher! Unfortunately there is
a bug in SPC in dglib. Is it SPC available in gdtable or could be implemented?

I'm using dgtable in a couple projects. If there's interest in using it
in GRASS ver >= 5.1, I'll add the appropriate formatting and headers.

There is interest, but I don't have time to work on that, do you think
that you could write alternative Vlib/net.c, put dgtable to GRASS CVS
and make it alternative graph library (conditional compilation)?
Later we can switch to dgtable as default.

Radim

Greg_Sepesi · September 10, 2003, 1:40pm

Radim Blazek wrote:

Roberto definitely had reasons for that, once he explained it to
me but I partially did not understand and partially forgot.
If I recall well FLAT was used to store graph on the disk while TREE
for computatio. Read FLAT and convert to TREE is faster than build
new TREE? Is it possible? (Not used in GRASS.)

Although writing the nodes to disk during a traversal of the TREE's
linked nodes would be simple, maybe the FLAT mode reduces file size.

> - depth first search (uni- or bi-directional paths)
> - breadth first search (uni- or bi-directional paths)
> - minimum spanning tree (uni- or bi-directional paths)
> - shortest path search (uni- or bi-directional paths),

Very important for most v.net* is 'short path cache (SPC)'. It means that
when it is necessary to calculate SP from one node to all/many others,
whole process is not repeated for each destination node and current
state of graph is stored until starting point is changed.
Performance with SPC is many times higher! Unfortunately there is
a bug in SPC in dglib. Is it SPC available in gdtable or could be implemented?

gdtable implements a O(V*log(V)) shortest path algorithm using the
priority-first-search described by Sedgewick (ISBN: 0201066734). I
think of the algorithm as a terrain being filled by a water source at
the starting point. The water flows to the lowest elevation (cost)
until it reaches the destination point. Each vertex keeps track of the
direction (i.e., dad vertex) water flowed into it ... allowing
recreation of the path back to the source. Presently the algorithm
stops when reaching the destination point, but it would be easy to "keep
filling" until covering the graph. I'm not sure what the cache is in
'short path cache', but dgtable stores the dad vertex in dgtable's
vertex table so that it remains available (for recreating the path back
to the source) until another dgtable search algorithm is executed.

> I'm using dgtable in a couple projects. If there's interest in using it
> in GRASS ver >= 5.1, I'll add the appropriate formatting and headers.

There is interest, but I don't have time to work on that, do you think
that you could write alternative Vlib/net.c, put dgtable to GRASS CVS
and make it alternative graph library (conditional compilation)?
Later we can switch to dgtable as default.

I'll take a look at Vlib/net.c sometime this weekend.

Greg

Radim_Blazek · September 11, 2003, 2:59pm

On Wednesday 10 September 2003 15:40, Greg Sepesi wrote:

gdtable implements a O(V*log(V)) shortest path algorithm using the
priority-first-search described by Sedgewick (ISBN: 0201066734). I
think of the algorithm as a terrain being filled by a water source at
the starting point. The water flows to the lowest elevation (cost)
until it reaches the destination point. Each vertex keeps track of the
direction (i.e., dad vertex) water flowed into it ... allowing
recreation of the path back to the source. Presently the algorithm
stops when reaching the destination point, but it would be easy to "keep
filling" until covering the graph.

It should stop when the destination point is reached, keep the status
and continue filling if new starting point is the same and destination is not yet
reached.

I'm not sure what the cache is in
'short path cache', but dgtable stores the dad vertex in dgtable's
vertex table so that it remains available (for recreating the path back
to the source) until another dgtable search algorithm is executed.

I am not sure, but I think that in dglib all info about SP is stored in
the structure separate from graph and this structure may be optionaly stored.
It is important don't mix SP info with input graph, because then the same
graph may be used for more threads at the same time. I think that it shoud be possible
to extend v.net.path to server, wayting for requests from - to and
returning list of arcs.

Radim

Greg_Sepesi · September 11, 2003, 5:15pm

Hi Radim,

Radim Blazek wrote:

On Wednesday 10 September 2003 15:40, Greg Sepesi wrote:
> gdtable implements a O(V*log(V)) shortest path algorithm using the
> priority-first-search described by Sedgewick (ISBN: 0201066734). I
> think of the algorithm as a terrain being filled by a water source at
> the starting point. The water flows to the lowest elevation (cost)
> until it reaches the destination point. Each vertex keeps track of the
> direction (i.e., dad vertex) water flowed into it ... allowing
> recreation of the path back to the source. Presently the algorithm
> stops when reaching the destination point, but it would be easy to "keep
> filling" until covering the graph.

It should stop when the destination point is reached, keep the status
and continue filling if new starting point is the same and destination is not yet
reached.

That is a good idea for shortest path when it is frequently called.
Just for my curiosity, how does GRASS use it? As an implementation
note, the priority-first search algorithm also easily handles
disconnected subgraphs. So if the starting point is the same and the
destination is not yet reached AND the subgraph connected to (i.e.,
reachable from) the starting point isn't already filled, keep filling.
I'll have to think more about what it will take to implement a reentrant
status (as you suggest below). I know it needs to include the dgtable
heap (i.e., priority queue, which is used in the shortest path algorithm
for storing/sorting the "visit fringe" vertices, which are the vertices
not yet filled but are neighboring a filled vertex) and the arrays it
uses for results (presently part of the vertex array).

> I'm not sure what the cache is in
> 'short path cache', but dgtable stores the dad vertex in dgtable's
> vertex table so that it remains available (for recreating the path back
> to the source) until another dgtable search algorithm is executed.

I am not sure, but I think that in dglib all info about SP is stored in
the structure separate from graph and this structure may be optionaly stored.
It is important don't mix SP info with input graph, because then the same
graph may be used for more threads at the same time. I think that it shoud be possible
to extend v.net.path to server, wayting for requests from - to and
returning list of arcs.

Radim

Yes I believe dglib stores results in their own tree. Once I get a
reentrant status, I could add the ability to optionally save it to
disk. This option provides a good debug/test point. However is there a
need to read status? Does dglib read as well as write its result trees?

Greg

Greg_Sepesi · September 14, 2003, 9:34pm

Hello,

I have updated dgtable at

http://www.eduneer.com/gad/dg_src_0.1.tar.gz (847848 bytes)

The new version

   - has GPL notice in source files
   - has DirectedGraphStateType so dgtable can be called from multiple
threads
   - implements shortest path search continuation
   - demonstrates shortest path in dgdemo.out
   - contains road data for shortest path demonstration

The new version does not have

- the extraneous Visual C++ files I accidently included before
- the fast Delaunay triangulation (#ifdef'd out because I have not
finished debugging it)

I looked at Vlib/net.c, but it will probably be several months before I
can switch to GRASS ver >= 5.1 and offer help with GRASS
integration/maintenance. Meanwhile I will support dgtable by responding
to questions and problems or writing documentation. Hopefully some part
of dgtable will be useful to GRASS.

Greg

P.S., the README is listed below.

- - -

README
14 sep 2003
Greg Sepesi (sepesi@eduneer.com)

Definition
----------
dgtable: a directed graph structure based upon lookup tables

Directory Contents
------------------
dg: test data
dg/demo: test code for exercising dgtable's graphics, search, and
comp. geometry algorithms
dg/dgbezier: least squared error Bezier curve fitting (using Graphics
Gems, ISBN:0122861663)
dg/dgrtree: R-Tree code (using Toni Gutman, Michael Stonebrakier, and
Daniel Green's work)
dg/dgtable: dgtable structure and code

Demonstration
-------------
To run the dgtable demonstration, go to the dg/demo directory and
execute

make
./dgdemo.out

Expected screen output:

USING HYPSOGRAPHY DATA
building directed graph ...
concatenating paths into a second directed graph ...
   1148 paths in 620 connected components before concatenation.
   730 paths in 620 connected components after.
writing and reading second directed graph ...
   3176780 bytes out; 3176780 bytes in
performing some computational geometry on points in path 5 (i.e.,
"140.00") ...
   140 points in path; 11 points in its 2D convex hull.
   160 triangles in its delaunay triangulation.
fitting Bezier curves through concatenated paths ...
   85641 points in 730 paths before curve fitting.
   66581 points in 730 paths after.

USING PRIMARY ROAD DATA
building directed graph ...
concatenating paths into a second directed graph ...
   516 paths in 1 connected component before concatenation.
   27 paths in 1 connected component after.
calculating shortest path ...
   from intersection of Boonsboro Rd and Lynchburg Expy
   to intersection of Memorial Ave and Oakley Ave
   shortest path visits 81 vertices (different source, new search)

   from intersection of Boonsboro Rd and Lynchburg Expy
   to intersection of Campbell Ave and Richmond Hwy
   shortest path visits 135 vertices (same source, continued search)

   from intersection of Forest Rd and Lakeside Dr
   to intersection of Campbell Ave and Richmond Hwy
   shortest path visits 104 vertices (different source, new search)

   from intersection of Forest Rd and Lakeside Dr
   to intersection of Memorial Ave and Oakley Ave
   shortest path visits 42 vertices (same source, no search)

fitting Bezier curves through concatenated paths ...
706 points in 27 paths before curve fitting.
253 points in 27 paths after.

Expected file output (see dg/demo/dgdemo.c for context of data dumps):

trash.convexhull.txt trash.path2.dg trash.path.txt
trash.sp13.txt
trash.curve3.road.svg trash.path2.road.txt trash.sp02.txt
trash.curve3.svg trash.path2.txt trash.sp03.txt
trash.delaunay.txt trash.path.road.txt trash.sp12.txt

Design Intent: a simpler dglib
------------------------------
Looking for an open-source, stand-alone directed graph library for a
Palm OS application, the GRASS 5.1 dglib vector library seemed ideal
even though there was work to be done on a few of dglib's modes.
Unfortunately after a couple days and several emails, I was not able to
understand enough of dglib to work on it. Two questions remained.

Why use a tree for vertex storage?
Why use two modes for RAM storage (i.e., FLAT and TREE)?

I then started writing dgtable using lookup tables instead of balanced
trees to store vertices, and using just one mode for RAM storage.

Implementation Results: simpler in some cases, but not in others
----------------------------------------------------------------
dgtable benefits from avoiding multiple modes. dgtable has no need for
a cache mode because it already efficiently accesses vertices in lookup
tables. dgtable also has only one storage mode and it is analogous to
dglib's FLAT mode.

On the other hand, the lookup tables of dgtable are not as dynamic as
the trees of dglib and this could be a big expense if an application's
directed graph frequently changes. I suppose a good compromise would be
to change table size in blocks, but I haven't implemented that yet.

Design
------
Referring to dg/dgtable/dgtable.h, a dgtable directed graph consists of

   - a table of paths (e.g., connected points along a road)
   - a table of vertices (i.e., unique path points)
   - a table of intersections (i.e., non-unique path points)
   - an R-Tree table that provides a spatial link to each 3 point
subsection of each path

Presently, the dgtable library applies this data structure in the
following directed graph algorithms (see dg/dgtable/dgsearch.c):

   - depth first search (uni- or bi-directional paths)
   - breadth first search (uni- or bi-directional paths)
   - minimum spanning tree (uni- or bi-directional paths)
   - shortest path search (uni- or bi-directional paths),

the following computational geometry algorithms (see
dg/dgtable/dggeo.c):

- 2D convex hull
- 2D Delaunay triangulation,

and the following graphics algorithm (see dg/dgbezier):

- least squared error Bezier curve fit.

Radim_Blazek · September 15, 2003, 8:27am

On Thursday 11 September 2003 19:15, Greg Sepesi wrote:

Hi Radim,
> It should stop when the destination point is reached, keep the status
> and continue filling if new starting point is the same and destination is
> not yet reached.

That is a good idea for shortest path when it is frequently called.
Just for my curiosity, how does GRASS use it?

It is used in most v.net.* modules, where costs to more destination nodes are
required (v.net.iso, v.net.alloc, v.net.steiner, v.net.salesman). GRASS modules
do not store the status, they only know that to repeated call to SP from the same node
is much faster than when starting node is changed.
Status is stored by vector library in GRASS.

Yes I believe dglib stores results in their own tree. Once I get a
reentrant status, I could add the ability to optionally save it to
disk. This option provides a good debug/test point. However is there a
need to read status? Does dglib read as well as write its result trees?

I did not mean save to disk but store in the memory, before a next call to SP
function from the same node. Sorry for confussion. Important is, that more threads
can use the same input graph at the same time.

Radim