Hi all,
First of all, I would like to apologise for not sending the report on Friday. This week was very busy for me, since I was moving away from England and needed to sort out various matters before leaving, which kept me immensely occupied.
However, I had a fruitful conversation with Paul Kelly (my mentor), in which we discussed the algorithms I am working with in detail. I have found that the best way to fully understand something is to try to explain it to somebody else.
I feel that I have acquired enough theoretical knowledge to implement the Guibas-Stolfi algorithm by the end of next week.
I know that many of you are really interested in the progress of the SoC projects, and some of you are even willing to give a hand to SoC students. For those who are interested, I attach below a short summary of the conversation with my mentor, which should help anybody who wants to get up to speed on the state of the project.
Wolf Bergenheim recommended that I have a look at "Delaunay triangulations in TIN creation: an overview and a linear-time algorithm" (http://tinyurl.com/3s4s4h). Unfortunately, my college is not subscribed to informaworld. If anybody manages to get an electronic version of this paper, could you send it to me? If not, don't worry about it; I have plenty of other resources dealing with the algorithms used for constructing Delaunay triangulations and Voronoi diagrams (DT and VD). I have uploaded most of them here (7MB); you can have a look at those papers if you wish. One disadvantage of some of these algorithms is that they are not widely used, and only a few implementations can be found online. The algorithms are also presented at quite a high theoretical level, which leaves a lot of implementation decisions to the programmer, which is a blessing and a curse at the same time. They also require you to understand the underlying theory (mainly the quad-edge data structure), which is a nice challenge.
I am primarily focusing on two particular algorithms: the divide-and-conquer algorithm by Guibas and Stolfi, and the plane-sweep algorithm by Fortune. Both run in O(N log N) worst-case time.

The D&C algorithm was originally presented using the quad-edge data structure, which has many nice properties. For instance, every edge also stores its dual edge, which means that the dual of the stored graph can be obtained in linear time: once you have constructed the DT, you can get the VD just by flipping every edge to its dual, and vice versa for VD -> DT (using the fact that DT and VD are dual to each other). Another nice property is that the topological representation is entirely separated from the geometrical one, so altering one keeps the other intact, which I consider elegant from a programmer's point of view.

However, these properties make the data structure quite heavy in terms of memory usage, as Wolf has correctly pointed out. He also suggested using the winged-edge structure instead, which is much less memory-demanding. Incidentally, one of the papers in the bundle above, Improvements to Guibas-Stolfi and Fortune.pdf, proposes the same enhancement, driven mainly by the observation that a traditional quad-edge implementation of Guibas-Stolfi spends about 55% of its computation time in various edge-manipulation procedures. In practice, the winged-edge version of G-S uses less memory and is faster than its quad-edge counterpart, but it costs us the option of easily switching between a graph and its dual. I plan to implement two versions of the G-S algorithm: first the quad-edge version, since that is the one described in the original paper and should be straightforward, and then a winged-edge version. An optional parameter will let the user choose whether the dual graph should also be obtainable; if so, quad-edge will be used, otherwise winged-edge. The default should be winged-edge for efficiency reasons.
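To make the quad-edge idea concrete, here is a minimal sketch in C of the record layout from the Guibas-Stolfi paper. The field and function names are my own illustration, not taken from any existing implementation:

    /* Each physical edge is stored as a group of four directed edges:
       e, Rot(e) (its dual), Sym(e) (e reversed) and Rot^-1(e). */
    typedef struct QuadEdge QuadEdge;
    typedef struct Edge Edge;

    struct Edge {
        Edge     *onext;   /* next edge counter-clockwise around the origin */
        void     *origin;  /* geometric data: the origin vertex (a Voronoi
                              vertex for the dual quarter-edges)             */
        int       rot;     /* index 0..3 of this quarter-edge in its group  */
        QuadEdge *group;   /* back-pointer to the owning 4-edge record      */
    };

    struct QuadEdge {
        Edge e[4];         /* e[0], e[2]: primal edge; e[1], e[3]: its dual */
    };

    /* Rot: the dual of this directed edge. */
    static Edge *edge_rot(Edge *e) { return &e->group->e[(e->rot + 1) % 4]; }

    /* Sym: the same edge, directed the other way. */
    static Edge *edge_sym(Edge *e) { return &e->group->e[(e->rot + 2) % 4]; }

Since Rot is just an index shift within the same record, turning a finished DT into its VD really is a single linear pass over the edge records, exactly as described above.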
In short, G-S works like this:
- Sort the sites along the x-axis, with ties resolved by y-coordinate (this makes all future splits constant-time operations).
- Recursively divide the sites into left (L) and right (R) vertical strips. If the current region contains 2 or 3 sites, construct the DT directly; otherwise recurse further.
- Merge L-DT and R-DT (this step makes heavy use of the inCircle test).
The most complex of these steps is the merge, which I am not going to explain here, since it has many substeps; it is nicely described (with pictures) in Guibas-Stolfi divide-and-conquer.pdf in the bundle above. A rough sketch of the overall recursion follows.
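This C sketch shows only the shape of the top-level driver. The helper names (Site, build_base_case, merge_hulls) are illustrative assumptions, the Edge type is the one from the quad-edge sketch above, and sites[lo..hi] is assumed to be pre-sorted by x with ties broken by y:

    typedef struct { double x, y; } Site;

    /* assumed helpers: direct DT of 2 or 3 sites, and the G-S merge step */
    void build_base_case(Site *s, int lo, int hi, Edge **le, Edge **re);
    void merge_hulls(Edge *ldi, Edge *rdi, Edge **ldo, Edge **rdo);

    /* Computes the DT of sites[lo..hi], returning two convex-hull
       edges le/re of the result, as in the original paper. */
    void delaunay_dc(Site *sites, int lo, int hi, Edge **le, Edge **re)
    {
        int n = hi - lo + 1;

        if (n <= 3) {                 /* 2 or 3 sites: build the DT directly */
            build_base_case(sites, lo, hi, le, re);
            return;
        }

        int mid = lo + n / 2;         /* O(1) split thanks to the sort */
        Edge *ldo, *ldi, *rdi, *rdo;

        delaunay_dc(sites, lo, mid - 1, &ldo, &ldi);   /* left half  (L-DT) */
        delaunay_dc(sites, mid, hi, &rdi, &rdo);       /* right half (R-DT) */

        merge_hulls(ldi, rdi, &ldo, &rdo);  /* the inCircle-heavy step */

        *le = ldo;
        *re = rdo;
    }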
One major improvement to G-S is due to Dwyer; it improves the expected running time to O(N log log N) for a large class of distributions, including the uniform distribution in the unit square. Dwyer advocates splitting the set of sites into cells: squares with sides of length sqrt((log N)/N) (assuming the sites are in the unit square). You then construct the DT in each cell using the traditional G-S algorithm. In the next step you start merging the DTs of the cells in each row, using exactly the same method as the merge in the G-S algorithm. You merge (+) cells in this order: (0+1, 2+3, 4+5, ..., (2k)+(2k+1), ...), storing the results in cells (0, 2, 4, ...); then you merge (0+2, 4+6, ..., (4k)+(4k+2), ...), storing the results in (0, 4, 8, ...); and you repeat this. At the end, the triangulation of each row is stored in the cell at position 0 of that row. Now you start merging the rows in a similar fashion; however, a slightly modified version of the merge procedure has to be used this time, since you are merging along bottom/top edges. The final DT can be found in cell (0,0).
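Here is an illustrative C sketch of that pairwise merge order within one row; Cell and merge_cells are assumed names, with merge_cells standing for the ordinary G-S merge of two horizontally adjacent sub-triangulations, storing the result in the left cell:

    typedef struct Cell Cell;           /* holds one cell's partial DT */
    void merge_cells(Cell *left, Cell *right);

    void merge_row(Cell *row, int ncells)
    {
        /* pass 1: 0+1, 2+3, ...; pass 2: 0+2, 4+6, ...; and so on */
        for (int step = 1; step < ncells; step *= 2)
            for (int i = 0; i + step < ncells; i += 2 * step)
                merge_cells(&row[i], &row[i + step]);

        /* the whole row's triangulation now sits in row[0] */
    }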
Note, however, that it is a modified version of the above algorithm (call it DW2) that gets implemented in practice; the version above serves to describe how the algorithm works and to analyse its efficiency.
In DW2 you split the set of N sites into sqrt(N/(log N)) strips by lines parallel to the x-axis, and then apply G-S to each strip. Remember that G-S recursively splits the sites into L and R halves by lines parallel to the y-axis - this conspicuously resembles the algorithm above, but is somewhat easier to implement. The next step is the same as in the algorithm above: you merge the rows. The efficiency is roughly the same; both versions run in O(N log log N) expected time. If I have enough time, I would like to implement DW2 as well.
There are many other enhancements and speed-ups mentioned in Improvements to Guibas-Stolfi and Fortune.pdf. One notable example is an improved inCircle test, which is used heavily in the merge procedure. The usual version computes a determinant and, after elimination of common subexpressions, takes 45 arithmetic operations; a test based on the y-coordinates of circumcircle centres reduces this to only 23 operations. This can be improved further by caching: since the current topmost cross edge between the L and R strips remains the same for one iteration, some of the subexpressions stay constant, and precalculating them decreases the cost of inCircle to 15 operations.
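For reference, here is the baseline determinant version in C. The 23- and 15-operation variants are restructurings of exactly this computation; this sketch is just the textbook predicate, not code from the paper:

    typedef struct { double x, y; } Point;

    /* Returns a positive value exactly when d lies inside the
       circumcircle of the counter-clockwise triangle (a, b, c). */
    double in_circle(Point a, Point b, Point c, Point d)
    {
        double adx = a.x - d.x, ady = a.y - d.y;
        double bdx = b.x - d.x, bdy = b.y - d.y;
        double cdx = c.x - d.x, cdy = c.y - d.y;

        double ad2 = adx * adx + ady * ady;   /* squared distances to d */
        double bd2 = bdx * bdx + bdy * bdy;
        double cd2 = cdx * cdx + cdy * cdy;

        return adx * (bdy * cd2 - cdy * bd2)
             - ady * (bdx * cd2 - cdx * bd2)
             + ad2 * (bdx * cdy - cdx * bdy);
    }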
I have yet to study Fortune's plane-sweep algorithm in more detail; there are many improvements applicable to the plane-sweep algorithm as well.
Regards,
Martin Pavlovsky