Hi Markus,
On 01/01/15 22:18, Markus Metz wrote:
Hi all,
a new spatial index for point data is available in lib/btree2: a
multidimensional search tree, also known as k-d tree.
Excellent news.
What is it good for:
- nearest neighbor statistics: test if points are randomly
distributed. The current GRASS addon v.nnstat uses an external k-d
tree from PCL (which in turn uses flann) which finds the approximate,
not the exact nearest neighbor. The new GRASS-native k-d tree always
finds the real nearest neighbor.
- spatial cluster analysis: a point cloud can be partitioned into
separate clusters where points within each cluster are closer to each
other than to points of another cluster. To be implemented.
I suggest implementing DBSCAN: http://en.wikipedia.org/wiki/DBSCAN
It is density-based and thus produces more useful results than
centroid-based methods (k-means and relatives). It can actually
trace the shapes of spatial clusters (even concave ones) and does
not require a priori knowledge about the number of clusters. It
can also handle noise.
There is a reference implementation (albeit in Java) that you can
use to compare it with other clustering algorithms:
http://elki.dbs.ifi.lmu.de/
An implementation in C (that I have not tried) is here:
https://github.com/siddharth950/DBSCAN
The one drawback of DBSCAN is that it needs an efficient spatial
index to perform well -- and you have just taken care of that!
- point cloud thinning: a sample can be generated from a large point
cloud by specifying a minimum distance between sample points. To be
implemented.
The new k-d tree is now used by v.clean tool=snap (Vect_snap_lines()),
reducing both memory consumption and processing time.
I would also like to point out that SAGA GIS is moving into
the same direction, i.e. more efficient processing of very
large point clouds. The latest release includes a number
of new point cloud tools. Perhaps it's worth a look.
Most importantly, SAGA GIS has introduced a new file format,
the SAGA Pointcloud (.spc) format. It is compact and yet
flexible enough for a variety of purposes. I recommend to
consider implementing import/export of this format in GRASS 7,
preferably not via v.in.ogr, to avoid the OGR model conversion
overhead.
If you think this would be an interesting option, then you can
find a summary on our tracker:
http://gvsigce.sourceforge.net/mantis/view.php?id=595
(we are going to implement ".spc" in gvSIG CE, as well).
Best wishes,
Ben
More technical:
the new k-d tree finds the exact nearest neighbor(s), not some
approximation. It supports up to 255 dimensions. It is dynamic, i.e.
points can be inserted and removed at any time. It is balanced to
improve search performance. It provides k nearest neighbor search
(find k neighbors to a given coordinate) as well as radius or distance
search (find all neighbors within radius, i.e. not farther away than
radius to a given coordinate).
Markus M
_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev
--
Dr. Benjamin Ducke
{*} Geospatial Consultant
{*} GIS Developer
Spatial technology for the masses, not the classes:
experience free and open source GIS at http://gvsigce.org