I read through one of the papers today (M. A. Brovelli, M. Cannata and U. M. Longoni, 2002. Managing and processing LIDAR data within GRASS, Proceedings of the Open Source GIS - GRASS Users Conference 2002) and the authors recommended a step value of 3 to 4 times the average point spacing of the data set. So 4 m sen and see values seem appropriate for a data set that has an average coverage of 1 return/m^2. Unfortunately, the authors didn't really explain why it should be 3-4 times and not, say, 10. I can only guess that a smaller step size would produce finer detail in the interpolation and better feature isolation for the classification.
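For what it's worth, here is the rough arithmetic I used to arrive at that 4 m figure. The 1 return/m^2 density is the assumed input, and average spacing ~ 1/sqrt(density) is only an approximation for roughly uniform coverage:

/* Rough sketch: derive a see/sen value from point density using the
 * 3-4 x average point spacing rule of thumb from Brovelli et al. (2002).
 * The density below is assumed; spacing ~ 1/sqrt(density) is an
 * approximation for roughly uniform coverage. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double density = 1.0;                  /* returns per square metre */
    double spacing = 1.0 / sqrt(density);  /* average point spacing (m) */

    printf("average point spacing: %.1f m\n", spacing);
    printf("suggested see/sen: %.1f to %.1f m\n",
           3.0 * spacing, 4.0 * spacing);
    return 0;
}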
There definitely seems to be something funky going on between the spline step size and the division of the region into subregions. I ran tests using v.lidar.edgedetection on tiles of 500x500 m, 800x800 m, 801x801 m and 1000x1000 m. Times for each run were 1m54.065s, 6m35.420s, 22m16.708s and 68m25.265s respectively. There was a roughly 3-fold increase in processing time when I grew the data set by 1 m, from (800 m)^2 (what I understand to be the maximum size of the region before it is broken up into subregions) to (801 m)^2, and it tripled again when I jumped to (1000 m)^2. I also recompiled GRASS with NSPLX_MAX and NSPLY_MAX set to 1500 instead of the current 200, and the time to process a (1000 m)^2 tile dropped from 68 minutes to 12 minutes.
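That 800 m threshold fits my guess that a subregion can hold at most NSPLX_MAX x NSPLY_MAX splines, so its maximum extent along one axis would be roughly NSPLX_MAX times the spline step. I have not verified this in the lidar library code, so treat the following as a back-of-the-envelope sketch built only from the numbers above:

/* Back-of-the-envelope check: with a 4 m step and NSPLX_MAX = 200, a
 * single subregion would top out at about 200 * 4 = 800 m, which matches
 * the jump in run time between the 800 m and 801 m tiles.  The sizing
 * rule itself is my assumption, not taken from the library source. */
#include <stdio.h>

int main(void)
{
    double step = 4.0;                         /* see/sen step in metres */
    int nsplx_max = 200;                       /* compiled-in spline limit */
    double tiles[] = {500.0, 800.0, 801.0, 1000.0};

    printf("max single-subregion extent: %.0f m\n", nsplx_max * step);
    for (int i = 0; i < 4; i++) {
        double splines = tiles[i] / step;      /* splines needed on one axis */
        printf("%6.0f m tile needs %5.0f splines -> %s\n", tiles[i], splines,
               splines <= nsplx_max ? "fits in one subregion"
                                    : "gets split into subregions");
    }
    return 0;
}

If that reading is right, bumping NSPLX_MAX and NSPLY_MAX to 1500 raises the ceiling to 6000 m at a 4 m step, which would explain why the (1000 m)^2 tile got so much faster after the recompile.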
My understanding is that a region is broken up into smaller subregions to avoid memory limitations while processing. I can't see any reason why the division of the region into more, smaller subregions should increase the resolution of those subregions. I'm far from an expert on these programs, and I'm not skilled enough to read through the code and know what it is doing, but if I understand you correctly, your arguments make sense to me. If you're kind enough to share your changes, I'll apply them and do some more testing.
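For what it's worth, here is the back-of-the-envelope cost argument behind my doubt that the subdivision itself is the problem. The dense O(N^3) Cholesky cost model and the spline counts are assumptions on my part, purely for illustration; the library may well use a band solver, which would only strengthen the point:

/* Compare one large subregion with four smaller ones covering the same
 * area at the same spline resolution.  Cost model: a dense Cholesky
 * factorization of an N x N system is roughly N^3 operations, with
 * N = nsplx * nsply spline coefficients (both are assumptions here). */
#include <stdio.h>

static double cholesky_cost(int nsplx, int nsply)
{
    double n = (double)nsplx * nsply;   /* matrix dimension */
    return n * n * n;                   /* approximate operation count */
}

int main(void)
{
    /* One 800 m subregion at a 4 m step vs. four 400 m subregions at the
     * same 4 m step: identical total area and identical resolution. */
    double one_big = cholesky_cost(200, 200);
    double four_small = 4.0 * cholesky_cost(100, 100);

    printf("one 200x200-spline subregion:   %.1e ops\n", one_big);
    printf("four 100x100-spline subregions: %.1e ops\n", four_small);
    printf("splitting is about %.0fx cheaper at the same resolution\n",
           one_big / four_small);
    return 0;
}

So if subdividing at a constant resolution only makes each piece cheaper, the slowdown we are both seeing presumably comes from the per-subregion resolution going up, which matches your reading.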
Cheers,
Mike
On 20-Jun-09, at 1:56 AM, Markus GRASS wrote:
Did some more testing. The see and sen options have a drastic influence on speed: with larger values it completes in a minute, with the default values it takes hours (lidar points in the North Carolina sample dataset, 10 m resolution in the current computational region).
I don't have the faintest idea about the theory behind the edge detection code, but something about the effect of the sen and see options appears strange to me. These options are interpolation spline step values (per the option description), apparently something like a step size. The current computational region is subdivided into several overlapping subregions (per a source code comment). For each subregion, a bilinear interpolation, a Cholesky decomposition, a bicubic interpolation and another Cholesky decomposition are done, then point classification. What I don't understand is why, with smaller steps, i.e. more, smaller subregions, the resolution of these subregions is increased and the matrix becomes larger. That is what causes the incredibly long running time of the module. It should be the other way around: larger step values should produce larger subregions and larger matrices, not smaller matrices.
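To make the subdivision idea concrete, this is roughly how I picture the tiling. The 800 m subregion size and the 40 m overlap are made-up numbers for illustration, not values taken from the library:

/* Illustrative tiling only: cut a region into overlapping subregions of a
 * fixed extent.  Tile size and overlap below are invented numbers; the
 * real sizing logic lives in the lidar library and I have not traced it. */
#include <stdio.h>

int main(void)
{
    double region = 2000.0;    /* region extent along one axis (m) */
    double tile = 800.0;       /* assumed subregion extent (m) */
    double overlap = 40.0;     /* assumed overlap between neighbours (m) */
    double stride = tile - overlap;

    for (double start = 0.0; start < region; start += stride) {
        double end = start + tile > region ? region : start + tile;
        printf("subregion from %6.1f m to %6.1f m\n", start, end);
        if (end >= region)
            break;
    }
    return 0;
}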
Changing the code in main.c and edgedetection.c accordingly gives very similar results to the original version, only that the time required is nearly constant for different spline step values and always very short: nothing like hours, just a minute or two. It would be nice to have some hints in the manual on how to calculate reasonable values for a given lidar point set and current computational region.
I'm not attaching any patches because I'm not sure if the original behaviour is intended or if my arguments make sense. I haven't read all the references…
Markus M