[GRASS-user] v.lidar.edgedetection very slow

Michael wrote:

I actually get the same thing.
At first the process runs at full use of one core, but once it
starts writing results to the db, the whole thing grinds to a
halt. CPU usage for v.lidar.edgedetection
drops to ~1% (of one core) and sqlite usage rises to
~16%. Maybe I'm wrong, but my impression was that the
bottleneck was the modifications to the database.

hmmm. I've seen no problems on Linux64 + DBF backend. 100% core until
completion.

Are you running WinXP + SQLite? Can you try with the DBF driver?
Maybe the common problem is in the SQLite driver?

if you build from the latest GRASS 6.5 or 7 svn you can use the --verbose
flag to follow the action. setting 'g.gisenv set="DEBUG=1"' gives you
even more detail (even with existing versions).

John wrote:

Also, when I have tried to run v.outlier without db/topology
it says it cannot as there is no database.

you can have a database without topology.

I will look at this again later. Maybe I am not selecting 3D when
I have tried without db etc.

it needs that.

Mike:

Note that there is a bug in v.outlier. v.outlier
creates a temporary table named outfilename_aux (line
137 of /lidar/v.outlier/main.c), but elsewhere in the code the name
is hardwired to Auxiliar_outlier_table (lines 239
and 258 of /lidar/v.outlier/outlier.c), so when it
actually tries to write to the table it complains and fails.
You can get around it by manually creating this table using:
db.copy from_table=outfilename_aux to_table=Auxiliar_outlier_table
I'm not sure why, but it looks like the
system doesn't always need to use the Auxiliar_outlier_table, and
when it doesn't, it continues on merrily without
complaint. Unfortunately, when it does require the table, on my system
the program continues on only to hit a segmentation fault further
down the line. Maybe
it's because of the changes that I made to the code to
correct the above error, but I can't see why that would
be true. I was planning on looking further into it, but
haven't had the time to start debugging. In the meantime
I've added the files with my changes to the code
for the developers. I'm a hack, so the changes are
probably in poor coding practice. Someone
more savvy than I will need to clean it

....

PS. If I find what I think are errors or
possible improvements to the code, what is the most
appropriate place and method to submit them?

please create your patches with "svn diff > somepatch.diff" and file them
in the trac system. Otherwise they quickly get lost and forgotten.

http://grass.osgeo.org/bugtracking/
https://trac.osgeo.org/grass/wiki/WikiStart#BugTracking

cheers,
Hamish

On 19-Jun-09, at 8:54 PM, Hamish wrote:

hmmm. I've seen no problems on Linux64 + DBF backend. 100% core until
completion.

Are you running WinXP + SQLite? Can you try with the DBF driver?
Maybe the common problem is in the SQLite driver?

I'm running Mac OS X + SQLite with a 64-bit version of GRASS. I ran a whole bunch of tests today and it looks like there are at least a couple of things going on, and SQLite is one of the issues. I ran a 500 m x 500 m tile with three different DB back-ends (SQLite, Postgres and DBF) and these are the results I obtained:
(SQLite)
time v.lidar.edgedetection input=Cal_QTile output=Cal_QTile_edges
real 38m53.458s
user 1m17.602s
sys 4m6.353s

(Postgres)
time v.lidar.edgedetection input=Cal_QTile output=Cal_QTile_edges
real 6m49.060s
user 0m46.622s
sys 1m20.324s

(DBF)
time v.lidar.edgedetection input=Cal_QTile output=Cal_QTile_edges
real 1m54.065s
user 0m48.530s
sys 1m8.686s

The results with Postgres and SQLite were a real surprise to me.
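
One way to read these numbers is to compare CPU time (user + sys) with wall-clock time (real) for each back-end. Below is a minimal Python sketch of that arithmetic, using only the timings pasted above; the helper names to_seconds and summarise are made up for illustration. Keep in mind that `time` may not account for CPU used by a separately running database server, so the CPU share is only a rough indicator.

```python
import re

# Timings copied from the three runs reported above.
TIMINGS = {
    "SQLite":   {"real": "38m53.458s", "user": "1m17.602s", "sys": "4m6.353s"},
    "Postgres": {"real": "6m49.060s",  "user": "0m46.622s", "sys": "1m20.324s"},
    "DBF":      {"real": "1m54.065s",  "user": "0m48.530s", "sys": "1m8.686s"},
}


def to_seconds(stamp):
    """Convert a `time`-style string such as '38m53.458s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def summarise(timings, baseline="DBF"):
    """Return wall time, CPU time, CPU share and slowdown per back-end."""
    base_real = to_seconds(timings[baseline]["real"])
    rows = {}
    for backend, t in timings.items():
        real = to_seconds(t["real"])
        cpu = to_seconds(t["user"]) + to_seconds(t["sys"])
        rows[backend] = {
            "real_s": real,
            "cpu_s": cpu,
            "cpu_share": cpu / real,         # fraction of wall time spent on CPU
            "slowdown": real / base_real,    # wall time relative to the baseline
        }
    return rows


if __name__ == "__main__":
    for backend, row in summarise(TIMINGS).items():
        print(f"{backend:9s} real={row['real_s']:8.1f}s "
              f"cpu={row['cpu_s']:6.1f}s "
              f"cpu_share={row['cpu_share']:.2f} "
              f"slowdown={row['slowdown']:.1f}x")
```

With these figures the SQLite run takes roughly 20 times as long as the DBF run in wall-clock terms, while user + sys is only a small fraction of its real time. That would be consistent with the process mostly waiting on the database rather than computing.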

if you build from the latest GRASS 6.5 or 7 svn you can use the --verbose
flag to follow the action. setting 'g.gisenv set="DEBUG=1"' gives you
even more detail (even with existing versions).

Does this print out the debugging messages that are in the code or do I need to do something else to see those?

please create your patches with "svn diff > somepatch.diff" and file them
in the trac system. Otherwise they quickly get lost and forgotten.

http://grass.osgeo.org/bugtracking/
https://trac.osgeo.org/grass/wiki/WikiStart#BugTracking

OK, thanks. I'll post a bug report and submit a patch soon.

Cheers,

Mike

On 22/06/09 07:00, Michael Perdue wrote:

I'm running Mac OS X + SQLite with a 64-bit version of GRASS. I ran a whole bunch of tests today and it looks like there are at least a couple of things going on, and SQLite is one of the issues. I ran a 500 m x 500 m tile with three different DB back-ends (SQLite, Postgres and DBF) and these are the results I obtained:
(SQLite)
time v.lidar.edgedetection input=Cal_QTile output=Cal_QTile_edges
real 38m53.458s
user 1m17.602s
sys 4m6.353s

(Postgres)
time v.lidar.edgedetection input=Cal_QTile output=Cal_QTile_edges
real 6m49.060s
user 0m46.622s
sys 1m20.324s

(DBF)
time v.lidar.edgedetection input=Cal_QTile output=Cal_QTile_edges
real 1m54.065s
user 0m48.530s
sys 1m8.686s

The results with Postgres and SQLite were a real surprise to me.

As Hamish already mentioned, the main bottleneck here is the connection to the database, which is much slower for actual database systems than for simple file-based DBF. Looking at the code, I see lots of calls to the Insert(), InsertInterpolation() and UpDate() functions. At each call you are hit with the overhead of the database connection. I don't know the code at all, but it might be worth thinking about grouping these database calls, possibly by putting all the SQL statements into a temp file and executing that only once at the end. In scripts, doing this has led to significant speed gains.
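
To illustrate the grouping idea outside of GRASS, here is a minimal sketch using Python's standard sqlite3 module. It is not the GRASS DBMI API and not what the lidar modules do; the table name "edges" and the helper functions are hypothetical. It only contrasts committing after every INSERT with queuing all INSERTs in one transaction.

```python
import os
import sqlite3
import tempfile
import time

INSERT_SQL = "INSERT INTO edges (cat, value) VALUES (?, ?)"


def make_db(path=":memory:"):
    """Open a database with a small, made-up attribute table."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE edges (cat INTEGER PRIMARY KEY, value REAL)")
    conn.commit()
    return conn


def insert_per_row(conn, rows):
    """One INSERT and one commit per row: pays the overhead every time."""
    for row in rows:
        conn.execute(INSERT_SQL, row)
        conn.commit()


def insert_batched(conn, rows):
    """All INSERTs inside a single transaction, committed once at the end."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(INSERT_SQL, rows)


def timed(insert, rows):
    """Run one insert strategy against a fresh on-disk database."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = make_db(os.path.join(tmp, "test.db"))
        start = time.perf_counter()
        insert(conn, rows)
        elapsed = time.perf_counter() - start
        count = conn.execute("SELECT COUNT(*) FROM edges").fetchone()[0]
        conn.close()
    return elapsed, count


if __name__ == "__main__":
    rows = [(i, i * 0.5) for i in range(1, 2001)]
    for insert in (insert_per_row, insert_batched):
        elapsed, count = timed(insert, rows)
        print(f"{insert.__name__}: {count} rows in {elapsed:.3f} s")
```

Both strategies end up with the same table contents; only the number of commits differs. How large the gap is depends on the disk and the back-end, and whether the same approach carries over to v.lidar.edgedetection would need to be tested in the module itself.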

Moritz