[GRASS-dev] [GRASS GIS] #2131: Terrible performance from v.what.rast due to per-iteration db_execute

#2131: Terrible performance from v.what.rast due to per-iteration db_execute
-------------------------------------+--------------------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: defect | Status: new
Priority: major | Milestone: 6.4.4
Component: Database | Version: svn-develbranch6
Keywords: v.what.rast, db_execute | Platform: Linux
      Cpu: x86-64 |
-------------------------------------+--------------------------------------
Hi,

I'm running v.what.rast for 175k query points in 6.x, and it's taking a
horribly long time. With debug at level 1 it shows that it gets done with
the query processing and on to the "Updating db table" stage in less than
1 second. Over an *hour later* I'm still waiting for the dbf process,
which is running at 99% cpu! This is a fast workstation too.

v.out.ascii's columns= option was suffering from the same trouble last
time I tried, to the point where it becomes unusable with more than ~10k
vector points.

The v.colors, v.in.garmin, and v.in.gpsbabel scripts /used to/ suffer from
the same thing, but we sped them up by writing all the SQL commands to a
temp file and then running db.execute just once. It seems that opening and
closing the database has non-trivial overhead, and when you do that for
every single cat it adds up in a pretty impressive way. Even if another DB
backend is faster to start+write+stop, I doubt it would be more than ~20%
different, max. The slowdown also looks worse than linear: 100k points
takes much, much longer than just 10x the time for a 10k point vector map.
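
A rough sketch of that temp-file approach, for illustration only: one
UPDATE statement per category is written to a file, and the whole file is
handed to db.execute in a single call. The make_update_sql helper, the
"cat value" input format, and the sample values are made up for this
example; the table and column names are borrowed from the demo, and the
db.execute step only runs inside a GRASS session.

```shell
#!/bin/sh
# Sketch only: batch all UPDATE statements into one file so the database
# is opened once instead of once per category.

# make_update_sql TABLE COLUMN
# Hypothetical helper: reads "cat value" pairs on stdin, prints one
# UPDATE statement per pair on stdout.
make_update_sql() {
    table=$1
    column=$2
    while read -r id value; do
        # skip blank lines
        [ -n "$id" ] || continue
        printf 'UPDATE %s SET %s=%s WHERE cat=%s;\n' \
            "$table" "$column" "$value" "$id"
    done
}

SQLFILE=$(mktemp)

# Made-up sample data standing in for the queried raster values.
printf '1 101.5\n2 98.25\n3 120\n' \
    | make_update_sql test_100k_pts elev > "$SQLFILE"

# One db.execute call for all statements (skipped outside GRASS).
if command -v db.execute >/dev/null 2>&1; then
    db.execute input="$SQLFILE"
fi

rm -f "$SQLFILE"
```

Whether this helps as much with the dbf driver as it did for the scripts
above would still need to be timed.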

demo:
{{{
g.region rast=elevation
v.random out=test_100k_pts n=100000
v.db.addtable test_100k_pts column='cat integer, elev double'  # gets slow too!
time v.what.rast vect=test_100k_pts rast=elevation column=elev
}}}

My current workaround is to add a flag to v.what.rast to optionally print
the result to stdout instead of writing it to a db column. (Done locally;
I'm still testing some other interpolation improvements, so I haven't
committed anything yet.) With that -p flag, the module takes 0.5 seconds
to complete when stdout is redirected to /dev/null.

Any thoughts on the idea of writing the SQL commands to a tempfile or
pipe, then running db_execute_immediate() just once for all of them?

(Maybe the per-iteration bsearch() in the loop is inefficient too, but
`top` shows that 'dbf' is the thing eating all the cpu time.)

In trunk it takes about 6 seconds to complete the 100k random points. I'm
not seeing anything obvious in the module changelog, so I guess something
in the libraries got fixed? Any hints?

thanks,
Hamish

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2131>
GRASS GIS <http://grass.osgeo.org>


Comment(by hamish):

Replying to [ticket:2131 hamish]:
> I'm running v.what.rast for 175k query points in 6.x. It's taking a
> horribly long time.
> With debug at level 1 it shows that it gets done with the query
> processing and on to the "Updating db table" stage in less than
> 1 second. Over an *hour later* I'm still waiting for the dbf process,
> which is running at 99% cpu! This is a fast workstation too.
...
> in trunk it takes about 6 seconds to complete the 100k random
> points, I'm not seeing anything obvious in the module changelog,
> so I guess something in the libraries got fixed? any hints?

Actually, trunk is pretty bad too. v.db.addtable takes a couple of minutes
to read the categories (the first half is reasonably quick, but then it
slows down more and more), and then v.what.rast is only 9% done after
running for 13 minutes. (With the new flag that prints instead of updating
the DB, it takes only 0.347s to complete.)

any ideas?

thanks,
Hamish

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/2131#comment:1>
GRASS GIS <http://grass.osgeo.org>

#2131: Terrible performance from v.what.rast due to per-iteration db_execute
----------------------------------------------+-----------------------------
Reporter: hamish | Owner: grass-dev@…
     Type: defect | Status: new
Priority: major | Milestone: 6.4.4
Component: Database | Version: svn-develbranch6
Keywords: v.what.rast, db_execute, v.to.db | Platform: Linux
      Cpu: x86-64 |
----------------------------------------------+-----------------------------
Changes (by neteler):

  * keywords: v.what.rast, db_execute => v.what.rast, db_execute, v.to.db

Comment:

The slow part appears to be v.to.db, so that needs to be optimized.

--
Ticket URL: <http://trac.osgeo.org/grass/ticket/2131#comment:2>
GRASS GIS <http://grass.osgeo.org>