[GRASS-dev] [GRASS GIS] #3697: sqlite db: automatically enable vacuum

#3697: sqlite db: automatically enable vacuum
---------------------------+-------------------------
Reporter: mlennert | Owner: grass-dev@…
     Type: enhancement | Status: new
Priority: normal | Milestone: 7.8.0
Component: Database | Version: unspecified
Keywords: sqlite vacuum | CPU: Unspecified
Platform: Unspecified |
---------------------------+-------------------------
I just noticed that a mapset sqlite db file was huge, despite the fact
that there was only one small vector file left in it. Running 'vacuum;' in
the sqlite database reduced the announced file size from 47G to 78K !

There is a pragma auto_vacuum which allows to enable an auto-vacuum mode,
which is not as efficient as calling the vacuum command, as it does not
defragment the database, but I would think that it would at least keep
file size to expected levels.

I would, therefore, plead for enabling this pragma by default at database
creation and setting it to 1 (full). I think this would be the behavior
expected by most users.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: new
  Priority: normal | Milestone: 7.8.0
Component: Database | Version: unspecified
Resolution: | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------

Comment (by mmetz):

Replying to [ticket:3697 mlennert]:
> I just noticed that a mapset sqlite db file was huge, despite the fact
that there was only one small vector file left in it. Running 'vacuum;' in
the sqlite database reduced the announced file size from 47G to 78K !
>
> There is a pragma auto_vacuum which allows to enable an auto-vacuum
mode, which is not as efficient as calling the vacuum command, as it does
not defragment the database, but I would think that it would at least keep
file size to expected levels.
>
> I would, therefore, plead for enabling this pragma by default at
database creation and setting it to 1 (full). I think this would be the
behavior expected by most users.

This auto_vacuum would occur at every transaction commit. Maybe it is more
efficient to VACUUM the sqlite db now and then, e.g. when the database is
closed or when the GRASS session is terminated.

VACUUM when the database is closed would still be quite often, e.g. with
every SELECT statement or also just describing the table.

I would prefer to VACUUM the sqlite db when the GRASS session is
terminating, i.e. in lib/init/grass.py. This would not slow down standard
db operations and leaves the db in a clean state.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697#comment:1&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: new
  Priority: normal | Milestone: 7.8.0
Component: Database | Version: unspecified
Resolution: | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------
Changes (by mmetz):

* Attachment "sqlite_vacuum_atexit.patch" added.

VACUUM sqlite db when GRASS is terminating

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: new
  Priority: normal | Milestone: 7.8.0
Component: Database | Version: unspecified
Resolution: | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------

Comment (by neteler):

VACCUMing is a great improvement!

Replying to [comment:1 mmetz]:
> I would prefer to VACUUM the sqlite db when the GRASS session is
terminating, i.e. in lib/init/grass.py. This would not slow down standard
db operations and leaves the db in a clean state.

I would like that idea as well to add that during session termination.

Perhaps likewise also for PostgreSQL and mySQL/MariaDB?

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697#comment:2&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: new
  Priority: normal | Milestone: 7.8.0
Component: Database | Version: unspecified
Resolution: | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------

Comment (by neteler):

Kitchen talk: grass.core.start_command() might be even more fun launched
upon session termination :slight_smile:

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697#comment:3&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: new
  Priority: normal | Milestone: 7.8.0
Component: Database | Version: unspecified
Resolution: | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------

Comment (by mlennert):

Replying to [comment:1 mmetz]:
> I would prefer to VACUUM the sqlite db when the GRASS session is
terminating, i.e. in lib/init/grass.py. This would not slow down standard
db operations and leaves the db in a clean state.

+1

You are absolutely right, of course :wink:

For me this can be committed to trunk. We'll see then if it has any
negative consequences.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697#comment:4&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: new
  Priority: normal | Milestone: 7.8.0
Component: Database | Version: unspecified
Resolution: | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------

Comment (by mmetz):

In [changeset:"73692" 73692]:
{{{
#!CommitTicketReference repository="" revision="73692"
lib/init: clean up sqlite db if existing, see #3697
}}}

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697#comment:5&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: new
  Priority: normal | Milestone: 7.8.0
Component: Database | Version: unspecified
Resolution: | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------

Comment (by mmetz):

Replying to [comment:4 mlennert]:
> Replying to [comment:1 mmetz]:
> > I would prefer to VACUUM the sqlite db when the GRASS session is
terminating, i.e. in lib/init/grass.py. This would not slow down standard
db operations and leaves the db in a clean state.

Added benefit: already existing databases are also cleaned, while with
PRAGMA auto_vacuum=1 only newly created databases would be cleaned.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697#comment:6&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: closed
  Priority: normal | Milestone: 7.8.0
Component: Database | Version: unspecified
Resolution: fixed | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------
Changes (by mlennert):

* status: new => closed
* resolution: => fixed

Comment:

This seems to be solved, so closing.

Thanks, MarkusM !

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697#comment:7&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: reopened
  Priority: normal | Milestone: 7.8.0
Component: Database | Version: unspecified
Resolution: | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------
Changes (by neteler):

* status: closed => reopened
* resolution: fixed =>

Comment:

I'd like to see this in 7.6 as well... Any objections?

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697#comment:8&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: reopened
  Priority: normal | Milestone: 7.8.0
Component: Database | Version: unspecified
Resolution: | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------

Comment (by mlennert):

Replying to [comment:8 neteler]:
> I'd like to see this in 7.6 as well... Any objections?

You're right, that would be nice. I don't know if the solution implemented
in trunk is transferable as is to 7.6 ?

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697#comment:9&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: reopened
  Priority: normal | Milestone: 7.8.0
Component: Database | Version: unspecified
Resolution: | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------

Comment (by neteler):

In [changeset:"73736" 73736]:
{{{
#!CommitTicketReference repository="" revision="73736"
sqlite db: automatically enable vacuum, trac #3697 (bundle backport: trunk
r73692, r73703, r73704, r73706, r73707, r73708)

  * lib/init: clean up sqlite db if existing, see #3697
  * libpython: move clean_default_db() to setup
  * libinit: use clean_default_db() from script.setup
  * libpython: add functions to finish a GRASS session
  * libinit: start rewriting to use functions in lib/python/scripts
  * libinit: fix r73707, clean_all() takes no arguments
}}}

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697#comment:10&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: reopened
  Priority: normal | Milestone: 7.6.0
Component: Database | Version: svn-releasebranch76
Resolution: | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------------
Changes (by neteler):

* version: unspecified => svn-releasebranch76
* milestone: 7.8.0 => 7.6.0

Comment:

Should be all backported (see above), pls test.

--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697#comment:11&gt;
GRASS GIS <https://grass.osgeo.org>

#3697: sqlite db: automatically enable vacuum
--------------------------+---------------------------------
  Reporter: mlennert | Owner: grass-dev@…
      Type: enhancement | Status: closed
  Priority: normal | Milestone: 7.6.2
Component: Database | Version: svn-releasebranch76
Resolution: fixed | Keywords: sqlite vacuum
       CPU: Unspecified | Platform: Unspecified
--------------------------+---------------------------------
Changes (by wenzeslaus):

* status: reopened => closed
* resolution: => fixed

Comment:

This was committed to trunk as well as backported to 7.6 release branch at
that time, so closing this one as fixed.
--
Ticket URL: <https://trac.osgeo.org/grass/ticket/3697#comment:14&gt;
GRASS GIS <https://grass.osgeo.org>