On Friday 08 May 2009, C Michael Barton wrote:
On 5/7/09 12:23 PM, "Dylan Beaudette" <debeaudette@ucdavis.edu> wrote:
On Thursday 07 May 2009, Moritz Lennert wrote:
> On 07/05/09 14:04, Hamish wrote:
> >>> todo: for sqlite decide if we want per-map DB files
> >>> like with DBF or single DB file per mapset.
> >
> > ...
> >
> >> I would vote for both approaches. User can decide what he
> >> prefers via environmental variable, e.g. GRASS_SQLITE_PER_MAPSET.
> >
> > I think it would have to be a g.gisenv variable, or perhaps stored in
> > the $MAPSET/VAR file.
> >
> > which way would be the default?
>
> I would plead for per mapset as this allows for more complete database
> approach (i.e. joins, views, etc).
>
> MoritzI also agree on this matter. I think that one thing that is making this
decision particularly difficult is that we are lacking a robust interchange
format for complex vector data. Dumping from GRASS vectors --> shapefiles
--> GRASS can sometimes lead to data loss. I wonder if further development
of the SQLite spatial datatypes would help address this problem.Dylan
Hi Michael and others,
Note that one database per vector file, as we now have with the more
limited dbf format, does not preclude joins, views, schemas, etc.
That may be true, but what mechanism is 'doing' the joining? I realize that
with the DBF interface all 'SQL-like' operations are somewhat of an
abomination, but with something like SQLite at least we know that a robust
system can deal with most of the SQL thrown at it. This is especially
important when trying to work with subqueries or LEFT joins. I am not sure
how these operations would be possible between different (SQLite) database
files. This could be due to my limited experience with SQLite...
The only
thing it does is prevent joins between tables actively linked to different
vectors. That is, if you have a vector file of towns and a vector file of
soil sampling points, you can't join tables from these two different vector
files without exporting a table from one and importing it into the other.
(We could probably make this a relatively painless procedure in GRASS).
This is more useful than it appears-- as it allows for arbitrary transfer of
attributes from one feature to another via a join.
I can imagine situations where one might want to do such joins between
different vectors, but question whether it is really all that common. In
many (perhaps most) cases, different vector files represent different kinds
of information such that their attribute tables cannot be joined in a
meaningful way. And are joins between different vector files sufficiently
common to warrant the loss of portability that happens by combining all
attribute tables from all vector files in a mapset into a single SQL
database?
I am not sure how common this is, but I can tell you that I do this type of
operation all of the time when working with soils and land-use information.
Sometimes the join condition is spatial, but more often it is based on
attributes and look-up tables.
One potential down-side to a single SQLite db file that I recently encountered
is locking-- I was unable to perform some actions on vector data when a
long-running vector operation was in progress. It seems to be related to how
locking of the attribute tables works-- I am not sure if it is possible to
use table-level locking rather than file-level locking.
Either way, I am excited to see the DBF backend replaced by something like
SQLite.
Cheers,
Dylan
Michael