[Geoserver-devel] Feature Versioning

Thanks for your replies - lots of good thoughts in there.

I think we're concentrating a little bit too much on feature versioning
here; what I was proposing was supposed to be much more generic than
that. I was just replying to a message someone had sent me on
versioning - it's really about extending the functionality of OGC
services (mostly WFS) by using WFS services.

I'm not at all fixed on any particular technology like XSLT - I only
mentioned it because it seems to fit well and everyone I've talked to
who has used it really likes it. I've had to use it in a project - I
didn't even know it was XSLT and it only took me a few minutes to
figure out what it was doing. I would agree that it was a bit more
difficult to debug, but I think that was because the program was just
eating the exceptions and never reporting them.

I think most of the things people will be using it for would be pretty
simple, and they could code something up in no time - even if they don't
really know Java or geotools. That's a big set of developers who could
potentially contribute.

I'm sure many people might rather use the Geoserver WFS/WMS request
parser and directly manipulate the Java objects it spits out. That's
fine too.

I'm just trying to make something that will leverage a bunch of other
stuff. It's much easier to just stick these types of things at the
datastore level - and it has some advantages (like being able to
leverage SDE's versioning system for the SDE datastore). I'm also
trying to convince people to use the OGC interfaces instead of directly
connecting to their underlying systems.

----------------------------------

Jody mentioned implementing the spec, but I didn't see any real
reference to it. Does anyone know where an actual spec is?

WFS - page 38:

The featureVersion attribute is included in order to accommodate
systems that support feature versioning. A value of ALL indicates that
all versions of a feature should be fetched. Otherwise, an integer, n,
can be specified to return the nth version of a feature. The version
numbers start at 1, which is the oldest version. If a version value
larger than the largest version number is specified, then the latest
version is returned. The default action shall be for the query to
return the latest version. Systems that do not support versioning can
ignore the parameter and return the only version that they have.

This is a pretty poor interface to versions - especially since there
isn't any way to know what version of a feature the system has except
to get them all, then note the highest version number. There's no time
reference either. It seems extremely difficult and unwieldy. There's
no mention of it in Transaction or GetFeatureInfo or Lock.
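
To make the quoted rule concrete, here is a rough Python sketch of how a
server might resolve featureVersion against a feature's history. The
function name select_versions and the list-of-versions representation
are made up for illustration, and the spec text above doesn't say what
happens for n < 1, so raising an error there is just my assumption.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def select_versions(versions: Sequence[T],
                    feature_version: Optional[str] = None) -> List[T]:
    """Pick versions of one feature according to featureVersion.

    `versions` is ordered oldest first, so versions[0] is version 1.
    (Hypothetical helper; not part of any WFS implementation.)
    """
    if not versions:
        return []
    if feature_version is None:
        return [versions[-1]]            # default action: latest version
    if feature_version == "ALL":
        return list(versions)            # every version, oldest first
    n = int(feature_version)
    if n < 1:
        # Assumption: the quoted text does not cover this case.
        raise ValueError("version numbers start at 1")
    index = min(n, len(versions)) - 1    # too-large n falls back to latest
    return [versions[index]]


history = ["v1 geometry", "v2 geometry", "v3 geometry"]
latest = select_versions(history)
# The only way for a client to learn the highest version number:
highest = len(select_versions(history, "ALL"))
```

Note that the last line is exactly the complaint: the client has to pull
every version just to find out how many there are.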

There wasn't any reference to feature version in the GML 2.1.2 spec.

Could you give a reference to the spec you were talking about? I'd like
to read it and see how they're supposed to be implemented in WFS/WMS.
Does the WFS 1.1 spec talk about versioning yet?

----------------------------------

A few people had the idea of implementing feature versioning as a
"wrapper" on datastore so we could have individual datastores do a
better job (like the postgis Filter encoder vs the shapefile filter
encoder). The wrapper on the WFS datastore could be re-used by the
geocollaborator to implement versioning.

I'm not going to say that this is a bad idea, since it has some nice
features (no pun intended). The biggest is that it would be more
"accessible" to the base geotools core - meaning that people could just
use a versioning postgis datastore instead of trying to do the
versioning themselves.

Versioning in a DB system can probably be done with triggers and rules, or
using the DB's built-in versioning system (if present).

Specifically, for the WFS datastore; if an auxiliary table (FeatureType)
is being used to store versions, then you'll want to have this table
accessible via WFS instead of storing it locally.

The actual geotools datastore API could be quite difficult to control -
it's likely to be more difficult to implement than Transactions. The
WFS request/response model makes the problem significantly less
difficult.

I'm also trying to actively encourage people to use WFS instead of
directly accessing their data. It just makes good architectural sense.

On the other hand, if there's already a versioning WFS datastore, the
"geocollaborator" versioning system should be pretty easy to implement!
That would mean we'd have lots of time to spend on the core and having
lots of example xforms.

Ideally I'd like to see versioning directly in the 'geocollaborator'
since it would make a good example and people would actually be able to
modify it to suit their needs. If it's in geotools, we're significantly
reducing the number of people who could potentially help.

-----------------------------------------------------

If we do get a really nice WFS datastore, we should consider making it
available via a .dll/.so using compiled java (see the 'Spatial DB in a
Box' for Postgresql which compiled JTS to a .so). This would allow
everyone (especially the "C/C++" folks) to easily talk to WFS and parse
the results. I expect the .so to be huge, though...

It's, unfortunately, probably a bit more difficult than you'd think.

------------------------------------------------------------------

Here's an example of the simplest versioning system that I can think of
(I'm just writing this off the top of my head, I haven't put much
thought into it):

a) Add 2 integer attributes to any FeatureType
       not_valid_before and not_valid_after
   These will have an integer version of time (ie. milliseconds since
1970 or what have you).

b) update all the existing features so that they have the
not_valid_before as some very distant time in the past (ie "0"), and
the not_valid_after is NULL.

c) if it's in a database, you can create a view that's basically:
    SELECT * FROM <table> WHERE not_valid_after IS NULL;

    This will give you a table with the most-up-to-date version.
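
As a quick sanity check of the setup steps, here is a minimal sketch
using Python's sqlite3 module. The roads table, its columns and the
roads_current view name are invented for the example; a real deployment
would be on postgis or similar and the DDL would differ a bit. It uses 0
(the epoch) as the not_valid_before value for pre-existing rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A pre-existing, unversioned table (hypothetical example data).
conn.execute("CREATE TABLE roads (fid INTEGER, name TEXT)")
conn.execute("INSERT INTO roads VALUES (1, 'Main St')")

# (a) add the two integer versioning attributes
conn.execute("ALTER TABLE roads ADD COLUMN not_valid_before INTEGER")
conn.execute("ALTER TABLE roads ADD COLUMN not_valid_after INTEGER")

# (b) existing rows: valid since time 0, and still current (NULL)
conn.execute("UPDATE roads SET not_valid_before = 0, not_valid_after = NULL")

# (c) a view that only exposes the most-up-to-date version of each row
conn.execute(
    "CREATE VIEW roads_current AS "
    "SELECT * FROM roads WHERE not_valid_after IS NULL"
)

rows = conn.execute(
    "SELECT fid, name, not_valid_before, not_valid_after FROM roads_current"
).fetchall()
```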

We're ready to version this dataset:

a) (WFS read request - most-up-to-date)
     xform the Filter so it's:
      <And>
         ... original query filter
         <PropertyIsNull>
             <PropertyName>not_valid_after</PropertyName>
         </PropertyIsNull>
      </And>

      Optional: if they request "all FeatureType attributes" to be
returned in the request, we can just replace it with a sub-list of
attribute names w/o the 2 "versioning" attributes.
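
Here is roughly what that rewrite looks like if done in code instead of
XSLT, sketched with Python's xml.etree.ElementTree. The function name
restrict_to_current is made up, and I'm leaving out the ogc: namespace
handling to keep it short. Rows that are still current have
not_valid_after unset, so the sketch tests it with PropertyIsNull.

```python
import xml.etree.ElementTree as ET


def restrict_to_current(original_filter: ET.Element) -> ET.Element:
    """Wrap a filter so it only matches the current version of a feature.

    Produces <And> original, <PropertyIsNull>not_valid_after</...> </And>.
    (Hypothetical helper; namespaces are omitted for brevity.)
    """
    combined = ET.Element("And")
    combined.append(original_filter)
    is_null = ET.SubElement(combined, "PropertyIsNull")
    ET.SubElement(is_null, "PropertyName").text = "not_valid_after"
    return combined


# The filter the client originally sent (example only).
original = ET.fromstring(
    "<PropertyIsEqualTo>"
    "<PropertyName>name</PropertyName><Literal>Main St</Literal>"
    "</PropertyIsEqualTo>"
)
current_only = restrict_to_current(original)
print(ET.tostring(current_only, encoding="unicode"))
```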

b) (WFS read request - time-stamp)
     xform the Filter so it's:
   <And>
      ... original query filter
      <And>
        <PropertyIsLessThanOrEqualTo>
          <PropertyName>not_valid_before</PropertyName>
          <Literal>...time in milliseconds</Literal>
        </PropertyIsLessThanOrEqualTo>
        <Or>
          <PropertyIsGreaterThanOrEqualTo>
            <PropertyName>not_valid_after</PropertyName>
            <Literal>...time in milliseconds</Literal>
          </PropertyIsGreaterThanOrEqualTo>
          <PropertyIsNull>
            <PropertyName>not_valid_after</PropertyName>
          </PropertyIsNull>
        </Or>
      </And>
   </And>

         Optional: if they request "all FeatureType attributes" to be
returned in the request, we can just replace it with a sub-list of
attribute names w/o the 2 "versioning" attributes.
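
The time-stamp case can be sketched the same way. This is only my
reading of "valid at time t": the version started at or before t
(not_valid_before <= t) and either ended at or after t or has not ended
at all (not_valid_after is null). The helper names are hypothetical and
namespaces are again left out.

```python
import xml.etree.ElementTree as ET


def _comparison(tag: str, prop: str, value: int) -> ET.Element:
    """Build <tag><PropertyName>prop</PropertyName><Literal>value</Literal></tag>."""
    node = ET.Element(tag)
    ET.SubElement(node, "PropertyName").text = prop
    ET.SubElement(node, "Literal").text = str(value)
    return node


def restrict_to_time(original_filter: ET.Element, millis: int) -> ET.Element:
    """Wrap a filter so it only matches versions valid at `millis`."""
    started = _comparison("PropertyIsLessThanOrEqualTo",
                          "not_valid_before", millis)
    ended_later = _comparison("PropertyIsGreaterThanOrEqualTo",
                              "not_valid_after", millis)
    still_current = ET.Element("PropertyIsNull")
    ET.SubElement(still_current, "PropertyName").text = "not_valid_after"

    not_ended = ET.Element("Or")          # ended later, or never ended
    not_ended.extend([ended_later, still_current])

    valid_at = ET.Element("And")
    valid_at.extend([started, not_ended])

    combined = ET.Element("And")
    combined.extend([original_filter, valid_at])
    return combined


original = ET.fromstring(
    "<PropertyIsEqualTo>"
    "<PropertyName>name</PropertyName><Literal>Main St</Literal>"
    "</PropertyIsEqualTo>"
)
as_of = restrict_to_time(original, 1500)
print(ET.tostring(as_of, encoding="unicode"))
```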

c) (WFS Insert)
      Add the not_valid_before (current time in milliseconds-1) and
not_valid_after (null) to the feature being inserted.

d) (WFS Delete)
      convert it to an update, set not_valid_after to the current time
in milliseconds.

e) (WFS Update) this is the tricky one.
      1. grab the current version of the feature
      2. do this in a single transaction
           a) set not_valid_after to the current time in milliseconds
for the feature with not_valid_after that's currently NULL.
           b) insert a new feature with not_valid_after NULL and
not_valid_before the current time (in milliseconds-1)

NOTE: I'm using (milliseconds-1) so that each feature has a
non-overlapping valid timeframe (you could also do this with ">" vs
"<="). Only one write transaction can occur at a time. Each write
transaction has to take at least 1 millisecond.
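
For what it's worth, here is a small sketch of the three write
operations against a database table, using Python's sqlite3. The roads
table and the function names are invented, "now" is passed in explicitly
so the example is repeatable, and re-using the same fid for every
version is purely an assumption to keep the sketch short. It follows the
timestamps exactly as listed above (now-1 for not_valid_before, now for
not_valid_after).

```python
import sqlite3


def insert_feature(conn, fid, name, now):
    # (c) Insert: new row is current, so not_valid_after stays NULL.
    conn.execute("INSERT INTO roads VALUES (?, ?, ?, NULL)",
                 (fid, name, now - 1))


def delete_feature(conn, fid, now):
    # (d) Delete becomes an update that closes the current version.
    conn.execute(
        "UPDATE roads SET not_valid_after = ? "
        "WHERE fid = ? AND not_valid_after IS NULL",
        (now, fid),
    )


def update_feature(conn, fid, new_name, now):
    # (e) step 1: grab the current version of the feature.
    current = conn.execute(
        "SELECT fid FROM roads WHERE fid = ? AND not_valid_after IS NULL",
        (fid,),
    ).fetchone()
    if current is None:
        raise LookupError(f"no current version of feature {fid}")
    # (e) step 2: close the old version and add the new one together;
    # `with conn` commits both statements or rolls both back.
    with conn:
        delete_feature(conn, fid, now)
        insert_feature(conn, fid, new_name, now)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE roads (fid INTEGER, name TEXT, "
    "not_valid_before INTEGER, not_valid_after INTEGER)"
)
insert_feature(conn, 1, "Main St", now=1000)
update_feature(conn, 1, "Main Street", now=2000)
delete_feature(conn, 1, now=3000)
conn.commit()

history = conn.execute(
    "SELECT name, not_valid_before, not_valid_after "
    "FROM roads ORDER BY not_valid_before"
).fetchall()
```

After the delete nothing has a NULL not_valid_after, so the
"most-up-to-date" view is empty while the full history is still there.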

The biggest problems with this are (1) you have to actually modify the
base table and (2) FID information needs work. There are a few other
problems too...

This is pretty simple (and effective), but there are much better ways of
doing it.

I also hope that you can see that XSLT would make this trivial (except
for, perhaps, (e)).

NOTE: adding the "who" component is left as a simple exercise for the
reader.

dave
