For those interested:
Dear lists,
This is a status update on the work done on the ComplexDataStore
project encouraged by Social Change Online.
The first phase of the project is almost complete, and a
working example is provided on the GeoServer complex-features branch.
The functional goal of this project's initial phase was to serve, through
WFS, spatial data that is stored in an organization's internal RDBMS, in a
form that conforms to an externally defined schema, or "community schema".
A number of things had to be done in order to seamlessly integrate this
functionality into the GeoServer product. Let's review them from the bottom
up:
1). Feature instances have to be served from a non-spatial RDBMS, where
the spatial attribute is constructed from a pair of table fields holding X
and Y ordinates.
2). For technical and/or business reasons, the dataset exposed has to be
derived from an SQL query, which provides the full power of the SQL language
for creating a "view" of the dataset that better represents the featureset to
serve, and allows native RDBMS optimizations.
3). Input queries against the FeatureType derived
from an SQL query must be translated to the correct backend SQL query.
4). As the output schema differs from the input one, it must be possible to
define the attribute mappings from the input FeatureType to the output
FeatureType, which involves not only direct mappings or "aliases", but the
ability to derive an output schema property from a combination of input
schema properties.
5). As the output schema defines a complex FeatureType, "complex" meaning
that attributes may have a multiplicity other than 0..1 or 1..1, and that
they may have properties nested to any depth, the GeoTools feature model
must support this kind of feature attribute.
6). Given support for complex attributes, and since the GeoTools restriction
model is based on the OGC Filter 1.0 spec, the GeoTools Filter implementation
must be able to define dataset restrictions using XPath attribute
expressions and evaluate them correctly.
7). For performance reasons, filters made against the complex output schema
must be "unrolled" to their equivalent constructs against the input schema,
avoiding a full run-time evaluation and allowing the backend datastore to
optimize the query.
8). Finally, all this functionality must integrate seamlessly into the
GeoServer WFS and WMS services, which means that the GML production of a
complex dataset must be possible and no code modifications should be needed
to the GeoServer codebase other than for fixing those cases where a flat
FeatureType structure is assumed.
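To make requirements 1) and 4) a bit more concrete, here is a minimal sketch
in plain Java. It is not GeoTools code: the Point class, the column names
(x_coord, y_coord, site_name, value, unit) and the output property names are
all invented for illustration. It only shows the idea of building a geometry
from two ordinate columns, aliasing one column, and deriving one output
property from a combination of input columns.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RowToFeatureSketch {

    /** Hypothetical stand-in for a point geometry; not a GeoTools or JTS class. */
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public String toString() {
            return "POINT (" + x + " " + y + ")";
        }
    }

    /**
     * Builds the output properties for one flat input row.
     * All column and property names here are assumptions made for the example.
     */
    static Map<String, Object> toOutput(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<String, Object>();

        // Requirement 1: the spatial attribute comes from a pair of ordinate columns.
        double x = ((Number) row.get("x_coord")).doubleValue();
        double y = ((Number) row.get("y_coord")).doubleValue();
        out.put("location", new Point(x, y));

        // Requirement 4, direct mapping ("alias"): same value under another name.
        out.put("siteName", row.get("site_name"));

        // Requirement 4, derived mapping: one output property built from two inputs.
        out.put("measurement", row.get("value") + " " + row.get("unit"));

        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<String, Object>();
        row.put("x_coord", 151.2);
        row.put("y_coord", -33.9);
        row.put("site_name", "Site A");
        row.put("value", 7.2);
        row.put("unit", "mg/L");
        System.out.println(toOutput(row));
    }
}
```

In the real implementation the derivation is expressed declaratively through
mapping expressions rather than hard-coded as above.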
These are the changes implemented to meet each of these specific
requirements:
- For 1), the geometryless datastore has been reviewed and extended to
support 2).
- For 2) and 3) a new subinterface of org.geotools.data.DataStore has been
defined, named SqlDataStore, which defines a method for registering feature
types from user-defined SQL queries, and a series of utility classes was
developed to help any existing JDBCDataStore implement this new interface.
- For 4) the complexDataStore plugin implements a FeatureTypeMapping class,
which acts as a container for the definition of attribute and id mappings
between an input schema and an output one. These mappings are based on a set
of XPath location paths that each address a target property, each one paired
with an org.geotools.filter.Expression that defines how to construct the
target property value from an input Feature, giving a lot of power and
flexibility to the attribute mapping. Also, an XML schema was defined to
support the persistence of the mappings and the output schema, since the
output schema needs to be parsed before use and, being defined externally,
cannot be acquired from an existing data source.
- For 5) and 6) the GeoTools Feature Model has been revised and updated,
generating a set of GeoAPI interfaces and an implementation with as much
functionality as this project needs. In the hope of providing a workable
upgrade path from the old Feature model to the new one, the old GeoTools
interfaces have been deprecated and their implementations moved on top of
the implementation of the new interfaces, ensuring that the whole
pre-existing code base still works as before (that is, no unit tests are
broken). The new interfaces are used explicitly only where needed, as in the
implementation of the complexds plugin and the review of the Filter package
to operate against complex attributes.
- For 7) each time a query is made against the output schema, it is
transformed to its equivalent in the input schema using a FilterVisitor
that basically replaces each attribute expression with the mapping
expression defined for it in a FeatureTypeMapping, and returns a new Filter
that operates against the surrogate feature type.
- For 8), a new GeoServer branch was created from trunk, the GeoTools
FeatureTransformer has been updated to encode complex attributes, the new
jars are contained in this GeoServer branch, and no code modifications were
needed to get it working other than some very trivial changes. That said,
this is the part that still needs a bit more work and testing, since
FeatureType encoding to XML schema isn't working yet, but WMS and GetFeature
queries seem to be working correctly.
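The filter "unrolling" described for 7) can be pictured with the following
self-contained sketch. It is only an illustration under my own assumptions:
the Expr, Property, Literal, Call and Equals types are hypothetical stand-ins
and not the org.geotools.filter classes, the XPath-like paths and column
names are invented, and it uses a plain recursive rewrite instead of the
actual FilterVisitor. The point is just that each output property reference
is replaced by the source expression it is mapped to.

```java
import java.util.HashMap;
import java.util.Map;

public class FilterUnrollSketch {

    /** Minimal expression tree; hypothetical, not org.geotools.filter.Expression. */
    interface Expr {
        String render();
    }

    /** Reference to a property by name or XPath-like path. */
    static final class Property implements Expr {
        final String name;
        Property(String name) { this.name = name; }
        public String render() { return name; }
    }

    static final class Literal implements Expr {
        final Object value;
        Literal(Object value) { this.value = value; }
        public String render() {
            return value instanceof Number ? value.toString() : "'" + value + "'";
        }
    }

    /** Function call over other expressions, e.g. concat(a, b). */
    static final class Call implements Expr {
        final String name;
        final Expr[] args;
        Call(String name, Expr... args) { this.name = name; this.args = args; }
        public String render() {
            StringBuilder sb = new StringBuilder(name).append('(');
            for (int i = 0; i < args.length; i++) {
                if (i > 0) sb.append(", ");
                sb.append(args[i].render());
            }
            return sb.append(')').toString();
        }
    }

    /** The only filter type in this sketch: left = right. */
    static final class Equals {
        final Expr left;
        final Expr right;
        Equals(Expr left, Expr right) { this.left = left; this.right = right; }
        String render() { return left.render() + " = " + right.render(); }
    }

    /** Replaces every output-schema property with its mapped source expression. */
    static Expr unroll(Expr e, Map<String, Expr> mappings) {
        if (e instanceof Property) {
            String path = ((Property) e).name;
            Expr source = mappings.get(path);
            if (source == null) {
                throw new IllegalArgumentException("No mapping for " + path);
            }
            return source;
        }
        if (e instanceof Call) {
            Call c = (Call) e;
            Expr[] rewritten = new Expr[c.args.length];
            for (int i = 0; i < c.args.length; i++) {
                rewritten[i] = unroll(c.args[i], mappings);
            }
            return new Call(c.name, rewritten);
        }
        return e; // literals are left untouched
    }

    static Equals unroll(Equals f, Map<String, Expr> mappings) {
        return new Equals(unroll(f.left, mappings), unroll(f.right, mappings));
    }

    public static void main(String[] args) {
        // Output path -> source expression (all names invented for the example).
        Map<String, Expr> mappings = new HashMap<String, Expr>();
        mappings.put("measurement/value", new Property("result_value"));
        mappings.put("station/label", new Call("concat",
                new Property("station_code"), new Literal("-"), new Property("station_name")));

        Equals onOutput = new Equals(new Property("station/label"), new Literal("A-North"));
        System.out.println(unroll(onOutput, mappings).render());
    }
}
```

Because the rewritten filter only refers to input schema properties, it can
be handed to the backend datastore, which is what lets the database optimize
the query instead of every feature being evaluated at run time.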
There are two code repository branches holding the implementation of this
functionality, one for GeoTools and another for GeoServer:
http://svn.geotools.org/geotools/branches/complex-features/
https://svn.codehaus.org/geoserver/branches/
If you're going to use the GeoTools branch, note that the new feature model
interfaces and implementation are in gt/module/main/opengis/src and
gt/module/main/opengis/test.
The above-mentioned GeoServer branch has working examples of all this; you
should only need to edit the file location paths in the conf directory and
set up the test database.
To set up the test database on PostgreSQL, use the script in
conf/data/featureTypes/wq_ir_results/create_wqdp.sql, and import the data in
the wq_ir_results.dat file.
To get everything running, adjust the database connection parameters and
file location paths in the GeoServer catalog.xml found in conf/WEB-INF.
If you want to see/test/change the attribute mapping definitions from input
to output schemas, edit the file wq_plus_mappings.xml or roadsegments.xml in
their respective feature type directories under conf/data.
As a final note, all this seems to be working, except for a few issues I am
going to fix in the next couple of days, but at least it is ready for testing
out. Once you have GeoServer running and have verified that the wq_plus
feature type is exposed in the GetCapabilities document, you can try the
provided sample WFS requests that you'll find on the GeoServer "demo
requests" web page, or open the map preview page to see them being served by
the WMS.
As always, comments/suggestions are appreciated, and thanks to all for your
involvement and support. It was a pleasure working on this project, and a bit
of a pain too, due to the immense amount of work that had to be done to get
it as right as we could. Fortunately, I think we have exceeded the initial
project goals, and there is room for further enhancements and contributions.
Gabriel.