Hi,
I'd like to propose the addition of a new community module that
will add limited support for server side joining by means of
filter functions.
The idea is to be able to support the commonly requested use case
of "find all bus stops within x meters from this bank" or "find all the
maple trees in this land parcel" where the bank or the land parcel
are in another WFS layer and are identified by id or some other
attribute query.
We don't support that right now, and the client has to make two
separate requests. That gets ugly pretty quickly, as the client has
to make two round trips and JavaScript clients cannot really handle
huge geometries (what if the area of interest is the coastline of
Norway, for example?).
The idea of the module is to have filter functions that do the queries,
so that everything happens server side, where there are many more
resources available to do the job:
<wfs:GetFeature ... service="WFS" version="1.0.0">
  <wfs:Query typeName="busStops">
    <ogc:Filter>
      <ogc:DWithin>
        <ogc:PropertyName>busStopGeom</ogc:PropertyName>
        <ogc:Function name="querySingle">
          <ogc:Literal>banks</ogc:Literal> <!-- the layer -->
          <ogc:Literal>bankGeom</ogc:Literal> <!-- the geometry attribute -->
          <ogc:Literal>BANK_ID = 'abcde'</ogc:Literal> <!-- CQL filter -->
        </ogc:Function>
        <ogc:Distance units="meter">1000</ogc:Distance>
      </ogc:DWithin>
    </ogc:Filter>
  </wfs:Query>
</wfs:GetFeature>
The module would add this and other functions that extract an
attribute value from a layer in GeoServer, effectively running
both queries right in the server.
A version that picks multiple features would look like:
<wfs:GetFeature ... service="WFS" version="1.0.0">
  <wfs:Query typeName="busStops">
    <ogc:Filter>
      <ogc:DWithin>
        <ogc:PropertyName>busStopGeom</ogc:PropertyName>
        <ogc:Function name="collectGeometries"> <!-- collapse list to multigeom -->
          <ogc:Function name="queryCollection">
            <ogc:Literal>banks</ogc:Literal> <!-- the layer -->
            <ogc:Literal>bankGeom</ogc:Literal> <!-- the geometry -->
            <ogc:Literal>BANK_NAME = 'Metro'</ogc:Literal> <!-- CQL -->
          </ogc:Function>
        </ogc:Function>
        <ogc:Distance units="meter">1000</ogc:Distance>
      </ogc:DWithin>
    </ogc:Filter>
  </wfs:Query>
</wfs:GetFeature>
This uses two filter functions: the first returns a list of values,
the second collapses them into a single geometry for DWithin to use.
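To make the intended behaviour a bit more concrete, here is a rough
in-memory sketch of what the three functions would compute. It is not
GeoServer or GeoTools code: the JoinFunctions class is hypothetical,
layers are plain maps, geometries are WKT strings, and a Java predicate
stands in for the CQL filter. Returning the first match (or null) from
querySingle is also just an assumption for the sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical in-memory stand-in for the proposed functions. */
final class JoinFunctions {
    // layer name -> features; each feature maps attribute name -> value
    private final Map<String, List<Map<String, Object>>> layers;

    JoinFunctions(Map<String, List<Map<String, Object>>> layers) {
        this.layers = layers;
    }

    /** Returns the attribute value of every feature in the layer that matches the filter. */
    List<Object> queryCollection(String layer, String attribute,
            Predicate<Map<String, Object>> filter) {
        List<Object> values = new ArrayList<>();
        List<Map<String, Object>> features =
                layers.getOrDefault(layer, Collections.emptyList());
        for (Map<String, Object> feature : features) {
            if (filter.test(feature)) {
                values.add(feature.get(attribute));
            }
        }
        return values;
    }

    /** Assumption: returns the first matching value, or null when nothing matches. */
    Object querySingle(String layer, String attribute,
            Predicate<Map<String, Object>> filter) {
        List<Object> values = queryCollection(layer, attribute, filter);
        return values.isEmpty() ? null : values.get(0);
    }

    /** Collapses a list of WKT geometries into a single WKT geometry collection. */
    static String collectGeometries(List<Object> geometries) {
        if (geometries.isEmpty()) {
            return "GEOMETRYCOLLECTION EMPTY";
        }
        List<String> parts = new ArrayList<>();
        for (Object geometry : geometries) {
            parts.add(String.valueOf(geometry));
        }
        return "GEOMETRYCOLLECTION (" + String.join(", ", parts) + ")";
    }
}
```

The result of querySingle or collectGeometries is what DWithin would
then receive as its second operand, in place of a literal geometry
sent by the client.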
Now of course the second function might get dangerous, so there
would be configurable limits on the number of records returned and
on their size:
- if queryCollection returns more than x records, boom, service exception
- if collectGeometries ends up with too large a geometry (as counted by
the number of ordinates), boom again
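The two limits above could be enforced along these lines. This is only
a sketch under stated assumptions: JoinLimits and its method names are
made up for illustration, each geometry is modelled as a flat array of
ordinates, and IllegalStateException stands in for the service
exception the real module would raise.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical holder for the two configurable limits. */
final class JoinLimits {
    private final int maxRecords;
    private final long maxOrdinates;

    JoinLimits(int maxRecords, long maxOrdinates) {
        this.maxRecords = maxRecords;
        this.maxOrdinates = maxOrdinates;
    }

    /** Drains the results, failing as soon as the record limit is exceeded. */
    List<double[]> collect(Iterator<double[]> results) {
        List<double[]> records = new ArrayList<>();
        while (results.hasNext()) {
            records.add(results.next());
            if (records.size() > maxRecords) {
                // stand-in for a service exception
                throw new IllegalStateException(
                        "Join returned more than " + maxRecords + " records");
            }
        }
        return records;
    }

    /** Sums the ordinates of all geometries, failing when the total is too large. */
    long checkOrdinates(List<double[]> geometries) {
        long total = 0;
        for (double[] ordinates : geometries) {
            total += ordinates.length;
            if (total > maxOrdinates) {
                throw new IllegalStateException(
                        "Collected geometry has more than " + maxOrdinates + " ordinates");
            }
        }
        return total;
    }
}
```

Checking while reading, rather than after the fact, means the server
stops pulling records as soon as a limit is exceeded instead of first
loading the whole result in memory.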
In order to make this work efficiently we'll also need another
bit in GeoTools: constant function elision.
With the current implementation a function in the filter is evaluated
in memory for each returned feature, which is extremely inefficient.
I want to have a way to mark a function so that, if it's not using
any feature attribute, we can assume its result is going to be a
constant, and thus it can be optimized out and replaced with a literal
by evaluating it just once.
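A minimal sketch of the elision idea follows. The expression types here
(Expr, Literal, PropertyName, FunctionExpr) are simplified stand-ins
written for this example, not the GeoTools filter API, and the "stable"
flag is a hypothetical way of marking a function as safe to evaluate
once per request.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Simplified expression model, for illustration only. */
interface Expr {
    Object evaluate(Map<String, Object> feature);
    boolean usesAttributes();
}

final class Literal implements Expr {
    final Object value;
    Literal(Object value) { this.value = value; }
    public Object evaluate(Map<String, Object> feature) { return value; }
    public boolean usesAttributes() { return false; }
}

final class PropertyName implements Expr {
    final String name;
    PropertyName(String name) { this.name = name; }
    public Object evaluate(Map<String, Object> feature) { return feature.get(name); }
    public boolean usesAttributes() { return true; }
}

final class FunctionExpr implements Expr {
    final List<Expr> args;
    final Function<List<Object>, Object> body;
    final boolean stable; // hypothetical marker: same arguments give the same result
    int calls = 0;        // counts evaluations, to show the effect of elision

    FunctionExpr(List<Expr> args, Function<List<Object>, Object> body, boolean stable) {
        this.args = args;
        this.body = body;
        this.stable = stable;
    }

    public Object evaluate(Map<String, Object> feature) {
        calls++;
        List<Object> values = new ArrayList<>();
        for (Expr arg : args) {
            values.add(arg.evaluate(feature));
        }
        return body.apply(values);
    }

    public boolean usesAttributes() {
        for (Expr arg : args) {
            if (arg.usesAttributes()) {
                return true;
            }
        }
        return false;
    }
}

final class ConstantElision {
    /** Replaces a marked, attribute-free function with a literal holding its value. */
    static Expr simplify(Expr expr) {
        if (expr instanceof FunctionExpr) {
            FunctionExpr function = (FunctionExpr) expr;
            if (function.stable && !function.usesAttributes()) {
                // evaluated exactly once, with no feature at hand
                return new Literal(function.evaluate(Collections.emptyMap()));
            }
        }
        return expr;
    }
}
```

With this in place a querySingle call whose arguments are all literals
would run its query once before the scan starts, and every feature
would then be compared against the resulting literal. A function that
reads a feature attribute is left untouched and still runs per feature.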
Opinions?
Cheers
Andrea
PS: I know the "right" thing would be to actually support joins, but
I also believe everybody understands that what I'm proposing can be
done in days and gets some of the benefit, with limits on how large
the join can be, whilst actual join support would take many weeks
of work and various changes in the gt2 api as well (Query and datastore
modifications, new datastores, native join support in databases and
the like).
--
Ing. Andrea Aime
Technical Lead
GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584962313
fax: +39 0584962313
http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.linkedin.com/in/andreaaime
http://twitter.com/geowolf