Hi folks,
I have had a chance to review the Operations API ideas, and would like to engage with a few of the issues and concepts.
For a start, I have put initial notes on support for "Community Schemas" up at http://docs.codehaus.org/display/GEOTOOLS/Community%2BSchema%2BSupport%2Band%2BComplex%2BTypes
For the Operations API concept the following points strike me:
1) There is a Web Services Resource Framework - coming out of the GRID computing world - that would be the logical way to invoke such operations. The UK NERC data grid people
(Andrew Woolf in particular) have done some initial thinking about bridging GRID and OGC services. I believe they would have an interest and considerable insight into the requirements of such a capability.
2) Many of the operations suggested exploit the fundamental duality of FeatureCollections and Coverages.
3) Real-world complex (normalised) data stores are often designed to support only certain query patterns. WFS needs a means to advertise the supported patterns. This is somewhat similar to stored procedures. We are looking at a plug-in query strategy, but the operations API is a useful abstraction: query and processing are both in frame.
4) There are significantly useful cases where the "batch processing" model isn't required. In particular, returning gridded feature counts (biota sightings) is something we have been able to do quite cost-effectively on million-row Postgres databases in real time.
5) Such operations throw up lots of issues regarding the semantics of the "information products" on offer, so it seems logical to first crack the "community schema" support issue (aka well-known semantics) before tackling trickier things such as determining a useful meta-language for arbitrary operations. (Consider letting the community schema approach provide for simple, externally declared information products, so that the operations become a pluggable configuration issue.)
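To make point 4 concrete, here is a rough sketch of what a gridded count amounts to: snap each sighting to a grid cell and count per cell. In the database this would be a single grouped aggregate; the Python below only illustrates the same idea in memory. The function name `grid_counts`, the cell size and the sample points are all made up for illustration and are not part of any existing API.

```python
from collections import Counter
from math import floor


def grid_counts(points, cell_size):
    """Count points per square grid cell.

    points: iterable of (x, y) coordinates (hypothetical sighting locations).
    cell_size: edge length of a grid cell, in the same units as the points.
    Returns a dict mapping (column, row) cell indices to a count.
    """
    counts = Counter()
    for x, y in points:
        # floor() keeps negative coordinates in the correct cell
        cell = (floor(x / cell_size), floor(y / cell_size))
        counts[cell] += 1
    return dict(counts)


# Illustrative data only
sightings = [(0.2, 0.3), (0.8, 0.9), (1.5, 0.1), (-0.5, 2.2)]
print(grid_counts(sightings, 1.0))
```

Because the result is one small row per occupied cell rather than one row per sighting, this kind of operation can be answered on request without a batch step.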
Rob Atkinson
cholmes@anonymised.com wrote:
So Jody and I got to talking about the ever elusive joins this evening,
and fairly confused ourselves, so he wanted me to throw the relevant
parts of our conversation up on this list.

From my end it looks like GeoServer is going to drive some development
work on this. I'd like to introduce Rob Atkinson, from Social Change
Online. He's been on the GeoServer list for a while, and he's got some
funding to do a few improvements. I think most of these are going to
need to take place in the geotools code base. He'll introduce exactly
what he wants to get done soon, with some good use cases, but basically
the brunt of the work is going to involve Joins.

jodygarnett: As for the joins there are several games afoot.
cholmesny: Really? What else?
jodygarnett: The original AttributeReader based approach; there are
also hints of an operational API that would need a similar construct.
cholmesny: what do you mean a 'similar construct' ?
jodygarnett: Apparently this is something a lot of GIS systems do so
Paul keeps trying to get me to look at it with respect to uDig.
jodygarnett: The idea that two FeatureSources are used by the same
"operation" to generate a single result.
jodygarnett: The most basic example is an attribute based match.
cholmesny: Can I then use that as a DataStore?
cholmesny: And define FeatureTypes from it?
jodygarnett: Yes you would pretty much have to - or at least write the
answer to a "Temporary" FeatureStore.
Should we set up a break out session on either fid-exp or operations
api?
cholmesny: Actually only the last is required...but we need a way to
define the featureTypes...
cholmesny: We definitely need a break out irc for the operations if
that's how we want to handle Joins.
jodygarnett: The "easy thing" to do is extend Query (so that it can join
two other queries). When everything is backed by the same Postgis we
can produce raw SQL.
cholmesny: Which are nice, because they can drive development - we've
always fallen short before since it's all been pie in the sky dreams.
jodygarnett: It is when things are backed by multiple DataStores that my
mind hurts (I don't know where to send the request).
jodygarnett: The right place to send it would be a "Catalog" that knew
both FeatureStores.
cholmesny: Well, I think you'd have a MultiDataStore, that took two or
more datastores to be constructed.
jodygarnett: (I think that is called a Catalog)
cholmesny: And then would define views - virtual feature stores - of the
contained attributes.
cholmesny: You know, I'm not actually sure I agree with your Catalog
semantic.
cholmesny: Because to me a catalog has a lot more to do with meta
information than actually getting the Data itself.
jodygarnett: I do want to split DataStore into two, to match the
GridCoverageExchangeAPI. Most methods would stay the same,
getTypeNames() would become a convenience method for a MetaData query.
cholmesny: Like a getContent call in the OGC catalog spec gets the meta
data, not the real data.
jodygarnett: Understood, I have read those specs now. We get half of
them implemented for GCE and then port that over to DataStore.
cholmesny: Ok.
jodygarnett: I still need that horrible "Catalog" construct I produced
before - I may rename it to DataRepository - to allow for Lock
Management across DataStores.
jodygarnett: Actually the same problem we have for Joining.
cholmesny: But yeah, this stuff is going to take a lot of thought - it's
definitely the next level for this stuff.
cholmesny: Yeah, DataRepository makes more sense to me.
jodygarnett: Locks and Joins are operations that cross DataStore
boundaries.
cholmesny: right.
jodygarnett: Wow that was a lot of thought in one going, I better save
that and attach it to a Jira wish list.
cholmesny: Yeah, let me think about the operations stuff. Because I do
need the results to be a FeatureStore for all intents and purposes.
cholmesny: Need to be able to return DescribeFeatureType, and to submit
queries against it.
jodygarnett: Me too, but I need to punt them to disk. And much like the
JAI constructs it is nice to define the abstract (say View) and allow
the user to choose where to place the explicit (say FeatureStores)
entries in the chain.
cholmesny: I guess at a basic level you could have the results be a
FeatureCollection, and then make a DataStore/FeatureStore out of that.
jodygarnett: FeatureCollection is bad - memory bound. The JAI approach
allows content to be "pulled" through the chain as needed.
jodygarnett: But that is about where my mind gets off.
jodygarnett: We should take this online, can you cut and paste the
correct bits of this talk to the list.

So the question is, can someone explain for real how an Operations API
would handle joins from different datastores? And how that fits in
with the other stuff we have for the Operations API:
http://geotools.codehaus.org/Operations+API ? And could someone
explain how we'd do joins with datastores and attribute readers (I may
attempt the latter). Rob, could you also forward on your use cases and
ideas to the geotools list? I think the use cases are key, as we've
talked about this and even attempted a few designs for it, but we've
never had anything that actually _uses_ them.

It is probably best to reply to one of the Jira issues that Jody just put
up, like http://jira.codehaus.org/browse/GEOT-175, so we can have this
thread online and archived on Jira.

Chris
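As a starting point for discussion, the "attribute based match" across two sources, with results "pulled" through as needed rather than collected in memory, might be sketched roughly as below. This is plain Python, not GeoTools code: features are modelled as simple dicts, and `attribute_join` and the sample data are hypothetical names invented for the example. It assumes one side is small enough to index in memory while the other side is streamed.

```python
from collections import defaultdict


def attribute_join(left, right, left_key, right_key):
    """Lazily join two feature sources on matching attribute values.

    left, right: iterables of dict-like features (stand-ins for two
    sources that may live in different data stores).
    Yields one merged dict per matching pair; left values win on
    attribute name clashes.
    """
    # Build an in-memory index over the (assumed smaller) right side
    index = defaultdict(list)
    for feature in right:
        index[feature[right_key]].append(feature)

    # Stream the left side; nothing is produced until the caller asks
    for feature in left:
        for match in index.get(feature[left_key], []):
            merged = dict(match)
            merged.update(feature)
            yield merged


# Illustrative data only
roads = [{"id": 1, "owner": "a"}, {"id": 2, "owner": "b"}, {"id": 3, "owner": "z"}]
owners = [{"owner": "a", "name": "Alice"}, {"owner": "b", "name": "Bob"}]

for row in attribute_join(roads, owners, "owner", "owner"):
    print(row)
```

The caller decides whether to iterate over the results once or write them to a temporary store, which is the choice discussed in the chat above. The sketch does not address where the join request should be sent when the two sources sit in different data stores.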