[Geoserver-devel] Re: Joins, possibly using the operations API?

Hi folks...

I have had a chance to review the Operations API ideas, and would like to engage with a few of the issues and concepts.

For a start, I have put initial notes on support for "Community Schemas" up at http://docs.codehaus.org/display/GEOTOOLS/Community%2BSchema%2BSupport%2Band%2BComplex%2BTypes

For the Operations API concept the following points strike me:
1) There is a Web Services Resource Framework - coming out of the GRID computing world - that would be the logical way to invoke such operations. The UK NERC data grid people (Andrew Woolf in particular) have done some initial thinking about bridging GRID and OGC services. I believe they would have an interest and considerable insight into the requirements of such a capability.

2) Many of the operations suggested exploit the fundamental duality of FeatureCollections and Coverages.

3) Real world complex (normalised) data stores are often designed to support only certain query patterns. WFS needs a means to advertise supported patterns. This is somewhat similar to stored procedures. We are looking at a plug-in query strategy, but the operations API is a useful abstraction - query and processing are both in frame.

4) There are significantly useful cases where the "batch processing" model isn't required. In particular, returning gridded feature counts (biota sightings) is something we have been able to do quite cost-effectively on million-row postgres databases in real time.

5) Such operations throw up lots of issues re the semantics of the "information products" on offer - so it seems logical to first crack the "community schema" support issue - aka well-known semantics - before tackling such tricky things as trying to determine a useful meta-language for arbitrary operations. (Consider letting the community schema approach provide for simple, externally declared, information products, and the operations become a pluggable configuration issue.)
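
A minimal sketch of the gridded-count idea from point 4, in plain Java rather than SQL. Everything in it is an assumption made for illustration: the class name GridCounts, the count method and the "col,row" cell key are hypothetical and are not part of GeoTools or GeoServer. In a database the same binning would presumably be a GROUP BY on the floored coordinates; this version only shows the arithmetic.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: count point sightings per grid cell in a single pass. */
final class GridCounts {

    /**
     * Returns a map from a "col,row" cell key to the number of points in that cell.
     * Each point is an {x, y} pair; cells are squares of side cellSize measured
     * from (originX, originY).
     */
    static Map<String, Integer> count(double[][] points, double originX,
                                      double originY, double cellSize) {
        Map<String, Integer> counts = new TreeMap<>();
        for (double[] p : points) {
            // floor() keeps negative coordinates in the correct cell.
            long col = (long) Math.floor((p[0] - originX) / cellSize);
            long row = (long) Math.floor((p[1] - originY) / cellSize);
            counts.merge(col + "," + row, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        double[][] sightings = { {0.2, 0.3}, {0.8, 0.1}, {1.5, 0.5}, {1.9, 1.2} };
        System.out.println(count(sightings, 0.0, 0.0, 1.0));
    }
}
```

The result is one small row per occupied cell rather than one row per sighting, which is presumably why this kind of product can be served without a batch step.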

Rob Atkinson

cholmes@anonymised.com wrote:

So Jody and I got to talking about the ever elusive joins this evening,
and fairly confused ourselves, so he wanted me to throw the relevant
parts of our conversation up on this list.

From my end it looks like GeoServer is going to drive some development
work on this. I'd like to introduce Rob Atkinson, from Social Change
Online. He's been on the GeoServer list for awhile, and he's got some
funding to do a few improvements. I think most of these are going to
need to take place in the geotools code base. He'll introduce exactly
what he wants to get done soon, with some good use cases, but basically
the brunt of the work is going to involve Joins.

jodygarnett: As for the joins there are several games afoot.
cholmesny: Really? What else?
jodygarnett: The original AttributeReader based approach; there are
hints of an operational API that would also need a similar construct.
cholmesny: what do you mean a 'similar construct' ?
jodygarnett: Apparently this is something a lot of GIS systems do so
paul keeps trying to get me to look at it with respect to uDig.
jodygarnett: The idea that two FeatureSources are used by the same
"operation" to generate a single result.
jodygarnett: The most basic example is an attribute based match.
cholmesny: Can I then use that as a DataStore?
cholmesny: And define FeatureTypes from it?
jodygarnett: Yes you would pretty much have to - or at least write the
answer to a "Temporary" FeatureStore.
Should we set up a break out session on either fid-exp or operations
api?
cholmesny: Actually only the last is required...but we need a way to
define the featureTypes...
cholmesny: We definitely need a break out irc for the operations if
that's how we want to handle Joins.
jodygarnett: The "easy thing" to do is extend Query (so that it can join
two other queries). When everything is backed by the same Postgis we
can produce raw SQL.
cholmesny: Which are nice, because they can drive development - we've
always fallen short before since it's all been pie in the sky dreams.
jodygarnett: It is when things are backed by multiple DataStores that my
mind hurts (I don't know where to send the request).
jodygarnett: The right place to send it would be a "Catalog" that knew
both FeatureStores.
cholmesny: Well, I think you'd have a MultiDataStore, that took two or
more datastores to be constructed.
jodygarnett: (I think that is called a Catalog)
cholmesny: And then would define views - virtual feature stores - of the
contained attributes.
cholmesny: You know, I'm not actually sure I agree with your Catalog
semantic.
cholmesny: Because to me a catalog has a lot more to do with meta
information than actually getting the Data itself.
jodygarnett: I do want to split DataStore into two, to match the
GridCoverageExchangeAPI. Most methods would stay the same,
getTypeNames() would become a convenience method for a MetaData query.
cholmesny: Like a getContent call in the OGC catalog spec gets the meta
data, not the real data.
jodygarnett: Understood, I have read those specs now. We get half of
them implemented for GCE and then port that over to DataStore.
cholmesny: Ok.
jodygarnett: I still need that horrible "Catalog" construct I produced
before - I may rename it to DataRepository - to allow for Lock
Management across DataStores.
jodygarnett: Actually the same problem we have for Joining.
cholmesny: But yeah, this stuff is going to take a lot of thought - it's
definitely the next level for this stuff.
cholmesny: Yeah, DataRepository makes more sense to me.
jodygarnett: Locks and Joins are operations that cross DataStore
boundaries.
cholmesny: right.
jodygarnett: Wow that was a lot of thought in one go, I better save
that and attach it to a Jira wish list.
cholmesny: Yeah, let me think about the operations stuff. Because I do
need the results to be a FeatureStore for all intents and purposes.
cholmesny: Need to be able to return DescribeFeatureType, and to submit
queries against it.
jodygarnett: Me too, but I need to punt them to disk. And much like the
JAI constructs it is nice to define the abstract (say View) and allow
the user to choose where to place the explicit (say FeatureStores)
entries in the chain.
cholmesny: I guess at a basic level you could have the results be a
FeatureCollection, and then make a DataStore/FeatureStore out of that.
jodygarnett: FeatureCollection is bad - memory bound. The JAI approach
allows content to be "pulled" through the chain as needed.
jodygarnett: But that is about where my mind gets off.
jodygarnett: We should take this online, can you cut and paste the
correct bits of this talk to the list.
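
The "attribute based match" mentioned in the log can be sketched as a plain hash join. This is only an illustration under stated assumptions: features are modelled as simple attribute maps, and AttributeJoin, its join method and the rightPrefix parameter are hypothetical names, not the AttributeReader or Query API. Note that it builds the whole result in memory, so it has exactly the memory-bound drawback raised against FeatureCollection above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: an attribute-based match between two sets of features. */
final class AttributeJoin {

    /**
     * Inner-joins two lists of features (modelled as attribute maps) on one
     * attribute from each side. Attributes copied from the right-hand feature
     * are renamed with rightPrefix so they cannot clash with left-hand names.
     */
    static List<Map<String, Object>> join(List<Map<String, Object>> left, String leftKey,
                                          List<Map<String, Object>> right, String rightKey,
                                          String rightPrefix) {
        // Index the right-hand side by its join attribute.
        Map<Object, List<Map<String, Object>>> index = new HashMap<>();
        for (Map<String, Object> r : right) {
            index.computeIfAbsent(r.get(rightKey), k -> new ArrayList<>()).add(r);
        }
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> l : left) {
            Object key = l.get(leftKey);
            if (key == null) {
                continue; // a missing join attribute never matches anything
            }
            for (Map<String, Object> r : index.getOrDefault(key, List.of())) {
                Map<String, Object> joined = new LinkedHashMap<>(l);
                r.forEach((name, value) -> joined.put(rightPrefix + name, value));
                result.add(joined);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> roads = List.of(
                Map.of("id", "r1", "owner", 10),
                Map.of("id", "r2", "owner", 11));
        List<Map<String, Object>> owners = List.of(Map.of("code", 10, "name", "City"));
        System.out.println(join(roads, "owner", owners, "code", "owner_"));
    }
}
```

The joined maps carry attributes from both sides, which is the point where a combined FeatureType would have to be defined if the result were to be exposed as a FeatureStore.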

So the question is, can someone explain for real how an Operations API
would handle joins from different datastores? And how that fits in
with the other stuff we have for the Operations API:
http://geotools.codehaus.org/Operations+API ? And could someone
explain how we'd do joins with datastores and attribute readers (I may
attempt the latter). Rob, could you also forward on your use cases and
ideas to the geotools list? I think the use cases are key, as we've
talked about this and even attempted a few designs for it, but we've
never had anything that actually _uses_ them.
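
One possible reading of the cross-DataStore question, offered as a sketch only: a repository object that knows both stores by name and hands back an iterator, so joined rows are pulled through on demand in the spirit of the JAI comparison in the log. DataRepositorySketch, register and join are hypothetical names, and the "stores" are stand-in suppliers of row iterators, not GeoTools DataStores. Only the right-hand side is indexed in memory; the left-hand side is read as the consumer asks for rows.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/** Hypothetical sketch: a repository that knows several stores and streams a join. */
final class DataRepositorySketch {

    // Each store is a named supplier of a fresh row iterator (stand-in for a DataStore).
    private final Map<String, Supplier<Iterator<Map<String, Object>>>> stores = new HashMap<>();

    void register(String name, Supplier<Iterator<Map<String, Object>>> store) {
        stores.put(name, store);
    }

    private Iterator<Map<String, Object>> open(String name) {
        Supplier<Iterator<Map<String, Object>>> store = stores.get(name);
        if (store == null) {
            throw new IllegalArgumentException("unknown store: " + name);
        }
        return store.get();
    }

    /** Streams joined rows; the left store is only read as rows are requested. */
    Iterator<Map<String, Object>> join(String leftStore, String leftKey,
                                       String rightStore, String rightKey) {
        // Index the right-hand store by its join attribute (held in memory).
        Map<Object, List<Map<String, Object>>> index = new HashMap<>();
        Iterator<Map<String, Object>> right = open(rightStore);
        while (right.hasNext()) {
            Map<String, Object> r = right.next();
            index.computeIfAbsent(r.get(rightKey), k -> new ArrayList<>()).add(r);
        }
        Iterator<Map<String, Object>> left = open(leftStore);
        return new Iterator<Map<String, Object>>() {
            private final Deque<Map<String, Object>> pending = new ArrayDeque<>();

            @Override
            public boolean hasNext() {
                // Pull left rows one at a time until a match is buffered or input ends.
                while (pending.isEmpty() && left.hasNext()) {
                    Map<String, Object> l = left.next();
                    for (Map<String, Object> r : index.getOrDefault(l.get(leftKey), List.of())) {
                        Map<String, Object> joined = new LinkedHashMap<>(l);
                        // On a name clash the left-hand attribute wins.
                        r.forEach((name, value) -> joined.putIfAbsent(name, value));
                        pending.add(joined);
                    }
                }
                return !pending.isEmpty();
            }

            @Override
            public Map<String, Object> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return pending.remove();
            }
        };
    }

    public static void main(String[] args) {
        DataRepositorySketch repo = new DataRepositorySketch();
        List<Map<String, Object>> sightings = List.of(Map.of("species", "wombat", "site", 7));
        List<Map<String, Object>> sites = List.of(Map.of("site", 7, "name", "Ridge"));
        repo.register("sightings", sightings::iterator);
        repo.register("sites", sites::iterator);
        repo.join("sightings", "site", "sites", "site").forEachRemaining(System.out::println);
    }
}
```

When both names resolve to the same database, a real implementation could presumably hand the whole join to SQL instead, as suggested in the log; the repository is simply the one place that can tell the two cases apart.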

Probably is best to reply to one of the jira issues that Jody just put
up, like http://jira.codehaus.org/browse/GEOT-175, so we can have this
thread online and archived on jira.

Chris

----------------------------------------------------------
This mail sent through IMP: https://webmail.limegroup.com/

Could we schedule an irc session for tomorrow, to discuss how to go about
implementing joins? I wanted to do it today, but was too busy to send out
a notice. Could everyone who is interested possibly make:

http://timeanddate.com/worldclock/fixedtime.html?day=10&month=6&year=2004&hour=18&min=0&sec=0&p1=179
?

I'll show up at that time, and hopefully we can at least get enough people
to get started. We may have to do more on email and in code, as we've got
a ton of time differences going on. But it'd be nice to get started in
real time. I saw the latest log, but still need to think about it some
more - I'll try to get some responses on email before tomorrow.

Chris

On Mon, 7 Jun 2004, Rob Atkinson wrote:

> [...]

--

I cannot make it then, may I suggest Friday,
http://timeanddate.com/worldclock/fixedtime.html?day=11&month=6&year=2004&hour=15&min=0&sec=0&p1=179

Thanks,
David

On Wed, 2004-06-09 at 12:24, Chris Holmes wrote:

> [...]

Hmmm... That probably won't work, as it's 5am on a Saturday in Sydney,
and since Rob and his team are the ones who are going to do most of the
implementation work (unless others have time, which is always tough with
geotools developers), we should do a time that they'll be able to make.

Any other suggestions? We could perhaps do two (or more); the issue is
certainly big enough for it...

On Wed, 9 Jun 2004, David Zwiers wrote:

I cannot make it then, may I suggest Friday,
http://timeanddate.com/worldclock/fixedtime.html?day=11&month=6&year=2004&hour=15&min=0&sec=0&p1=179

Thanks,
David

On Wed, 2004-06-09 at 12:24, Chris Holmes wrote:
> Could we schedule an irc session for tomorrow, to discuss how to go about
> implementing joins? I wanted to do it today, but was too busy to send out
> a notice. Could everyone who is interested possibly make:
>
> http://timeanddate.com/worldclock/fixedtime.html?day=10&month=6&year=2004&hour=18&min=0&sec=0&p1=179
> ?
>
> I'll show up at that time, and hopefully we can at least get enough people
> to get started. We may have to do more on email and in code, as we've got
> a ton of time differences going on. But it'd be nice to get started in
> real time. I saw the latest log, but still need to think about it some
> more - I'll try to get some responses on email before tomorrow.
>
> Chris
>
> On Mon, 7 Jun 2004, Rob Atkinson wrote:
>
> > Hi folks...
> >
> > have had a chance to review the Operations API ideas, and would like to
> > engage with a few of the issues and concepts.
> >
> > For a start, I have put initial notes on support for "Community Schemas"
> > up at
> > http://docs.codehaus.org/display/GEOTOOLS/Community%2BSchema%2BSupport%2Band%2BComplex%2BTypes
> >
> > For the Operations API concept the following points strike me:
> > 1) There is a Web Services Resource Framework - coming out of the GRID
> > computing world - that would be the logical way to invoke such
> > operations. The UK NERC data grid people
> > (Andrew Woolf in particular) have done some initial thinking about
> > bridging GRID and OGC services. I believe they would have an interest
> > and considerable insight into the requirements of such a capability.
> >
> > 2) Many of the operations suggested exploit the fundamental duality of
> > FeatureCollections and Coverages.
> >
> > 3) Real world complex (normalised) data stores are often designed to
> > support only certain query patterns. WFS needs a means to advertise
> > supported patterns. This is somewhat similar to stroed procedures. We
> > are looking at a plug-in query strategy, but the operations API is a
> > useful abstraction - query and processing are both in frame
> >
> > 4) There are significantly useful cases where the "batch processing"
> > model isnt required. In particular, returned gridded feature (biota
> > sightings) counts is something we have been able to do quite
> > cost-effectively on million-row postgres databases in real time.
> >
> > 5) Such operations throw up lots of issues re the semantics of the
> > "information products" on offer - so it seems logical to first crack the
> > "community schema" support issue - aka well-known semantics - before
> > such tricky things like trying to determine a useful meta-language for
> > arbitrary operations. (consider letting the community schema approach
> > provide for simple, externally declared, information products, and the
> > operations become a pluggable configuration issue.)
> >
> > Rob Atkinson
> >
> > cholmes@anonymised.com wrote:
> >
> > >So Jody and I got to talking about the ever elusive joins this evening,
> > >and fairly confused ourselves, so he wanted me to throw the relevant
> > >parts of our conversation up on this list.
> > >
> > >>From my end it looks like GeoServer is going to drive some development
> > >work on this. I'd like to introduce Rob Atkinson, from Social Change
> > >Online. He's been on the GeoServer list for awhile, and he's got some
> > >funding to do a few improvements. I think most of these are going to
> > >need to take place in the geotools code base. He'll introduce exactly
> > >what he wants to get done soon, with some good use cases, but basically
> > >the brunt of the work is going to involve Joins.
> > >
> > >
> > >jodygarnett: As for the joins there are several games a foot.
> > >cholmesny: Really? What else?
> > >jodygarnett: The origional AttributeReader based approach, there is
> > >hints of an operational API that woudl also need a similar construct.
> > >cholmesny: what do you mean a 'similar construct' ?
> > >jodygarnett: Apparently this is something a lot of GIS systems do so
> > >paul keeps trying to get me to look at it with respect to uDig.
> > >jodygarnett: The idea that two FeatureSources are used by the same
> > >"opperation" to generate a single result.
> > >jodygarnett: The most basic example is an attribute based match.
> > >cholmesny: Can I then use that as a DataStore?
> > >cholmesny: And define FeatureTypes from it?
> > >jodygarnett: Yes you would pretty much have to - or at least write the
> > >answer to a "Temporary" FeatureStore.
> > >Should we set up a break out session on either fid-exp or opeprations
> > >api?
> > >cholmesny: Actually only the last is required...but we need a way to
> > >define the featureTypes...
> > >cholmesny: We definitely need a break out irc for the operations if
> > >that's how we want to handle Joins.
> > >jodygarnett: The "easy thing" to do is extend Query (so that it can join
> > >two other queries). When everything is backed by the same Postgis we
> > >can produce raw SQL.
> > >cholmesny: Which are nice, because they can drive devopment - we've
> > >always fallen short before since it's all been pie in the sky dreams.
> > >jodygarnett: It is when things are backed by multiple DataStores that my
> > >mind hurts (I dont know where to send the request).
> > >jodygarnett: The right place to send it would be a "Catalog" that knew
> > >both FeatureStores.
> > >cholmesny: Well, I think you'd have a MultiDataStore, that took two or
> > >more datastores to be constructed.
> > >jodygarnett: (I think that is called a Catalog)
> > >cholmesny: And then would define views - virtual feature stores - of the
> > >contained attributes.
> > >cholmesny: You know, I'm not actually sure I agree with your Catalog
> > >semantic.
> > >cholmesny: Because to me a catalog has a lot more to do with meta
> > >information than actually getting the Data itself.
> > >jodygarnett: I do want to split DataStore into two, to match the
> > >GridCoverageExchangeAPI. Most methods would stay the same,
> > >getTypeNames() woudl become a convience method for a MetaData query.
> > >cholmesny: Like a getContent call i nthe ogc catalog spec gets the meta
> > >data, not the real data.
> > >jodygarnett: Understood, I have read those specs now. We get have of
> > >them implemented for GCE and then port that over to DataStore.
> > >cholmesny: Ok.
> > >jodygarnett: I still need that horrible "Catalog" construct I produced
> > >before - I may rename it to DataRepository - to allow for Lock
> > >Managmenet across DataStores.
> > >jodygarnett: Actually the same problem we have for Joining.
> > >cholmesny: But yeah, this stuff is going to take a lot of thought - it's
> > >definitely the next level for this stuff.
> > >cholmesny: Yeah, DataRepository makes more sense to me.
> > >jodygarnett: Locks and Joins are operations that cross DataStore
> > >boundaries.
> > >cholmesny: right.
> > >jodygarnett: Wow that was a lot of thought in one go, I better save
> > >that and attach it to a Jira wish list.
> > >cholmesny: Yeah, let me think about the operations stuff. Because I do
> > >need the results to be a FeatureStore for all intents and purposes.
> > >cholmesny: Need to be able to return DescribeFeatureType, and to submit
> > >queries against it.
> > >jodygarnett: Me too, but I need to punt them to disk. And much like the
> > >JAI constructs it is nice to define the abstract (say View) and allow
> > >the user to choose where to place the explicit (say FeatureStores)
> > >entries in the chain.
> > >cholmesny: I guess at a basic level you could have the results be a
> > >FeatureCollection, and then make a DataStore/FeatureStore out of that.
> > >jodygarnett: FeatureCollection is bad - memory bound. The JAI approach
> > >allows content to be "pulled" through the chain as needed.
> > >jodygarnett: But that is about where my mind gets off.
> > >jodygarnett: We should take this online, can you cut and paste the
> > >correct bits of this talk to the list?
> > >
> > >So the question is, can someone explain for real how an Operations API
> > >would handle joins from different datastores? And how that fits in
> > >with the other stuff we have for the Operations API:
> > >http://geotools.codehaus.org/Operations+API ? And could someone
> > >explain how we'd do joins with datastores and attribute readers (I may
> > >attempt the latter). Rob, could you also forward on your use cases and
> > >ideas to the geotools list? I think the use cases are key, as we've
> > >talked about this and even attempted a few designs for it, but we've
> > >never had anything that actually _uses_ them.
> > >
> > >It is probably best to reply to one of the jira issues that Jody just put
> > >up, like http://jira.codehaus.org/browse/GEOT-175, so we can have this
> > >thread online and archived on jira.
> > >
> > >Chris
> > >
> >
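
As a rough illustration of the "pulled through the chain" idea in the log
above, here is a minimal sketch of an attribute-based join across two
independent feature sources. It is written in Python for brevity and is not
GeoTools code: attribute_join, sightings, species and the dict-based Feature
are all hypothetical stand-ins. It assumes one side is small enough to index
in memory while the other side is streamed.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical stand-in for a feature: attribute name -> value.
Feature = Dict[str, object]


def attribute_join(
    left: Iterable[Feature],
    right: Iterable[Feature],
    left_key: str,
    right_key: str,
) -> Iterator[Feature]:
    """Lazily yield merged features whose key attributes match."""
    # Index the (assumed smaller) right-hand source once by its key.
    index: Dict[object, List[Feature]] = {}
    for feature in right:
        index.setdefault(feature[right_key], []).append(feature)

    # Stream the left-hand source one feature at a time.
    for feature in left:
        for match in index.get(feature[left_key], []):
            merged = dict(match)
            merged.update(feature)  # left attributes win on name clashes
            yield merged


def sightings() -> Iterator[Feature]:
    # Stand-in for a feature source backed by one data store.
    yield {"fid": "s1", "species_id": 7, "geom": "POINT(151 -33)"}
    yield {"fid": "s2", "species_id": 9, "geom": "POINT(150 -34)"}
    yield {"fid": "s3", "species_id": 7, "geom": "POINT(149 -35)"}


def species() -> Iterator[Feature]:
    # Stand-in for a feature source backed by a different data store.
    yield {"species_id": 7, "name": "koala"}
    yield {"species_id": 8, "name": "emu"}


if __name__ == "__main__":
    for joined in attribute_join(sightings(), species(),
                                 "species_id", "species_id"):
        print(joined["fid"], joined["name"])
```

Because attribute_join is a generator, nothing is read from either source
until the caller starts iterating, so the result could be written to a
temporary store feature by feature rather than collected in memory first.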

-------------------------------------------------------
This SF.Net email is sponsored by: GNOME Foundation
Hackers Unite! GUADEC: The world's #1 Open Source Desktop Event.
GNOME Users and Developers European Conference, 28-30th June in Norway
http://2004.guadec.org
_______________________________________________
Geotools-devel mailing list
Geotools-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geotools-devel

--

Your time works fine for me on Monday; I don't have any conflicting
meetings.

David

On Wed, 2004-06-09 at 12:55, Chris Holmes wrote:

Hmmm... That probably won't work, as it's 5am on a Saturday in Sydney,
and since Rob and his team are the ones who are going to do most of the
implementation work (unless others have time, which is always tough with
geotools developers), we should do a time that they'll be able to make.

Any other suggestions? We could perhaps do two (or more); the issue is
certainly big enough for it...

Unfortunately I can't make Monday - I'm off to the UK for a round of meetings and need to kick off some work with Peter before I go.

Rob

Ok, I'll meet with Rob and whoever else can make it tomorrow (or we
could try to jump on right now). And then we can talk more on Monday, as
part of or after the main geotools irc meeting. Sorry I've been off the
ball on setting this up; we really should have done it today, but I was
wrapped up in a conference myself.

Chris

On Thu, 10 Jun 2004, Rob Atkinson wrote:

Unfortunately I cant make monday - I off to UK for a round of meetings
and need to kick off some work with Peter before I go.

Rob

David Zwiers wrote:

>You time works fine for me on monday, I don't have any conflicting
>meetings.
>
>David
>
>On Wed, 2004-06-09 at 12:55, Chris Holmes wrote:
>
>
>>Hmmm... That probably won't work, as it's 5am on a Saturday in Sydney,
>>and since Rob and his team are the ones who are going to do most of the
>>implementation work (unless others have time, which is always tough with
>>geotools developers), we should do a time that they'll be able to make.
>>
>>Any other suggestions? We could perhaps do two (or more) the issue is
>>certainly big enough for it...
>>
>>On Wed, 9 Jun 2004, David Zwiers wrote:
>>
>>
>>
>>>I cannot make it then, may I suggest Friday,
>>>http://timeanddate.com/worldclock/fixedtime.html?day=11&month=6&year=2004&hour=15&min=0&sec=0&p1=179
>>>
>>>Thanks,
>>>David
>>>
>>>On Wed, 2004-06-09 at 12:24, Chris Holmes wrote:
>>>
>>>
>>>>Could we schedule an irc session for tomorrow, to discuss how to go about
>>>>implementing joins? I wanted to do it today, but was too busy to send out
>>>>a notice. Could everyone who is interested possibly make:
>>>>
>>>>http://timeanddate.com/worldclock/fixedtime.html?day=10&month=6&year=2004&hour=18&min=0&sec=0&p1=179
>>>>?
>>>>
>>>>I'll show up at that time, and hopefully we can at least get enough people
>>>>to get started. We may have to do more on email and in code, as we've got
>>>>a ton of time differences going on. But it'd be nice to get started in
>>>>real time. I saw the latest log, but still need to think about it some
>>>>more - I'll try to get some responses on email before tomorrow.
>>>>
>>>>Chris
>>>>
>>>>On Mon, 7 Jun 2004, Rob Atkinson wrote:
>>>>
>>>>
>>>>
>>>>>Hi folks...
>>>>>
>>>>>have had a chance to review the Operations API ideas, and would like to
>>>>>engage with a few of the issues and concepts.
>>>>>
>>>>>For a start, I have put initial notes on support for "Community Schemas"
>>>>>up at
>>>>>http://docs.codehaus.org/display/GEOTOOLS/Community%2BSchema%2BSupport%2Band%2BComplex%2BTypes
>>>>>
>>>>>For the Operations API concept the following points strike me:
>>>>>1) There is a Web Services Resource Framework - coming out of the GRID
>>>>>computing world - that would be the logical way to invoke such
>>>>>operations. The UK NERC data grid people
>>>>>(Andrew Woolf in particular) have done some initial thinking about
>>>>>bridging GRID and OGC services. I believe they would have an interest
>>>>>and considerable insight into the requirements of such a capability.
>>>>>
>>>>>2) Many of the operations suggested exploit the fundamental duality of
>>>>>FeatureCollections and Coverages.
>>>>>
>>>>>3) Real world complex (normalised) data stores are often designed to
>>>>>support only certain query patterns. WFS needs a means to advertise
>>>>>supported patterns. This is somewhat similar to stroed procedures. We
>>>>>are looking at a plug-in query strategy, but the operations API is a
>>>>>useful abstraction - query and processing are both in frame
>>>>>
>>>>>4) There are significantly useful cases where the "batch processing"
>>>>>model isnt required. In particular, returned gridded feature (biota
>>>>>sightings) counts is something we have been able to do quite
>>>>>cost-effectively on million-row postgres databases in real time.
>>>>>
>>>>>5) Such operations throw up lots of issues re the semantics of the
>>>>>"information products" on offer - so it seems logical to first crack the
>>>>>"community schema" support issue - aka well-known semantics - before
>>>>>such tricky things like trying to determine a useful meta-language for
>>>>>arbitrary operations. (consider letting the community schema approach
>>>>>provide for simple, externally declared, information products, and the
>>>>>operations become a pluggable configuration issue.)
>>>>>
>>>>>Rob Atkinson
>>>>>
>>>>>cholmes@anonymised.com wrote:
>>>>>
>>>>>
>>>>>
>>>>>>So Jody and I got to talking about the ever elusive joins this evening,
>>>>>>and fairly confused ourselves, so he wanted me to throw the relevant
>>>>>>parts of our conversation up on this list.
>>>>>>
>>>>>>>From my end it looks like GeoServer is going to drive some development
>>>>>>work on this. I'd like to introduce Rob Atkinson, from Social Change
>>>>>>Online. He's been on the GeoServer list for awhile, and he's got some
>>>>>>funding to do a few improvements. I think most of these are going to
>>>>>>need to take place in the geotools code base. He'll introduce exactly
>>>>>>what he wants to get done soon, with some good use cases, but basically
>>>>>>the brunt of the work is going to involve Joins.
>>>>>>
>>>>>>
>>>>>>jodygarnett: As for the joins there are several games a foot.
>>>>>>cholmesny: Really? What else?
>>>>>>jodygarnett: The origional AttributeReader based approach, there is
>>>>>>hints of an operational API that woudl also need a similar construct.
>>>>>>cholmesny: what do you mean a 'similar construct' ?
>>>>>>jodygarnett: Apparently this is something a lot of GIS systems do so
>>>>>>paul keeps trying to get me to look at it with respect to uDig.
>>>>>>jodygarnett: The idea that two FeatureSources are used by the same
>>>>>>"opperation" to generate a single result.
>>>>>>jodygarnett: The most basic example is an attribute based match.
>>>>>>cholmesny: Can I then use that as a DataStore?
>>>>>>cholmesny: And define FeatureTypes from it?
>>>>>>jodygarnett: Yes you would pretty much have to - or at least write the
>>>>>>answer to a "Temporary" FeatureStore.
>>>>>>Should we set up a break out session on either fid-exp or opeprations
>>>>>>api?
>>>>>>cholmesny: Actually only the last is required...but we need a way to
>>>>>>define the featureTypes...
>>>>>>cholmesny: We definitely need a break out irc for the operations if
>>>>>>that's how we want to handle Joins.
>>>>>>jodygarnett: The "easy thing" to do is extend Query (so that it can join
>>>>>>two other queries). When everything is backed by the same Postgis we
>>>>>>can produce raw SQL.
>>>>>>cholmesny: Which are nice, because they can drive devopment - we've
>>>>>>always fallen short before since it's all been pie in the sky dreams.
>>>>>>jodygarnett: It is when things are backed by multiple DataStores that my
>>>>>>mind hurts (I dont know where to send the request).
>>>>>>jodygarnett: The right place to send it would be a "Catalog" that knew
>>>>>>both FeatureStores.
>>>>>>cholmesny: Well, I think you'd have a MultiDataStore, that took two or
>>>>>>more datastores to be constructed.
>>>>>>jodygarnett: (I think that is called a Catalog)
>>>>>>cholmesny: And then would define views - virtual feature stores - of the
>>>>>>contained attributes.
>>>>>>cholmesny: You know, I'm not actually sure I agree with your Catalog
>>>>>>semantic.
>>>>>>cholmesny: Because to me a catalog has a lot more to do with meta
>>>>>>information than actually getting the Data itself.
>>>>>>jodygarnett: I do want to split DataStore into two, to match the
>>>>>>GridCoverageExchangeAPI. Most methods would stay the same,
>>>>>>getTypeNames() woudl become a convience method for a MetaData query.
>>>>>>cholmesny: Like a getContent call i nthe ogc catalog spec gets the meta
>>>>>>data, not the real data.
>>>>>>jodygarnett: Understood, I have read those specs now. We get have of
>>>>>>them implemented for GCE and then port that over to DataStore.
>>>>>>cholmesny: Ok.
>>>>>>jodygarnett: I still need that horrible "Catalog" construct I produced
>>>>>>before - I may rename it to DataRepository - to allow for Lock
>>>>>>Managmenet across DataStores.
>>>>>>jodygarnett: Actually the same problem we have for Joining.
>>>>>>cholmesny: But yeah, this stuff is going to take a lot of thought - it's
>>>>>>definitely the next level for this stuff.
>>>>>>cholmesny: Yeah, DataRepository makes more sense to me.
>>>>>>jodygarnett: Locks and Joins are opperations that cross DataStore
>>>>>>boundaries.
>>>>>>cholmesny: right.
>>>>>>jodygarnett: Wow that was a lot of thought in one going, I better save
>>>>>>that and attach it to a Jira wish list.
>>>>>>cholmesny: Yeah, let me think about the operations stuff. Because I do
>>>>>>need the results to be a FeatureStore for all intents and purposes.
>>>>>>cholmesny: Need to be able to return DescribeFeatureType, and to submit
>>>>>>queries against it.
>>>>>>jodygarnett: Me too, but I need to punt them to disk. ANd much like the
>>>>>>JAI constructs it is nice to define the abstract (say View) and allow
>>>>>>the user to choose where to place the explicit (say FeatureStores)
>>>>>>entires in the chain.
>>>>>>cholmesny: I guess at a basic level you could have the results be a
>>>>>>FeatureCollection, and then make a DataStore/FeatureStore out of that.
>>>>>>jodygarnett: FeatureCollection is bad - memory bound. The JAI approach
>>>>>>allows content to be "pulled" through the chain as needed.
>>>>>>jodygarnett: But that is about where my mind gets off.
>>>>>>jodygarnett: We should take this online. Can you cut and paste the
>>>>>>correct bits of this talk to the list?
>>>>>>
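
To make the "pulled through the chain" idea from the chat concrete, here is a minimal sketch in plain Java. It is not the GeoTools or JAI API: PullChain, filter and range are hypothetical names, and the integer source merely stands in for a large store. The point it tries to show is that a view can be defined abstractly and only does work when a consumer asks for the next element, so nothing is held in memory beyond the element in flight.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical names throughout; this is an illustration, not the GeoTools API.
final class PullChain {

    /** A "view": wraps a source iterator and filters it lazily, one element at a time. */
    static <T> Iterator<T> filter(Iterator<T> source, Predicate<T> keep) {
        return new Iterator<T>() {
            private T next;
            private boolean ready;

            @Override
            public boolean hasNext() {
                // Pull from the source only until one matching element is found.
                while (!ready && source.hasNext()) {
                    T candidate = source.next();
                    if (keep.test(candidate)) {
                        next = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                T result = next;
                next = null;
                return result;
            }
        };
    }

    /** A source that produces values on demand, standing in for a large store. */
    static Iterator<Integer> range(int count) {
        return new Iterator<Integer>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < count;
            }

            @Override
            public Integer next() {
                if (i >= count) {
                    throw new NoSuchElementException();
                }
                return i++;
            }
        };
    }
}
```

A caller that wants an explicit, materialised result at some point in the chain (the "punt them to disk" case) would drain the iterator into whatever store it chooses; the view itself never needs to know.
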
>>>>>>So the question is, can someone explain for real how an Operations API
>>>>>>would handle joins from different datastores? And how that fits in
>>>>>>with the other stuff we have for the Operations API:
>>>>>>http://geotools.codehaus.org/Operations+API ? And could someone
>>>>>>explain how we'd do joins with datastores and attribute readers (I may
>>>>>>attempt the latter). Rob, could you also forward on your use cases and
>>>>>>ideas to the geotools list? I think the use cases are key, as we've
>>>>>>talked about this and even attempted a few designs for it, but we've
>>>>>>never had anything that actually _uses_ them.
>>>>>>
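
As one possible reading of "joins cross DataStore boundaries", the sketch below puts the join on a repository-level object that knows about several stores, rather than on any single store. Everything here is an assumption made for illustration: CrossStoreJoin, Repository, register and join are hypothetical names, rows are plain maps rather than Features, and the stores are in-memory lists. It is a simple hash join (index one side by key, stream the other), not a design anyone on the list has proposed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are GeoTools classes.
final class CrossStoreJoin {

    /** Stand-in for a registry that knows about several stores by name. */
    static final class Repository {
        private final Map<String, List<Map<String, Object>>> stores = new HashMap<>();

        void register(String name, List<Map<String, Object>> rows) {
            stores.put(name, rows);
        }

        /** Inner equi-join of two registered stores on one attribute each. */
        List<Map<String, Object>> join(String left, String leftKey,
                                       String right, String rightKey) {
            // Build an index of the right-hand store, keyed by its join attribute.
            Map<Object, List<Map<String, Object>>> index = new HashMap<>();
            for (Map<String, Object> row : stores.get(right)) {
                index.computeIfAbsent(row.get(rightKey), k -> new ArrayList<>()).add(row);
            }

            // Stream the left-hand store and look up matches in the index.
            List<Map<String, Object>> out = new ArrayList<>();
            for (Map<String, Object> l : stores.get(left)) {
                List<Map<String, Object>> matches =
                        index.getOrDefault(l.get(leftKey),
                                Collections.<Map<String, Object>>emptyList());
                for (Map<String, Object> r : matches) {
                    Map<String, Object> merged = new HashMap<>(r);
                    merged.putAll(l); // left attributes win on name clashes
                    out.add(merged);
                }
            }
            return out;
        }
    }
}
```

The returned list is memory bound, which is exactly the objection raised against FeatureCollection in the chat; a fuller design would presumably return a lazily evaluated result that can still be described and queried.
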
>>>>>>Probably is best to reply to one of the jira issues that Jody just put
>>>>>>up, like http://jira.codehaus.org/browse/GEOT-175, so we can have this
>>>>>>thread online and archived on jira.
>>>>>>
>>>>>>Chris
>>>>>>
>>>
>>>-------------------------------------------------------
>>>This SF.Net email is sponsored by: GNOME Foundation
>>>Hackers Unite! GUADEC: The world's #1 Open Source Desktop Event.
>>>GNOME Users and Developers European Conference, 28-30th June in Norway
>>>http://2004/guadec.org
>>>_______________________________________________
>>>Geotools-devel mailing list
>>>Geotools-devel@lists.sourceforge.net
>>>https://lists.sourceforge.net/lists/listinfo/geotools-devel
>>>
>>>
>>>

_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

--