[Geoserver-devel] PSC: moving control-flow module to extension

Hi,
I would like to propose we move the control-flow community module to
extensions in both 2.0.x and trunk.

Rationale:
- I have been using it for 4 months on demo.opengeo.org/geoserver
   with good success; the box has been quite stable so far, with no
   performance issues
- it is documented: http://docs.geoserver.org/stable/en/user/community/controlflow/index.html
- it has a maintainer (me)

PSC, what do you think? Btw, shall I write a formal proposal, or is a
mail enough? ;-)

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

As I recall you made a proposal earlier, and we decided to let the functionality hang out for a while so we could get feedback (and confidence) from people using it in the field. Sounds like that has occurred?
+1

Jody

On 03/06/2010, at 11:23 PM, Andrea Aime wrote:

Hi,
I would like to propose we move the control-flow community module to
extensions in both 2.0.x and trunk.

Rationale:
- I have been using it for 4 months on demo.opengeo.org/geoserver
  with good success; the box has been quite stable so far, with no
  performance issues
- it is documented:
http://docs.geoserver.org/stable/en/user/community/controlflow/index.html
- it has a maintainer (me)

PSC, what do you think? Btw, shall I write a formal proposal, or is a
mail enough? ;-)

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

sounds good to me +1

On Fri, Jun 4, 2010 at 7:28 AM, Jody Garnett <jody.garnett@anonymised.com> wrote:

As I recall you made a proposal earlier, and we decided to let the functionality hang out for a while so we could get feedback (and confidence) from people using it in the field. Sounds like that has occurred?
+1

Jody

Quick question: is this something we want merged into the application (rather than kept as an extension), just in terms of safety?

I also wonder how many people try out plugins; I wish we could offer some kind of facility to help administrators download and "install" plugins. I imagine it is difficult with so many different application containers each doing their own thing.

Jody

+1 for me.

This could be quite useful for app-schema deployments, because app-schema is a database connection hog (and uses multiple connections to service a single request [naughty, naughty]), so control flow could be used to limit the number of requests to avoid connection pool deadlock in production deployments. Niiice.
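
To put a rough number on that limit, here is a minimal sketch. It assumes each request takes its connections one at a time and holds them until it finishes. Under that assumption a pool of P connections cannot deadlock while n * (k - 1) < P, where n is the number of concurrent requests and k is the most connections any single request needs. The function name and parameters are hypothetical and are not part of the control-flow module.

```python
from typing import Optional


def max_safe_requests(pool_size: int, conns_per_request: int) -> Optional[int]:
    """Largest number of concurrent requests that cannot deadlock the pool.

    Assumption (not taken from GeoServer): every request acquires up to
    `conns_per_request` connections incrementally and releases them only
    when it completes. Returns None when no limit is needed.
    """
    if pool_size < 1 or conns_per_request < 1:
        raise ValueError("pool_size and conns_per_request must be positive")
    if conns_per_request == 1:
        # A request never waits while already holding a connection.
        return None
    # Largest n with n * (k - 1) < P.
    return (pool_size - 1) // (conns_per_request - 1)


limit_two = max_safe_requests(10, 2)    # 10 connections, 2 per request
limit_three = max_safe_requests(10, 3)  # 10 connections, 3 per request
too_small = max_safe_requests(2, 3)     # pool smaller than one request's need
```

A result of 0 means the pool is too small for even a single request, so no request limit can help and the pool itself has to grow.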

--
Ben Caradoc-Davies <Ben.Caradoc-Davies@anonymised.com>
Software Engineering Team Leader
CSIRO Earth Science and Resource Engineering
Australian Resources Research Centre

Ben Caradoc-Davies wrote:

+1 for me.

This could be quite useful for app-schema deployments, because app-schema is a database connection hog (and uses multiple connections to service a single request [naughty, naughty]), so control flow could be used to limit the number of requests to avoid connection pool deadlock in production deployments. Niiice.

Connection pool deadlocks? Those are normally a problem with application
design... let me think about it... you are merging multiple data streams
coming from the same database, and each is using a separate connection,
and that can definitely cause deadlocks (I experienced those last
year during the WMS shootout because the Oracle data store needed, on
occasion, a second connection, and boom, server deadlock at higher loads).

Wondering if you can open a Transaction, set it into the feature
sources, so that all of them operating against the same db will
share a single connection. That would improve the scalability
significantly....
Hmm... that's something you can pull off only if you're playing
against stores. Pity, that is a significant shortcoming of the
current datastore API imho: FeatureSource should have a
setTransaction method too (for other good reasons besides connection
sharing).
For the time being you can still try to set the same transaction
on all the feature sources that happen to be stores as well ;-)
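
A toy model of the idea, for illustration only. The Pool, Transaction and Store classes below are hypothetical stand-ins written for this sketch; they are not the GeoTools API. The sketch shows that stores sharing one transaction draw a single connection from the pool, while independent readers each take their own.

```python
class Pool:
    """Toy connection pool that only counts checked-out connections."""

    def __init__(self, size):
        self.size = size
        self.in_use = 0
        self.peak = 0

    def acquire(self):
        if self.in_use >= self.size:
            # A real pool would block here, which is where deadlock starts.
            raise RuntimeError("pool exhausted")
        self.in_use += 1
        self.peak = max(self.peak, self.in_use)
        return object()  # stand-in for a real connection

    def release(self):
        self.in_use -= 1


class Transaction:
    """Holds at most one connection per pool, opened lazily."""

    def __init__(self):
        self._held = {}

    def connection(self, pool):
        if pool not in self._held:
            self._held[pool] = pool.acquire()
        return self._held[pool]

    def close(self):
        for pool in self._held:
            pool.release()
        self._held.clear()


class Store:
    """Stand-in for a feature store backed by a pooled database."""

    def __init__(self, pool):
        self.pool = pool
        self.transaction = None  # None models auto-commit

    def set_transaction(self, tx):
        self.transaction = tx

    def read(self):
        if self.transaction is not None:
            return self.transaction.connection(self.pool)
        return self.pool.acquire()  # every open reader takes its own


pool = Pool(size=2)

# Three stores on the same database, all bound to one transaction.
tx = Transaction()
stores = [Store(pool) for _ in range(3)]
for s in stores:
    s.set_transaction(tx)
conns = [s.read() for s in stores]
shared_peak = pool.peak
tx.close()
released = pool.in_use

# Without a shared transaction, three open readers need three connections.
try:
    for s in [Store(pool) for _ in range(3)]:
        s.read()
    exhausted = False
except RuntimeError:
    exhausted = True
```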

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

On 04/06/10 14:15, Andrea Aime wrote:

For the time being you can still try to set the same transaction
on all the feature sources that happen to be stores as well ;-)

Thanks, Andrea, that is a very interesting suggestion.

--
Ben Caradoc-Davies <Ben.Caradoc-Davies@anonymised.com>
Software Engineering Team Leader
CSIRO Earth Science and Resource Engineering
Australian Resources Research Centre

On 04/06/2010, at 4:15 PM, Andrea Aime wrote:

Hmm... that's something you can pull off only if you're playing
against stores. Pity, that is a significant shortcoming of the
current datastore API imho: FeatureSource should have a
setTransaction method too (for other good reasons besides connection
sharing).

I actually tried to request that - since I have the same issue from uDig. It was during
the feature model switch so we were all busy. It *is* however a good idea.

For the time being you can still try to set the same transaction
on all the feature sources that happen to be stores as well ;-)

Agreed, this is the workaround I currently employ.
Jody

Jody Garnett wrote:

I actually tried to request that - since I have the same issue from uDig. It was during the feature model switch so we were all busy. It *is* however a good idea.

This is a golden time to make it happen. Go! :-)
(not sure what the side effects on implementations will be though... hum...)

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Jody Garnett wrote:

Quick question: is this something we want merged into the
application (rather than kept as an extension), just in terms of safety?

We want more people to use it before turning it into a core
module imho

I also wonder how many people try out plugins; I wish we could offer
some kind of facility to help administrators download and "install"
plugins. I imagine it is difficult with so many different
application containers each doing their own thing.

I guess so. Hudson seems to be able to handle that: it downloads the
plugin and then asks you to restart the container manually (or just the
webapp? I don't remember).
If someone is interested in researching how it does that, I guess the
results would be instructive for everyone.

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

On 04/06/10 14:15, Andrea Aime wrote:

Wondering if you can open a Transaction, set it into the feature
sources, so that all of them operating against the same db will
share a single connection. That would improve the scalability
significantly....

A more difficult problem is orchestrating multiple filter queries so that a single complex feature WFS response can be built from multiple simple FeatureSources, where the FeatureSources interact to reassemble the filter queries into an efficient join.

The implementation of app-schema builds on the existing simple FeatureSources, one FeatureSource per table/view. This makes navigating relationships very inefficient, because GeoServer has to take the simple features and find their relationship so they can be reassembled into a complex feature, with one as a nested property of the other. This typically involves making a new filter query for each complex property of each enclosing feature (yes, you read that right, feature, not feature type). The volume of the generated SQL can exceed the volume of the encoded response. This query pattern would be more naturally and efficiently represented in SQL by a JOIN on the fields defining the relationship.
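
The difference between the two query patterns can be sketched with an in-memory SQLite database. This is only an illustration: the `parent` and `child` tables, their columns and both function names are made up for the example and have nothing to do with app-schema's actual mapping or generated SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, value TEXT);
    INSERT INTO parent VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO child VALUES (10, 1, 'x'), (11, 1, 'y'), (12, 3, 'z');
""")


def nested_per_feature(conn):
    """One query for the parents, then one more query per parent feature."""
    queries = 1
    parents = conn.execute("SELECT id, name FROM parent ORDER BY id").fetchall()
    result = {}
    for pid, name in parents:
        rows = conn.execute(
            "SELECT value FROM child WHERE parent_id = ? ORDER BY id", (pid,)
        ).fetchall()
        queries += 1
        result[name] = [value for (value,) in rows]
    return result, queries


def nested_with_join(conn):
    """The same nesting built from a single outer join."""
    sql = """
        SELECT p.name, c.value
        FROM parent p LEFT JOIN child c ON c.parent_id = p.id
        ORDER BY p.id, c.id
    """
    result = {}
    for name, value in conn.execute(sql):
        values = result.setdefault(name, [])
        if value is not None:  # NULL marks a parent with no children
            values.append(value)
    return result, 1


per_feature, per_feature_queries = nested_per_feature(conn)
joined, joined_queries = nested_with_join(conn)
```

Both functions return the same nested structure, but the first one issues a query count that grows with the number of parent features, while the second always issues one.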

So, is it possible to take a complex information model, flatten it into a collection of simple feature types, and reassemble it just in time to make an efficient SQL query appropriate to the complex information model? (And pass the result back through the simple FeatureSources). My gut feeling is no. This is in my view one of the fundamental architectural limitations of GeoServer app-schema. Even thinking about it leads me into wondering whether we end up trying to solve the general object-relational mapping problem.

Andrea, have you got an experimental git local repo that does this? ;-)

--
Ben Caradoc-Davies <Ben.Caradoc-Davies@anonymised.com>
Software Engineering Team Leader
CSIRO Earth Science and Resource Engineering
Australian Resources Research Centre

Ben Caradoc-Davies wrote:

A more difficult problem is orchestrating multiple filter queries so that a single complex feature WFS response can be built from multiple simple FeatureSources, where the FeatureSources interact to reassemble the filter queries into an efficient join.

The implementation of app-schema builds on the existing simple FeatureSources, one FeatureSource per table/view. This makes navigating relationships very inefficient, because GeoServer has to take the simple features and find their relationship so they can be reassembled into a complex feature, with one as a nested property of the other. This typically involves making a new filter query for each complex property of each enclosing feature (yes, you read that right, feature, not feature type). The volume of the generated SQL can exceed the volume of the encoded response. This query pattern would be more naturally and efficiently represented in SQL by a JOIN on the fields defining the relationship.

The obviously better way is to implement native joins in the data stores,
which will help you in the common case of all data sources being
stored in the same DB.

However, it's possible to make a join with only two queries (yes, only two) under certain conditions:
- no external sorting is required
- both sides support sorting
- it's a 1-n join

In that case you query both sources sorted on the joining key (assuming
the usual fk-driven equijoin) and scan over them, pairing the keys
and moving the "1" collection forward only when the joining key of the
"n" collection changes.
Well... not the best explanation, but I hope I got the message through.
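
The scan described above can be sketched as a merge over two pre-sorted sequences. This is a minimal illustration, assuming both inputs are plain in-memory sequences already sorted on the join key; the function and parameter names are made up for the example.

```python
def merge_join(ones, manys, one_key, many_key):
    """Pair each row of the "n" side with its row on the "1" side.

    Both iterables must already be sorted on the join key. Each input is
    read exactly once, front to back, so two sorted queries are enough.
    """
    ones = iter(ones)
    current = next(ones, None)
    for many in manys:
        key = many_key(many)
        # Advance the "1" side only when the "n" side key has moved past it.
        while current is not None and one_key(current) < key:
            current = next(ones, None)
        if current is not None and one_key(current) == key:
            yield current, many


# Result of "query 1": the "1" side, sorted by its primary key.
features = [(1, "a"), (2, "b"), (3, "c")]
# Result of "query 2": the "n" side, sorted by its foreign key.
properties = [(1, "x"), (1, "y"), (3, "z")]

pairs = list(
    merge_join(features, properties, lambda f: f[0], lambda p: p[0])
)
```

As written this behaves like an inner join: rows on the "1" side with no match (key 2 here) are simply skipped.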

So, is it possible to take a complex information model, flatten it into a collection of simple feature types, and reassemble it just in time to make an efficient SQL query appropriate to the complex information model? (And pass the result back through the simple FeatureSources). My gut feeling is no.

Query flattening is one of the Hibernate strategies, actually.
You have N tables to query: it makes one massive outer-join-style query
with proper sorting and then scans that single result to build
an in-memory object tree. Same reasoning as above, but it requires the
data stores to carry out the joining (outer join style).
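
The scanning half of that strategy can be sketched as follows. The rows are hand-written stand-ins for what a single sorted outer join might return, and the column layout and function name are assumptions made for the example; this is not Hibernate's implementation.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical flattened result, sorted by parent id and then child id.
# Columns: parent id, parent name, child id, child name.
rows = [
    (1, "road", 10, "segment-1"),
    (1, "road", 11, "segment-2"),
    (2, "river", None, None),  # outer join: a parent with no children
    (3, "rail", 12, "segment-3"),
]


def build_tree(rows):
    """Turn the flat, sorted result into a list of nested objects."""
    tree = []
    # Sorting guarantees that all rows of one parent are adjacent.
    for (pid, pname), group in groupby(rows, key=itemgetter(0, 1)):
        children = [
            {"id": cid, "name": cname}
            for _, _, cid, cname in group
            if cid is not None
        ]
        tree.append({"id": pid, "name": pname, "children": children})
    return tree


tree = build_tree(rows)
```

The sort order is what makes a single forward scan sufficient: without it the rows of one parent could be scattered and the whole result would have to be buffered.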

This is in my view one of the fundamental architectural limitations of GeoServer app-schema. Even thinking about it leads me into wondering whether we end up trying to solve the general object-relational mapping problem.

Andrea, have you got an experimental git local repo that does this? ;-)

Nope.

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.