[Geoserver-devel] GSIP 69 - Catalog scalability enhancements - OGC Filters VS predicate

Hum… the thread is getting long and the mails deal with many topics, let me try to
split this into separate sub-threads.
This one is about filters, paging and sorting.

About sorting I believe we are all on the same page; my suggestion
about checking for fast sorting was just a random idea anyway,
I don’t see any strong need to see it implemented.

About the topic of the predicate API being simpler than the OGC one… yes,
it has fewer filters, which actually makes it less useful.
About it being harder to build filters, I don’t see it.
Filter building in both goes through a factory with short
named methods, and arguably OGC also allows CQL expressions to
be used if that is perceived to be simpler.

I don’t see many people implementing lots of catalog subsystem implementations,
and those people will likely deal with GeoServer in other ways so they will
have a passing familiarity with OGC filter concepts already.
Having to learn another API actually makes things more confusing, you
need to remember what each API does and how.

Encoding wise, we already have a lot of code that splits filters and encodes
them in SQL and other languages, which means we have examples of
how things are done. Filter splitters do not make any use of the feature
type, they only know about the filter types listed in a filter capabilities object
(and they can be subclassed to allow more targeted checks).
Filter encoders are something you often need to prime with a feature type,
while in this case you’ll have to prime them with the bean class.
I don’t see the difference nor the difficulty.
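
To make the splitting idea concrete, here is a minimal, self-contained sketch. It is not the GeoTools splitter: the Clause class, the operator names and the split method are hypothetical stand-ins invented for illustration. It only shows that the split can be driven by a set of supported operations, with no feature type involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for one operand of an AND filter. */
final class Clause {
    final String operator; // e.g. "equals", "like", "intersects"
    final String property;
    final Object value;

    Clause(String operator, String property, Object value) {
        this.operator = operator;
        this.property = property;
        this.value = value;
    }
}

/** Splits the operands of an AND using only a capabilities set. */
final class SplitSketch {
    final List<Clause> pre = new ArrayList<Clause>();  // the backend can encode these
    final List<Clause> post = new ArrayList<Clause>(); // evaluated in-process afterwards

    static SplitSketch split(List<Clause> andOperands, Set<String> capabilities) {
        SplitSketch result = new SplitSketch();
        for (Clause c : andOperands) {
            if (capabilities.contains(c.operator)) {
                result.pre.add(c);
            } else {
                result.post.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A backend that can encode equality and like, but nothing spatial.
        Set<String> backend = new HashSet<String>(Arrays.asList("equals", "like"));
        List<Clause> filter = Arrays.asList(
                new Clause("like", "name", "%roads%"),
                new Clause("intersects", "latLonBoundingBox", "some envelope"));
        SplitSketch s = split(filter, backend);
        System.out.println("encoded natively: " + s.pre.size()
                + ", evaluated in memory: " + s.post.size());
    }
}
```

A real splitter also has to deal with OR, NOT and nested expressions, which this sketch leaves out on purpose.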

Lack of spatial filters is problematic because of possible
CSW implementations, but also for security subsystems, which often
express spatial constraints on the data (and layers) you can actually
see. Not having an efficient way to run them looks like a serious
drawback to me.

But also think about the case in which you are not doing multitenancy,
but you do have tons of layers. I know of one installation in Italy at
a research center that had, one year ago, 160k layers registered,
with new ones showing up every day.
Think how useful it would be for a case like this to be able to pass down
a CQL filter on the GetCapabilities request to get a more focused capabilities document
for WMS/WFS/WCS usage.

About the GUI filter that is not implementable efficiently in Predicate,
I did not notice “contains” is a well known filter, so sorry about that one,
reviewing the whole work in just half a day means I could not actually read
everything line by line (and often I had difficulties understanding what the
code did, see also the other mail about catalog implementations, but
generally speaking the proposal was rich in terms of describing api and
architectural concepts and poor in terms of describing how things are done,
which is equally important for something that aims at being committed).

About the relationship between catalog and data stores, I don’t want to
impose it in any way, nor do I want abstraction layers to be broken.
I simply recognize OGC filter as a generic data access filtering API,
while it seems that you see it as something specific to GeoTools… but
it’s not, and it’s not meant to be: using property extractors you can actually
have it filter whatever you want: features, beans, hashmaps, spatial and
non spatial data.
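
As an illustration of the property extractor idea, the following is a minimal sketch using only the JDK. The class and method names are hypothetical, not the GeoTools extension point; it just shows that one lookup function can serve both maps and plain beans, which is all a filter needs in order to evaluate a property name against an arbitrary object.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical extractor: reads a named property from a Map or a JavaBean. */
final class PropertyExtractorSketch {

    static Object get(Object target, String name) {
        if (target instanceof Map) {
            return ((Map<?, ?>) target).get(name);
        }
        try {
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(target.getClass()).getPropertyDescriptors()) {
                if (pd.getName().equals(name) && pd.getReadMethod() != null) {
                    return pd.getReadMethod().invoke(target);
                }
            }
        } catch (IntrospectionException | ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read property " + name, e);
        }
        return null; // unknown property: treated as null in this sketch
    }

    /** Toy bean standing in for a catalog object. */
    public static class Layer {
        private final String name;

        public Layer(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<String, Object>();
        map.put("name", "roads");
        System.out.println(get(map, "name"));
        System.out.println(get(new Layer("rivers"), "name"));
    }
}
```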

The thing could then go two ways. An easy nice to have is to
be able to build a store on top of the catalog that would allow
to display WMS maps of where the layers are, and search over
the catalog via simple WFS (which can be a nice way to allow
someone that wants to search into the server without having to
build or use a CSW client). In both cases having spatial filters would
be really handy, but in general the richness of OGC filter would
allow for complex searches to be made fast.

The other direction is to be able to build a catalog the other way, that is,
build it on top of a data store. Now, I’m not sure it would be great, and
we may not want to use it, but it would likely make for a quick to
implement spatially searchable catalog.
(mind, this is not the strong argument, I’m actually just thinking out loud;
the strong argument is that the filter API is good, tested, well known,
rich and flexible, and Predicate is none of that; the rest of the arguments
are just icing on the cake).
Feature types could be created by flattening the objects and creating
the feature type by reflection. Of course this would break the moment
new attributes show up, but we can call updateSchema() to add those, right?
It would also make for a more “relational” setup on DBMS storage, which
many people would feel more comfortable with, and would make it easier
for other applications to directly edit the persisted catalog (something I’m
sure many people will want to do).
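
The flattening by reflection mentioned above could look roughly like the sketch below. It is an assumption-laden illustration, not GeoServer code: the toy Layer and Store beans and the dotted attribute naming are invented here, and the result is a plain name-to-type map rather than a real feature type.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

final class FlattenSketch {

    /** Returns attribute name -> type, descending into nested beans with a dotted prefix. */
    static Map<String, Class<?>> schemaOf(Class<?> beanClass) {
        Map<String, Class<?>> schema = new TreeMap<String, Class<?>>();
        collect("", beanClass, schema);
        return schema;
    }

    // Note: no cycle detection, so this only works for tree-shaped bean graphs.
    private static void collect(String prefix, Class<?> type, Map<String, Class<?>> out) {
        try {
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(type, Object.class).getPropertyDescriptors()) {
                Class<?> propertyType = pd.getPropertyType();
                if (pd.getReadMethod() == null || propertyType == null) {
                    continue; // write-only or indexed-only property
                }
                if (isSimple(propertyType)) {
                    out.put(prefix + pd.getName(), propertyType);
                } else {
                    collect(prefix + pd.getName() + ".", propertyType, out);
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    private static boolean isSimple(Class<?> t) {
        return t.isPrimitive() || t.isEnum() || t.getName().startsWith("java.");
    }

    /** Toy beans standing in for catalog objects. */
    public static class Store {
        public String getName() { return "store"; }
    }

    public static class Layer {
        public String getName() { return "layer"; }
        public boolean isEnabled() { return true; }
        public Store getStore() { return new Store(); }
    }

    public static void main(String[] args) {
        System.out.println(schemaOf(Layer.class).keySet());
    }
}
```

A new bean property simply shows up as a new key in the map, which is the point where a store built this way would need its schema updated.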

Cheers
Andrea

Thanks for splitting out the threads. I am starting to like this idea the
more I think about it and you make a lot of great arguments. More comments
inline.

On Sat, Apr 28, 2012 at 1:47 PM, Andrea Aime
<andrea.aime@anonymised.com>wrote:

Hum... the thread is getting long and mails deal with many topics, let me
try to split this into separate sub-threads.
This one is about filters, paging and sorting.

About sorting I believe we are all on the same page, my suggestions
about checking for fast sorting was just a random idea anyways,
don't see any strong need to see it implemented.

About the topic of the predicate API being simpler than the OGC one...
yes,
it has less filters, which actually makes it less useful.
About it being harder to build filters, I don't see it.
The filter building styles of both goes through a factory with short
named methods and arguably OGC allows for CQL expressions to
be used if that is perceived to be simpler.

Right, my point is to look at what it takes to build a filter with straight
factories. Take a relatively simple filter:

FilterFactory2 factory = (FilterFactory2)
        CommonFactoryFinder.getFilterFactory(null);

PropertyIsLike f1 = factory.like(factory.property("name"), "*roads*");
// the bounding box is a property of the object, not a literal
Intersects f2 = factory.intersects(factory.property("latLonBoundingBox"),
        factory.literal(new Envelope(-180, 180, -90, 90))); // JTS order: x1, x2, y1, y2
Filter f = factory.and(f1, f2);

I mean, is it doable? Yes. Is it convenient? IMO no. What I would long for
is something like:

Filter f = new FilterBuilder().like("name", "roads").and().intersects(
        "latLonBoundingBox", new Envelope(-180, 180, -90, 90));

Obviously CQL is a great shortcut... but if the filter is not totally
static (in that you need to plug in variable names, etc.) it involves a
lot of string concatenation with quotes, etc. Same deal as we have
for building SQL strings, which is one thing that personally drives me crazy
and that I never get right due to some missing quote or something.
And then take into account functions, which are the mechanism to do filtering
where the predicate is not modeled in the core language. The function
syntax requires things to be looked up by name, which requires a function
factory, etc... a lot of overhead. What I would like to be able to do is
just write a straight Java function or anonymous inner class and have the
rest be handled for me.
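
The fluent builder wished for above can be sketched in a few lines. Everything here is hypothetical: there is no such FilterBuilder in GeoTools as far as this thread shows, and the sketch produces a java.util.function.Predicate over plain maps instead of an OGC Filter, just to show the call chain shape.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical fluent builder; terms are combined strictly left to right. */
final class FilterBuilderSketch {
    private Predicate<Map<String, Object>> current; // filter built so far
    private boolean pendingOr;                      // operator joining the next term

    /** Matches when the property's string form contains the fragment. */
    FilterBuilderSketch like(final String property, final String fragment) {
        return add(m -> String.valueOf(m.get(property)).contains(fragment));
    }

    FilterBuilderSketch equalTo(final String property, final Object value) {
        return add(m -> value.equals(m.get(property)));
    }

    FilterBuilderSketch and() {
        pendingOr = false;
        return this;
    }

    FilterBuilderSketch or() {
        pendingOr = true;
        return this;
    }

    Predicate<Map<String, Object>> build() {
        return current;
    }

    private FilterBuilderSketch add(Predicate<Map<String, Object>> term) {
        if (current == null) {
            current = term;
        } else {
            current = pendingOr ? current.or(term) : current.and(term);
        }
        return this;
    }

    public static void main(String[] args) {
        Map<String, Object> layer = new HashMap<String, Object>();
        layer.put("name", "main_roads");
        layer.put("enabled", Boolean.TRUE);
        Predicate<Map<String, Object>> f = new FilterBuilderSketch()
                .like("name", "roads").and().equalTo("enabled", Boolean.TRUE).build();
        System.out.println(f.test(layer));
    }
}
```

The flat chain is the limitation: it cannot express something like a AND (b OR c) without a second builder or some grouping call, so nesting is where a design like this gets harder.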

Anyways, sorry this is getting more into usability of the filter api and
away from the original point.

I don't see many people implementing lots of catalog subsystem
implementations,
and those people will likely deal with GeoServer in other ways so they will
have a passing familiarity with OGC filter concepts already.
Having to learn another API actually makes things more confusing, you
need to remember what each API does and how.

Encoding wise, we already have a lot of code that allows to split filter
and encode
them in SQL and other languages, which means we have examples on
how things are done. Filter splitters do not make any use of the feature
type, they only know about the filter types listed in a filter
capabilities object
(and they can be subclassed to allow more targeted checks),
and filter encoders are something you need to prime often with a feature
type,
while in this case you'll have to prime them with the bean class.
I don't see the difference nor the difficulty.

Lack of spatial filter is problematic because of possible
CSW implementations, but also for security subsystems that often
express spatial constraints on the data (and layers) you can actually
see, not having an efficient way to make them run looks like a serious
drawback to me.

But also think about the case in which you are not doing multitenancy,
but you do have tons of layers. I know of one installation in Italy at
a research center that had, one year ago, 160k layers registered,
with new ones showing up every day.
Think how useful it would be for a case like this to be able to pass
down a CQL filter on the GetCapabilities to get a more focused capabilities
document for WMS/WFS/WCS usage.

Yeah, really cool stuff, kind of outside the original use case that started
the proposal but something I think should be brought in and something made
a lot easier if we do go with geotools filters for the predicate language.

About the GUI filter that is not implementable efficiently in Predicate,
I did not notice "contains" is a well known filter, so sorry about that
one,
reviewing the whole work in just half a day means I could not actually read
everything line by line (and often I had difficulties understanding what
the
code did, see also the other mail about catalog implementations, but
generally speaking the proposal was rich in terms of describing api and
architectural concepts and poor in terms of describing how things are done,
which is equally important for something that aims at being committed).

About the relationship between catalog and data stores I don't want in
any way impose it, nor I want to have abstraction layers be broken,
I simply recognize OGC filter as a generic data access filtering API
while it seems that you see it as something specific to GeoTools... but
it's not, it's not meant to be, using property extractors you can actually
have it filter whatever, features, beans, hashmaps, spatial stuff and
non spatial one.

Well specific enough that you need most of the geotools library to be able
to use it, but yes not a problem for geoserver. And I can't imagine the non
feature case is really battle tested at the point that there are literally
zero assumptions about feature objects. But those are implementation
details, to be dealt with later.

The thing could then go two ways. An easy nice to have is to
be able to build a store on top of the catalog that would allow
to display WMS maps of where the layers are, and search over
the catalog via simple WFS (which can be a nice way to allow
someone that wants to search into the server without having to
build or use a CSW client). In both cases having spatial filters would
be really handy, but in general the richness of OGC filter would
allow for complex searches to be made fast.

Very cool idea. Building a catalog without csw. Very clever.

The other direction is to be able to build a catalog the other way, that
is,
build it on top of a data store. Now, I'm not sure it would be great, and
we may not want to use it, but it would likely make for a quick to
implement spatially searchable catalog.
(mind, this is not the strong argument, I'm actually just thinking out
loud,
the strong argument is that the filter API is good, tested, well known,
rich and flexible, Predicate is none of that, the rest of the arguments
are just topping on the cake).
Feature types could be created by flattening the objects and creating
the feature type by reflection. Of course this would break the moment
new attributes show up, but we can call updateSchema() to add those, right?
It would also make for a more "relational" setup on DBMS storage, which
many people would feel more comfortable with, and would make it easier
for other applications to directly edit the persisted catalog (something
I'm
sure many people will want to do).

Also an interesting idea. No real comment other than I think it is cool.

Cheers
Andrea

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Yep, that would be better. I tried to start going down that path when I made the StyleBuilder, but it was not that easy because
of how the filters are nested.
In the style builder I could break out a tree of builders, so that you can build a complex style without going crazy
moving up and down the nesting with a single builder call chain (though the single builder call chain is
still available for simple styles). See the following:

        RuleBuilder rb = new RuleBuilder();
        rb.point().graphic().size(6).mark().name("circle").fill().color(Color.RED);
        TextSymbolizerBuilder tb = rb.text().label("name");
        tb.fill().color(Color.BLACK);
        tb.newFont().familyName("Arial").size(12).styleName(Font.Style.NORMAL)
                .weightName(Font.Weight.BOLD);
        tb.pointPlacement().displacement().x(0).y(5);
        tb.pointPlacement().anchor().x(0.5);
        Style style = rb.buildStyle();

However I was not sure how to do something like that with Filter, where the level of nesting is quite a bit
higher and less predictable (thinking about handling and/or, nested expressions and function calls).

Yep, agree that the current property extractors can be lacking. Something that either BeanUtils (if we want to stick
with the a.b.c[3].f notation) or some xpath support should handle for sure (again, for example, JXPath,
http://commons.apache.org/jxpath/users-guide.html)

Cheers
Andrea


Hi Andrea, all,

first off, sorry it took me so long to reply to this important thread,
and thanks again Andrea for splitting out the discussion into separate
topics.

On Sat, Apr 28, 2012 at 10:47 AM, Andrea Aime
<andrea.aime@anonymised.com> wrote:

Hum... the thread is getting long and mails deal with many topics, let me
try to split this into separate sub-threads.
This one is about filters, paging and sorting.

About sorting I believe we are all on the same page, my suggestions
about checking for fast sorting was just a random idea anyways,
don't see any strong need to see it implemented.

yep, already decided to get rid of Catalog.canSort() based on feedback.

You make some good points, let's go to the most important one first,
Predicate vs OGC Filter:

To start with, I am willing to concede on using OGC Filter instead
of the "domain specific" Predicate interface. I have a couple of usability
concerns though, more about this later.

I do know it is a general purpose filter predicate language, coming
from GeoTools, which we already depend on. When I mention it as a
dependency I mean at the architectural level, not at the classpath
level. I also know from experience that our Filter implementation
works on non-feature object models, given appropriate property
extractors. If we go that route, we'll probably need a Catalog domain
specific (i.e., *Info) property extractor.

I do understand OGC Filter is richer and more expressive than the
(purposely) limited number of "well known" Predicate idioms. More
about the rationale later.

Your argument about the ease of encoding Filter because we already have
code to do that applies, AFAIK, to a specific data model (flat
table), at least in practice, unless you want to use app-schema. It
is not a bad argument, it just falls on the implementation detail side
of the fence. I trust you value the decoupling of the
interface from the underlying data model, so nothing to add here.
About the availability of filter splitters, that's indeed a very nice
bonus point for the OGC Filter choice.

The argument about the lack of spatial filtering in Predicate is
debatable. The reason there are no well known spatial predicates in
the proposal is that the set of proposed well known constructs is
strictly limited to fit the current use of Catalog. That is, equals,
"iLike", and, or, exists, isNull. Nothing prevents adding spatial
constructs once the real need for them arises, but if you want them
executed by the storage engine, then that also means imposing a new
requirement on the catalog backend, to support spatial queries, which
currently is not a requirement. If, on the other hand, you want
spatial filters without requiring the backend to support spatial, then
it's just as easy to create a predicate as an anonymous inner class.
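
For illustration only, an in-process spatial predicate written as an anonymous inner class might look like the sketch below. The Bounds class is a hypothetical, simplified stand-in for a layer's bounding box, and java.util.function.Predicate is used in place of the proposal's own Predicate interface so that the example compiles on its own.

```java
import java.util.function.Predicate;

/** Hypothetical, simplified bounding box. */
final class Bounds {
    final double minX, maxX, minY, maxY;

    Bounds(double minX, double maxX, double minY, double maxY) {
        this.minX = minX;
        this.maxX = maxX;
        this.minY = minY;
        this.maxY = maxY;
    }

    /** True when the two boxes overlap or touch. */
    boolean intersects(Bounds o) {
        return minX <= o.maxX && o.minX <= maxX && minY <= o.maxY && o.minY <= maxY;
    }
}

final class SpatialPredicateSketch {

    /** In-process spatial predicate written as an anonymous inner class. */
    static Predicate<Bounds> intersecting(final Bounds query) {
        return new Predicate<Bounds>() {
            @Override
            public boolean test(Bounds candidate) {
                return candidate.intersects(query);
            }
        };
    }

    public static void main(String[] args) {
        Predicate<Bounds> filter = intersecting(new Bounds(6.6, 18.5, 36.6, 47.1));
        System.out.println(filter.test(new Bounds(10, 12, 40, 42)));   // overlaps the query box
        System.out.println(filter.test(new Bounds(-80, -70, 40, 42))); // far to the west of it
    }
}
```

Such a predicate has to be evaluated object by object in memory, since no backend can index it; that is the trade-off being discussed.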

To keep this as short as possible: a possible CSW
implementation on top of the GeoServer Catalog, the ability to specify a
CQL filter as a GetCapabilities parameter, and the ability to wrap the Catalog
as a DataStore and hence draw maps of where the layers are and expose
that information through WFS, are all really nice ideas. I want to
make clear I'm not against any of those, and I actually encourage that
kind of feedback whenever a new proposal is made, since that's how we
as a community ensure our product serves everyone's needs. On the
other hand, the best way we know so far to make things happen is to
stick to an iterative and incremental development process. The
focus of GSIP 69 is to solve today's GeoServer scalability problems.
Nonetheless, this kind of feedback is valuable in terms of planning
ahead for extensibility, so thanks for bringing up those points. That
said, I don't really see any of those new feature ideas as blockers
for having domain specific query constructs for the Catalog. I
think it's pointless to go too deep into each of them right now,
but I acknowledge that the existing infrastructure to work with OGC Filters
would be convenient: translating CQL to Predicate,
although not hard at all, does look like unnecessary
duplication.

So, down to the core of this discussion: what we're essentially
discussing is a design decision. I am sorry it seemed like the
one I made was taken lightly. It was not, which doesn't mean it was
the most accurate either. I seriously considered our already available
OGC Filter implementation as the first choice, as it seems to fit
naturally, or rather easily.
Please note though, the proposal is in "under discussion" status, so
that's exactly what should be happening. Hence I'm glad we're doing
so, and I'd like to see more of this kind of discussion, as I often find myself
looking at bad smelling pieces of code throughout our codebase.
Including, of course, my own code.

All that said, I think it's time to weigh both options so we can
make a decision knowing what the benefits and drawbacks are for each
one.

The following is _my_ current thinking on both approaches. Feel free
to complete/correct it with your own. I know everything is debatable,
I'm just trying to figure out a sensible set of pros and cons so that we
can make an informed decision.

Option #1, use OGC Filter as the Catalog query model
----------------------------------------------------
Rationale: convenience re existing infrastructure, familiarity, high
expressiveness, ability to create complex filters.

Benefits

- Code reuse: GeoTools' OGC Filter implementation is in wide use by
the Data Access APIs, meaning there are a lot of utilities to deal
with them already, like filter splitters.

- Familiarity: most developers that work with the GeoServer Catalog,
are probably already familiar with the GeoTools Filter API.

- Wide range of ready to use filter constructs: almost everything you
can translate to a SQL where clause is in there.

Caveats

- Difficult extensibility. The way to extend filter functionality is by
creating custom Functions. There are cases where the required filter
can't be expressed using the prescribed OGC Filter idioms.

     FilterFactory ff = CommonFactoryFinder.getFilterFactory(null);
     Filter enabledLayers = ff.equals(ff.property("enabled"),
             ff.literal(Boolean.TRUE));
     Filter brokenLayers = ff.and(enabledLayers, ???);

Here we want brokenLayers to be a filter that returns all layers that
are enabled by configuration, but broken or disabled by cascading,
using the derived enabled() property. This is not possible without
registering a function factory with a function specific for that
purpose. But doing so would make it available globally, whereas it's
domain specific. If, on the other hand, you simply get the layers
enabled by configuration (i.e. LayerInfo.isEnabled()), and then iterate
over them in client code and check the derived
enabled() property on every one of them, you lose the ability to execute the predicate
further back in the chain, unnecessarily making all the Catalog wrappers
create wrapper objects for them.

     Any predicate that is not natively encodable suffers from this issue.
Another example is the security filtering in SecureCatalogImpl. This
filtering is based on externally configured constraints that may or may
not be translatable to a "natively encodable" query predicate. Also, the
logic may change over time. The proposed approach builds a custom
predicate that is evaluated in-process, with the important
characteristic of being pushed back to be evaluated before results are
returned to the calling code. So even if it's not "natively encodable",
it still has the intended effect of being processed at the bottom of
the call chain, avoiding any catalog wrapper (including
SecureCatalogImpl) creating object wrappers for results that are then
to be discarded, hence lowering the memory consumption and GC
overhead.

- Deviation from simple property filtering: some filters are not so
straightforward. For example, querying for simple properties of
multivalued properties would require some sort of XPath syntax. How
well frameworks like JXPath fit into our object model is yet to be
assessed. We have static and dynamic properties (through MetadataMap).
Using a custom property extractor that changes the regular OGC Filter
property addressing, to fit our desire for simplicity, would lead to
confusion, whilst with our own query model we can make it
straightforward. A filter like "styles.id = 'someid'" in the proposed query
model is as simple as that, an OGC Filter would ma

- Pandora's box: There's so much functionality in OGC Filter that is
not proved against the Catalog object model, that ensuring proper
functioning of each and every possible filter construct may require a
significant effort, opening the door for random bug reports over
untested code paths.

- Implementation constraints: it is considered a desirable feature for
the API to impose as few constraints on implementations as
possible. The usage of Catalog, and hence its query needs so far, are
rather simple: find by id, find by name, and little more. Using OGC
Filter either forces the backend to be able to take care of a lot
of filter constructs that might never see real use, or rather
executing them mostly in-process, that the

Option #2, define a GeoServer Catalog's own query model
-------------------------------------------------------
Rationale: architectural consistency, cohesion, ease of use, extensibility.

Benefits

- Easy extensibility: no need to register extra factories, just use
simple anonymous inner classes to create ad-hoc predicates. Example:

   Predicate<LayerInfo> enabledLayers = propertyEquals("enabled", true);
   Predicate<LayerInfo> brokenLayers = and(enabledLayers,
           new Predicate<LayerInfo>() {
               @Override
               public boolean apply(LayerInfo layer) {
                   // note the use of the derived enabled() property
                   // instead of the POJO isEnabled() one; negated, since
                   // "broken" means enabled by configuration but not usable
                   return !layer.enabled();
               }
           });

- Ease of use: avoid casting everywhere. We're working with
CatalogInfo and derivatives, so let's make use of modern language
constructs.

- Scope and feature creep containment[1]: by limiting the number of
well known predicates to the indispensable minimum we keep control
of what can be done and how. We hence try to impose as few
implementation constraints as possible, and keep the focus on what the
Catalog is for. Nothing prevents new features from being built that use or
are based on the Catalog objects. But adding too many features and
increasing scope just in case can lead to unnecessary complexity and
hurt maintainability. An example of this is the recent move of
GeoWebCache's tile layer configuration out of the Catalog objects'
metadata map: initially it seemed convenient to (ab)use the LayerInfo
and LayerGroupInfo metadata maps to hold the related tile layer
configuration. As the complexity of the GWC integration configuration
grew, the approach started to show its drawbacks: catalog object
configuration flooding, lack of quick ways to get only the layers that
do have an associated tile layer, and more. Moving away from that
model and having the integrated GWC maintain its own set of
configuration objects, although still depending on the available
GeoServer catalog layers, eliminated the added complexity in the catalog
configuration and avoided having to extend or modify it just to
serve an orthogonal concern.

- Fewer implementation constraints: it was and still is a driving
principle to impose as few implementation constraints on catalog
backends as possible, given that the proposal targets catalog scalability
and there's more than one way of skinning a cat.

[1]
http://en.wikipedia.org/wiki/Scope_creep
http://en.wikipedia.org/wiki/Feature_creep

Caveats

- Yet another query predicate: although it's meant to be really
straightforward (there's not much to debate about the meaning of
equals, isNull, contains (a.k.a. iLike), and, and or), we need to
recognize it is just yet another query predicate "language".

- Limited set of well known predicate idioms: although the proponent
thinks of this as a feature and not a bug, an argument can be made the
other way around, depending on which characteristics you value the
most, and where you want to draw the application boundaries
architecture wise.

------------------------------

All this said, I reiterate this is a design decision I'm willing to
concede on and switch to OGC Filter. The only real blocker IMHO is the
inability to easily extend in-place: we would either have to use Catalog
specific function factories, flooding the general purpose filters with
catalog specific ones, or give up execution of
predicates on the backend and be forced to iterate (over a lot of
objects) in place, apply the in-process filtering in the client code,
and be exposed to unnecessary wrapping from catalog decorators.
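
The wrapping concern can be shown with a small, self-contained sketch. None of these classes exist in GeoServer; Layer, WrappedLayer and the two list methods are hypothetical stand-ins for a catalog facade and a decorator such as a secured catalog. The point is only that when the predicate travels down the chain, the decorator wraps just the objects that survive it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class PushDownSketch {

    /** Hypothetical raw catalog object. */
    static final class Layer {
        final String name;
        final boolean usable;

        Layer(String name, boolean usable) {
            this.name = name;
            this.usable = usable;
        }
    }

    /** Hypothetical decorator wrapper; the counter makes the cost visible. */
    static final class WrappedLayer {
        static int created = 0;
        final Layer delegate;

        WrappedLayer(Layer delegate) {
            this.delegate = delegate;
            created++;
        }
    }

    /** Bottom of the chain: the predicate is applied here, before any wrapping. */
    static List<Layer> list(List<Layer> store, Predicate<Layer> filter) {
        List<Layer> out = new ArrayList<Layer>();
        for (Layer l : store) {
            if (filter.test(l)) {
                out.add(l);
            }
        }
        return out;
    }

    /** Decorator: forwards the predicate down and wraps only what survives. */
    static List<WrappedLayer> listWrapped(List<Layer> store, Predicate<Layer> filter) {
        List<WrappedLayer> out = new ArrayList<WrappedLayer>();
        for (Layer l : list(store, filter)) {
            out.add(new WrappedLayer(l));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Layer> store = Arrays.asList(
                new Layer("roads", true), new Layer("rivers", false), new Layer("parcels", true));
        List<WrappedLayer> visible = listWrapped(store, l -> l.usable);
        System.out.println(visible.size() + " results, "
                + WrappedLayer.created + " wrappers created");
    }
}
```

Filtering in client code instead would mean wrapping every object in the store first and discarding most of the wrappers afterwards.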

Some other random thoughts in line.

About the topic of the predicate API being simpler than the OGC one... yes,
it has less filters, which actually makes it less useful.

My position is it makes it as useful as needed.

About it being harder to build filters, I don't see it.
The filter building styles of both goes through a factory with short
named methods and arguably OGC allows for CQL expressions to
be used if that is perceived to be simpler.

hmmm.. at a first glance, yes, it looks simpler. And indeed CQL is a
nice terse way of creating a Filter on test cases.
When it comes to actual application code, where the inputs are
dynamically obtained from user input, it's not so. You'd need to deal
with string concatenation and proper parameter escaping to make sure
the resulting CQL is well formed.
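
A minimal sketch of the escaping involved follows. It assumes the CQL dialect escapes a single quote inside a string literal by doubling it, as SQL does; check that against the parser actually in use. The class and method names are made up for the example.

```java
final class CqlQuoteSketch {

    /** Quotes a value as a string literal, doubling embedded single quotes (assumed rule). */
    static String literal(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Builds a "name LIKE '%...%'" expression from untrusted user input. */
    static String nameLike(String userInput) {
        // LIKE wildcards contained in the input itself are not escaped in this sketch.
        return "name LIKE " + literal("%" + userInput + "%");
    }

    public static void main(String[] args) {
        System.out.println(nameLike("roads"));
        System.out.println(nameLike("O'Hare"));
    }
}
```

With a factory or builder API the value is passed as an object and never goes through string quoting at all, which is the usability point being made here.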

I don't see many people implementing lots of catalog subsystem
implementations,
and those people will likely deal with GeoServer in other ways so they will
have a passing familiarity with OGC filter concepts already.
Having to learn another API actually makes things more confusing, you
need to remember what each API does and how.

I see your point, but think it has little applicability. We're not
changing the meaning of equals, isNull, exists, contains (we can rename
it iLike if that looks more familiar), and, and or.

Encoding wise, we already have a lot of code that allows to split filter and
encode
them in SQL and other languages, which means we have examples on
how things are done. Filter splitters do not make any use of the feature
type, they only know about the filter types listed in a filter capabilities
object
(and they can be subclassed to allow more targeted checks),
and filter encoders are something you need to prime often with a feature
type,
while in this case you'll have to prime them with the bean class.
I don't see the difference nor the difficulty.

Lack of spatial filter is problematic because of possible
CSW implementations, but also for security subsystems that often
express spatial constraints on the data (and layers) you can actually
see, not having an efficient way to make them run looks like a serious
drawback to me.

But also think about the case in which you are not doing multitenancy,
but you do have tons of layers. I know of one installation in Italy at
a research center that had, one year ago, 160k layers registered,
with new ones showing up every day.
Think how useful it would be for a case like this to be able to pass down
a CQL filter on the GetCapabilities to get a more focused capabilities
document for WMS/WFS/WCS usage.

About the GUI filter that is not implementable efficiently in Predicate,
I did not notice "contains" is a well known filter, so sorry about that one,
reviewing the whole work in just half a day means I could not actually read
everything line by line (and often I had difficulties understanding what the
code did, see also the other mail about catalog implementations, but
generally speaking the proposal was rich in terms of describing api and
architectural concepts and poor in terms of describing how things are done,
which is equally important for something that aims at being committed).

Good advice. I'll try to point out how things are implemented in the
proposal wherever it's appropriate. A point can be made that the
community module is not strictly part of the proposal, but an API
validation aim, as we're not proposing to replace the default catalog
with it. But in any case I see what you mean and it seems valuable feedback
to me.

Cheers,
Gabriel.

About the relationship between catalog and data stores I don't want in
any way impose it, nor I want to have abstraction layers be broken,
I simply recognize OGC filter as a generic data access filtering API
while it seems that you see it as something specific to GeoTools... but
it's not, it's not meant to be, using property extractors you can actually
have it filter whatever, features, beans, hashmaps, spatial stuff and
non spatial one.

The thing could then go two ways. An easy nice to have is to
be able to build a store on top of the catalog that would allow
to display WMS maps of where the layers are, and search over
the catalog via simple WFS (which can be a nice way to allow
someone that wants to search into the server without having to
build or use a CSW client). In both cases having spatial filters would
be really handy, but in general the richness of OGC filter would
allow for complex searches to be made fast.

The other direction is to be able to build a catalog the other way, that is,
build it on top of a data store. Now, I'm not sure it would be great, and
we may not want to use it, but it would likely make for a quick to
implement spatially searchable catalog.
(mind, this is not the strong argument, I'm actually just thinking out loud,
the strong argument is that the filter API is good, tested, well known,
rich and flexible, Predicate is none of that, the rest of the arguments
are just icing on the cake).
Feature types could be created by flattening the objects and creating
the feature type by reflection. Of course this would break the moment
new attributes show up, but we can call updateSchema() to add those, right?
It would also make for a more "relational" setup on DBMS storage, which
many people would feel more comfortable with, and would make it easier
for other applications to directly edit the persisted catalog (something I'm
sure many people will want to do).
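
As a rough illustration of the flattening idea, the sketch below derives a
flat attribute-name to type map from a bean class by reflection. Everything
in it is hypothetical (the LayerRecord/StoreRecord beans, the SchemaFlattener
helper, the underscore naming and the depth cap); turning the result into an
actual feature type is left out.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical beans standing in for catalog objects.
class StoreRecord {
    public String getName() { return "postgis"; }
}

class LayerRecord {
    public String getName() { return "roads"; }
    public int getMaxFeatures() { return 1000; }
    public StoreRecord getStore() { return new StoreRecord(); }
}

// Hypothetical helper: walks public getters and flattens nested beans.
final class SchemaFlattener {
    private static final int MAX_DEPTH = 3; // guards against cyclic bean graphs

    private SchemaFlattener() {}

    /** Returns attribute name -> Java type, sorted by attribute name. */
    static Map<String, Class<?>> flatten(Class<?> beanType) {
        Map<String, Class<?>> attributes = new TreeMap<>();
        collect(beanType, "", MAX_DEPTH, attributes);
        return attributes;
    }

    private static void collect(Class<?> type, String prefix, int depth,
                                Map<String, Class<?>> out) {
        for (Method method : type.getMethods()) {
            String name = method.getName();
            boolean getter = name.startsWith("get") && name.length() > 3
                    && method.getParameterCount() == 0
                    && !name.equals("getClass");
            if (!getter) {
                continue;
            }
            String attribute = prefix
                    + Character.toLowerCase(name.charAt(3)) + name.substring(4);
            Class<?> valueType = method.getReturnType();
            if (isSimple(valueType) || depth == 0) {
                out.put(attribute, valueType);
            } else {
                // Nested bean: flatten its getters under "<attribute>_".
                collect(valueType, attribute + "_", depth - 1, out);
            }
        }
    }

    private static boolean isSimple(Class<?> type) {
        return type.isPrimitive() || type == String.class
                || type == Boolean.class
                || Number.class.isAssignableFrom(type);
    }
}
```

With these beans, SchemaFlattener.flatten(LayerRecord.class) yields three
attributes: maxFeatures (int), name and store_name (both String).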

Cheers
Andrea

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

On Thu, May 3, 2012 at 6:12 PM, Gabriel Roldan <groldan@anonymised.com> wrote:

All this said, I reiterate that this is a design decision I’m willing to
concede on and switch to OGC Filter. The only real blocker IMHO is the
inability to easily extend in-place, but having to use Catalog
specific function factories, flooding the general purpose filters with
catalog specific ones, or rather having to give up execution of
predicates on the backend and being forced to iterate (over a lot of
objects) in place, apply the in-process filtering on the client code,
and be exposed to unnecessary wrapping from catalog decorators.

Read it all, I agree more or less with the pro/con statements but not with
the value judgements: to me the pros of using the OGC filter model far
outweigh the points in favor of the Predicate, and the opposite happens
for the limitations.
I guess we can agree to disagree, and we’ll need the rest of the PSC
to act as a tie breaker.

About the ability to extract properties out of the catalog objects, I believe we
can get 90% there using BeanUtils inside a property extractor that activates when
the object to be evaluated is a CatalogInfo, in fact the syntax you are using today to
specify the nested property names is the same as the BeanUtils one (which is, btw,
already in the classpath).
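
To make the nested property idea concrete, here is a minimal sketch of
resolving a dotted path such as "store.name" against plain beans. It uses
plain JDK reflection as a stand-in for BeanUtils, and the LayerBean/StoreBean
classes and the NestedProperties helper are hypothetical placeholders, not
the real catalog interfaces or an existing property extractor.

```java
import java.lang.reflect.Method;

// Hypothetical stand-ins for catalog objects.
class StoreBean {
    private final String name;
    StoreBean(String name) { this.name = name; }
    public String getName() { return name; }
}

class LayerBean {
    private final String name;
    private final StoreBean store;
    LayerBean(String name, StoreBean store) {
        this.name = name;
        this.store = store;
    }
    public String getName() { return name; }
    public StoreBean getStore() { return store; }
}

// Hypothetical helper that walks getters along a dotted property path.
final class NestedProperties {
    private NestedProperties() {}

    /**
     * Resolves a path like "store.name". Returns null when an intermediate
     * value is null; throws IllegalArgumentException for unknown properties.
     */
    static Object get(Object root, String path) {
        Object current = root;
        for (String segment : path.split("\\.")) {
            if (current == null) {
                return null;
            }
            try {
                current = findGetter(current.getClass(), segment).invoke(current);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException(
                        "Cannot resolve '" + segment + "' in path " + path, e);
            }
        }
        return current;
    }

    private static Method findGetter(Class<?> type, String property)
            throws NoSuchMethodException {
        String suffix = Character.toUpperCase(property.charAt(0))
                + property.substring(1);
        try {
            return type.getMethod("get" + suffix);
        } catch (NoSuchMethodException e) {
            return type.getMethod("is" + suffix); // boolean-style accessor
        }
    }
}
```

A property extractor built along these lines would only need to claim the
objects it knows how to handle and delegate the path lookup to a helper of
this kind.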

About the need to have to roll a lot of filter functions… some of the things we need
can be managed as properties using custom property extractors, as for the
others I’m wondering if we cannot just roll anonymous filter functions.
The reasons to have registered functions are:

  • being able to turn them into CQL/XML. Something we don’t need, we would
    not be able to do so with Predicate anyways
  • deep clone filters, again, something we won’t be able to do with Predicate anyways

Soo… what’s preventing us from having a PredicateFunction base class that lets
us generate a new filter function implementation inline as follows:

new PredicateFunction() {
    Object evaluate(Object feature) {
        // cast the evaluated object before calling the catalog accessor
        return ((ResourceInfo) feature).enabled();
    }
}

Instead of rolling a whole new API and code set we’d just have to roll some documentation
bits on how the catalog subsystem manages the OGC filters.
To me it’s so much better it’s a no brainer, although I understand you have a different set
of values and thus a different opinion.
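
For the sake of discussion, a minimal sketch of what such a base class could
look like follows. It is deliberately free of GeoTools dependencies:
PredicateFunction here is a hypothetical stand-alone class (a real one would
presumably implement the filter Function interface), and ResourceInfoLike is
a stand-in for ResourceInfo exposing a single assumed accessor.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the catalog's ResourceInfo; only one accessor is modelled.
interface ResourceInfoLike {
    boolean enabled();
}

// Hypothetical base class for inline, unregistered filter functions.
abstract class PredicateFunction {

    /** Evaluates the function against one catalog object. */
    public abstract Object evaluate(Object candidate);

    /** In-process fallback: keeps items for which evaluate() is TRUE. */
    public final <T> List<T> filter(Iterable<T> items) {
        List<T> matches = new ArrayList<>();
        for (T item : items) {
            if (Boolean.TRUE.equals(evaluate(item))) {
                matches.add(item);
            }
        }
        return matches;
    }
}

final class PredicateFunctionDemo {
    private PredicateFunctionDemo() {}

    /** Builds the "resource is enabled" function inline, as in the mail. */
    static PredicateFunction enabledOnly() {
        return new PredicateFunction() {
            @Override
            public Object evaluate(Object candidate) {
                // Objects of other types simply do not match.
                return candidate instanceof ResourceInfoLike
                        && ((ResourceInfoLike) candidate).enabled();
            }
        };
    }

    public static void main(String[] args) {
        List<ResourceInfoLike> resources = new ArrayList<>();
        resources.add(() -> true);
        resources.add(() -> false);
        System.out.println(enabledOnly().filter(resources).size());
    }
}
```

Such a function cannot be encoded to CQL/XML or pushed down to a backend, so
a catalog implementation would have to fall back to in-process evaluation
like the filter() helper above whenever it meets one.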

Cheers
Andrea


Ing. Andrea Aime
GeoSolutions S.A.S.
Tech lead

Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy

phone: +39 0584 962313
fax: +39 0584 962313
mob: +39 339 8844549

http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.youtube.com/user/GeoSolutionsIT
http://www.linkedin.com/in/andreaaime
http://twitter.com/geowolf

On Thu, May 3, 2012 at 2:02 PM, Andrea Aime
<andrea.aime@anonymised.com> wrote:

On Thu, May 3, 2012 at 6:12 PM, Gabriel Roldan <groldan@anonymised.com> wrote:

All this said, I reiterate that this is a design decision I'm willing to
concede on and switch to OGC Filter. The only real blocker IMHO is the
inability to easily extend in-place, but having to use Catalog
specific function factories, flooding the general purpose filters with
catalog specific ones, or rather having to give up execution of
predicates on the backend and being forced to iterate (over a lot of
objects) in place, apply the in-process filtering on the client code,
and be exposed to unnecessary wrapping from catalog decorators.

Read it all, I agree more or less with the pro/con statements but not with
the value judgements: to me the pros of using the OGC filter model far
outweigh the points in favor of the Predicate, and the opposite happens
for the limitations.
I guess we can agree to disagree, and we'll need the rest of the PSC
to act as a tie breaker.

About the ability to extract properties out of the catalog objects, I believe we
can get 90% there using BeanUtils inside a property extractor that activates when
the object to be evaluated is a CatalogInfo, in fact the syntax you are using today to
specify the nested property names is the same as the BeanUtils one (which is, btw,
already in the classpath).

About the need to have to roll a lot of filter functions... some of the things we need
can be managed as properties using custom property extractors, as for the
others I'm wondering if we cannot just roll anonymous filter functions.
The reasons to have registered functions are:
- being able to turn them into CQL/XML. Something we don't need, we would
  not be able to do so with Predicate anyways
- deep clone filters, again, something we won't be able to do with Predicate anyways

Soo... what's preventing us from having a PredicateFunction base class that lets
us generate a new filter function implementation inline as follows:

new PredicateFunction() {
    Object evaluate(Object feature) {
        // cast the evaluated object before calling the catalog accessor
        return ((ResourceInfo) feature).enabled();
    }
}

I can live with that. The "breakage" of the OGC Filter usage pattern
would be confined to the org.geoserver.catalog package so... as long
as everybody is happy with that, it's ok to me.
Glad we're getting to an agreement.

Cheers,
Gabriel

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

To be honest, if we already have a good mechanism to do filtering in place I would prefer to re-use the existing one, unless we envisage a big “stop” due to its capabilities, performance or whatever else.

Regards,
Alessio.


Ing. Alessio Fabiani
Founder / CTO GeoSolutions S.A.S.

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy

phone: (+39) 0584 96.23.13
fax: (+39) 0584 96.23.13
mobile:(+39) 331 62.33.686

http://www.geo-solutions.it
http://geo-solutions.blogspot.com
http://www.linkedin.com/in/alessiofabiani
https://twitter.com/alfa7961
http://twitter.com/geosolutions_it



Geoserver-devel mailing list
Geoserver-devel@anonymised.comsts.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel