[Geoserver-devel] dependencies among complex feature types

Hi all,

Continuing from the thread about improving WFS performance, I need to get some input from the appschema experts.

Since the appschema extension supports relationships among feature types, there are now dependencies between feature types. My question is: is there any way to know what the dependencies are?

The reason I ask relates to trying to avoid building a schema for encoding purposes that contains all the types in the catalog. Consider doing a GetFeature request against type X where type X has attributes that are instances of features of type Y from another namespace.

Consider a GetFeature request against type X. Now if type X were simple there would be no link and I could simply build a schema that contains a single complex type and everything works great. But in the complex case I also need to build that schema so that it imports/includes/incorporates type Y as well.

So again the question is how do I infer this dependency given a FeatureTypeInfo object representing X? Is this information buried within the depths of the appschema datastore?

-Justin

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

That was a good post Justin. Enjoyed reading it to the end.

Without speaking for Ben and Rini, who now have a much deeper
understanding of the way schemas are used in app-schema, I will
comment on the underlying requirements.

Firstly, the application schema imports all its dependencies
explicitly. Implementations of a particular feature type do not
necessarily exercise all imports, however. Some imports are
required as base classes, and others for attribute types. So you can't
just look at a feature type configuration and determine the
dependencies - you really do need to look at the application schema.

The pattern for a Feature relationship allows "inlineOrByReference".
Feature chaining provides a means to decouple the actual schemas, but
we build the schema from the XSD on startup, so I'm not sure this makes
much practical difference. The point is, the configuration choice of
FeatureChaining determines what schemas are actually dependencies.

So, the final set of schema dependencies is a function of both the
application schema type hierarchies and the configuration choices.
App-schema loads the entire schema, then builds the feature type model
from the configuration. DescribeFeatureType should just deliver the
original schema used in the configuration.

I don't see why scanning this catalog should be a problem - it's static
(in the configuration vs. run-time sense - not Java static!), so it
should be possible to create an index at configuration time.

The amount of memory it uses for a large schema is an issue - but
manageable - we simply need to configure deployments appropriately.

In the case of an "ad-hoc" schema, an artefact of configuration, there
are no dependencies. The schema is defined at configuration time and I
would guess needs to be cached in its entirety. Each one is unique to
a configuration. App schemas are defined externally. So I don't
understand why you would ever build a schema that links feature types
together. App-schema does this using the pre-built schema, and it's not
supported by ad-hoc (pseudo-)simple features.

I suspect there are other issues at play here that I don't really
understand - like modifying the internal model of the WFS schema to
allow collections of user-defined feature types?

rob


Justin, first please accept my abject apologies for the two month delay in responding to your email. Thank you for your patience.

On 16/02/10 06:53, Justin Deoliveira wrote:

Continuing from the thread about improving WFS performance I need to get
some input from the appschema experts.
Since the appschema extension supports relationships among feature types
there are now dependencies between feature types. My question is: is
there any way to know what the dependencies are?

Not easily.

At the moment, all information about app-schema feature types is stored in AppSchemaDataAccessRegistry. This static repository was originally a place to allow AppSchemaDataAccess instances to find each other, so that feature chaining could be implemented. Feature chaining is our term for when one feature type is used as a property of another.

[The static AppSchemaDataAccessRegistry will be broken by the resource/publishing split. It needs to be managed by a container. We have an internal Jira issue: SISS-358: "Update DataAccessRegistry to adapt to Geotools Repository interface".]

In recent work (still in progress) to support polymorphism and multiple distinct definitions of "feature" types (e.g. gsml:CGI_TermValue, treated as a feature type for chaining purposes, might have multiple definitions because it might be used for different properties), Rini has massively reworked AppSchemaDataAccessRegistry in a way that allows more public access to the inner workings of app-schema.

You can now use the AppSchemaDataAccessRegistry static methods getMappingByName and getMappingByElement to get the underlying FeatureTypeMapping, get the list of AttributeMapping instances for that feature type, and check whether any of them is a NestedAttributeMapping (use instanceof or the isNestedAttribute() method). A NestedAttributeMapping is the AttributeMapping used for feature chaining.
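
The walk described above could look roughly like the following sketch. It does not compile against app-schema: the classes below are simplified, hypothetical stand-ins that only borrow the names AttributeMapping, NestedAttributeMapping, FeatureTypeMapping and isNestedAttribute() from the description, and the targetXPath field and the chainedProperties helper are assumptions made for illustration. In real code the FeatureTypeMapping would come from the registry lookup rather than being built by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NestedMappingSketch {

    /** Stand-in for an attribute mapping; NOT the real app-schema class. */
    static class AttributeMapping {
        final String targetXPath; // hypothetical field: the property being mapped

        AttributeMapping(String targetXPath) {
            this.targetXPath = targetXPath;
        }

        boolean isNestedAttribute() {
            return false;
        }
    }

    /** Stand-in for the mapping used for feature chaining. */
    static class NestedAttributeMapping extends AttributeMapping {
        NestedAttributeMapping(String targetXPath) {
            super(targetXPath);
        }

        @Override
        boolean isNestedAttribute() {
            return true;
        }
    }

    /** Stand-in for a feature type mapping: just a list of attribute mappings. */
    static class FeatureTypeMapping {
        final List<AttributeMapping> attributeMappings;

        FeatureTypeMapping(AttributeMapping... mappings) {
            this.attributeMappings = Arrays.asList(mappings);
        }
    }

    /** Returns the target paths of all properties mapped by feature chaining. */
    static List<String> chainedProperties(FeatureTypeMapping mapping) {
        List<String> result = new ArrayList<>();
        for (AttributeMapping am : mapping.attributeMappings) {
            if (am.isNestedAttribute()) {
                result.add(am.targetXPath);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        FeatureTypeMapping mappedFeature = new FeatureTypeMapping(
                new AttributeMapping("gml:name"),
                new NestedAttributeMapping("gsml:specification"));
        System.out.println(chainedProperties(mappedFeature));
    }
}
```

This only tells you which properties are chained, not which feature type each one resolves to.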

The problem we have is polymorphism: we do not know the nested feature type until runtime. You have to provide a Feature instance to find the type name of the nested type by forward evaluation of an Expression. For example, a gsml:MappedFeature/gsml:specification might be a gsml:GeologicUnit for one instance of gsml:MappedFeature and a gsml:GeologicFeature for another. At the moment this is buried in an Expression that you would have to walk. It is also private.
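
To make the polymorphism problem concrete, here is a toy sketch in which a plain java.util.function.Function stands in for the Expression and a Map stands in for a Feature. The SPEC_KIND column, the feature helper and the observedNestedTypes method are invented for illustration and are not part of app-schema. The point is only that the nested type name is a function of each feature instance, so the set of nested types can be observed by evaluation but not read off the mapping.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class PolymorphicTypeSketch {

    /** Builds a toy "feature": an id plus one hypothetical source column. */
    static Map<String, Object> feature(String id, String specKind) {
        Map<String, Object> f = new HashMap<>();
        f.put("ID", id);
        f.put("SPEC_KIND", specKind);
        return f;
    }

    /**
     * Evaluates the type-name function against every feature and collects
     * the distinct nested type names in the order they are first seen.
     */
    static Set<String> observedNestedTypes(
            Function<Map<String, Object>, String> nestedTypeName,
            List<Map<String, Object>> features) {
        Set<String> types = new LinkedHashSet<>();
        for (Map<String, Object> f : features) {
            types.add(nestedTypeName.apply(f));
        }
        return types;
    }

    public static void main(String[] args) {
        // Stand-in for the Expression that picks the nested type per instance.
        Function<Map<String, Object>, String> typeName = f ->
                "unit".equals(f.get("SPEC_KIND"))
                        ? "gsml:GeologicUnit"
                        : "gsml:GeologicFeature";

        List<Map<String, Object>> features = Arrays.asList(
                feature("mf.1", "unit"),
                feature("mf.2", "fault"));

        System.out.println(observedNestedTypes(typeName, features));
    }
}
```

A schema built for encoding would have to cover every type such an expression can return, not just the ones seen in a particular response.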

Rini knows more. Her efforts are the only reason it works at all!

The reason I ask relates to trying to prevent building a schema for
encoding purposes that contains all the types in the catalog. Consider
doing a getfeature request against type X where type X has attributes
that are instances of features of type Y from another namespace.

Yes. We do this.

Consider a getFeature request against type X. Now if type X was simple
there would be no link and I could simply build a schema that contains a
single complex type and everything works great. But in the complex case
I also need to build that schema so that it
imports/includes/incorporates type Y as well.

Agreed.

So again the question is how do I infer this dependency given a
FeatureTypeInfo object representing X? Is this information buried within
the depths of the appschema datastore?

Yes. There is at the moment no high-level model of the schemas and the relationships between them introduced by the mapping and feature chaining process. However, because all this information is available when the mapping file is being processed, it should be straightforward to collect it and store it for later retrieval, perhaps through the AppSchemaDataAccessRegistry.
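
One possible shape for such a store, as a sketch only: a small index that is filled in while mapping files are processed and queried later for everything a type depends on through chaining. The class DependencyIndexSketch and its methods recordChain and dependenciesOf are hypothetical names, not existing app-schema or GeoServer API, and the type names in main are purely illustrative. For a polymorphic property, every candidate nested type would have to be recorded.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class DependencyIndexSketch {

    // Direct edges: container type name -> type names chained into it.
    private final Map<String, Set<String>> direct = new HashMap<>();

    /** Would be called once per chained property while a mapping is processed. */
    void recordChain(String containerType, String nestedType) {
        direct.computeIfAbsent(containerType, k -> new LinkedHashSet<>())
              .add(nestedType);
    }

    /** All types reachable from the given type through chaining, excluding itself. */
    Set<String> dependenciesOf(String type) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(type);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            Set<String> nested =
                    direct.getOrDefault(current, Collections.<String>emptySet());
            for (String next : nested) {
                // The seen set also guards against cycles in the chaining graph.
                if (!next.equals(type) && seen.add(next)) {
                    todo.push(next);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        DependencyIndexSketch index = new DependencyIndexSketch();
        index.recordChain("gsml:MappedFeature", "gsml:GeologicUnit");
        index.recordChain("gsml:GeologicUnit", "gsml:CompositionPart");
        System.out.println(index.dependenciesOf("gsml:MappedFeature"));
    }
}
```

Given a FeatureTypeInfo for X, the encoder could then look up X's qualified name in an index like this and import only the schemas of the returned types.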

Justin, I think your big-picture view of managing schemas is an excellent idea. I would like to find time to think more about it.

Kind regards,

--
Ben Caradoc-Davies <Ben.Caradoc-Davies@anonymised.com>
Software Engineering Team Leader
CSIRO Earth Science and Resource Engineering
Australian Resources Research Centre