[Geoserver-devel] Catalog, namespaces, workspaces, layer names...

Hi,
I'm getting tangled in a seemingly simple issue.
I want to know if a FeatureTypeInfo with a certain
name is available in the catalog already to avoid
duplicates.

Now, logic tells me there should not be two feature
types with the same name in the workspace. Right?

However, the catalog uses a namespace qualified name.
Right, a WFS server should never publish two layers
with the same namespace and same local name.

But isn't the namespace a publishing property, and
so something that should be set at the layer
level? So how do I search the feature types?
Looking them up by namespace seems wrong then?

Boys, this incomplete data/publishing split is giving
me a headache ;)

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Hi Andrea,

Andrea Aime wrote:

Hi,
I'm getting tangled in a seemingly simple issue.
I want to know if a FeatureTypeInfo with a certain
name is available in the catalog already to avoid
duplicates.

Now, logic tells me there should not be two feature
types with the same name in the workspace. Right?

I think so. Or do we want to qualify feature types by the name of their store? Given that we qualify by namespace today it might make more sense to make them workspace unique. Sorry... thinking out loud here.

However, the catalog uses a namespace qualified name.
Right, a WFS server should never publish two layers
with the same namespace and same local name.

But isn't the namespace a publishing property, and
so something that should be set at the layer
level? So how do I search the feature types?
Looking them up by namespace seems wrong then?

It is, or rather it *should* be. I guess for now we just continue with the way things are, using namespace, and since a namespace is 1-1 with a workspace, we can continue to ride that assumption. When we have resource-pub, we deprecate the getResourceByName methods which take a namespace, and change them to take a workspace.
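
As a rough illustration of that interim approach (this is not GeoServer's catalog code), the duplicate check can be sketched as a lookup keyed on namespace URI plus local name, with the workspace resolved to its namespace through the 1-1 mapping. `MiniCatalog`, its methods and the example URI are all hypothetical names made up for this sketch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the catalog; not the GeoServer Catalog API.
class MiniCatalog {
    // workspace name -> namespace URI (the 1-1 assumption from the thread)
    private final Map<String, String> namespaceByWorkspace = new HashMap<>();
    // registered feature types, keyed as "namespaceURI#localName"
    private final Set<String> featureTypes = new HashSet<>();

    void addWorkspace(String workspace, String namespaceUri) {
        namespaceByWorkspace.put(workspace, namespaceUri);
    }

    private static String key(String namespaceUri, String localName) {
        return namespaceUri + "#" + localName;
    }

    // Lookup by namespace-qualified name, the way lookups work today.
    boolean hasFeatureType(String namespaceUri, String localName) {
        return featureTypes.contains(key(namespaceUri, localName));
    }

    // Workspace-based lookup, expressed through the 1-1 mapping.
    boolean hasFeatureTypeInWorkspace(String workspace, String localName) {
        String ns = namespaceByWorkspace.get(workspace);
        return ns != null && hasFeatureType(ns, localName);
    }

    // Registers the type; returns false instead of adding a duplicate.
    boolean addFeatureType(String workspace, String localName) {
        String ns = namespaceByWorkspace.get(workspace);
        if (ns == null) {
            throw new IllegalArgumentException("Unknown workspace: " + workspace);
        }
        return featureTypes.add(key(ns, localName));
    }

    public static void main(String[] args) {
        MiniCatalog catalog = new MiniCatalog();
        catalog.addWorkspace("ws", "http://example.org/ws"); // example values only
        System.out.println(catalog.addFeatureType("ws", "roads")); // true
        System.out.println(catalog.addFeatureType("ws", "roads")); // false, duplicate
    }
}
```

Once lookups take a workspace directly, only `hasFeatureTypeInWorkspace` would need to survive in a model like this.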

Boys, this incomplete data/publishing split is giving
me a headache ;)

You're not the only one ;). I have been working toward implementing the proposal that was done up a while back... work is slow going though.

-Justin

Cheers
Andrea

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Thinking out loud...

What if we have the same feature type being delivered by multiple
stores - for example a common gazetteer view of the different feature
types. Or sometimes we have several sub-types, and we want to provide
a single view using a supertype.

This situation does occur in practice from my experience with defined
schemas being used.

We'd need a dispatcher to route queries to the right data store, but
it would be a shame for the resource/publishing split to bake in the
assumption that one FeatureType = 1 datastore. Namespaces only exist
in the external contract. It's only the fact that they may be "known"
to the client that gives them meaning - otherwise you might as well
generate random unique names for everything.
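
To make the dispatcher idea concrete, here is a minimal sketch, assuming a published type name can be backed by any number of sources and that a query simply fans out to all of them. `FeatureSource`, `FeatureTypeDispatcher` and the `gaz:Place` name are hypothetical; nothing like this is claimed to exist in GeoServer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration only.
interface FeatureSource {
    List<String> features(); // feature ids, standing in for real features
}

class FeatureTypeDispatcher {
    // published type name -> every source able to deliver that type
    private final Map<String, List<FeatureSource>> sources = new HashMap<>();

    void register(String typeName, FeatureSource source) {
        sources.computeIfAbsent(typeName, k -> new ArrayList<>()).add(source);
    }

    // Fans the query out to all sources behind the type and merges the results
    // in registration order; an unknown type yields an empty list.
    List<String> query(String typeName) {
        List<FeatureSource> backing =
                sources.getOrDefault(typeName, Collections.<FeatureSource>emptyList());
        List<String> result = new ArrayList<>();
        for (FeatureSource source : backing) {
            result.addAll(source.features());
        }
        return result;
    }

    public static void main(String[] args) {
        FeatureTypeDispatcher dispatcher = new FeatureTypeDispatcher();
        dispatcher.register("gaz:Place", () -> Arrays.asList("town.1", "town.2"));
        dispatcher.register("gaz:Place", () -> Arrays.asList("river.7"));
        System.out.println(dispatcher.query("gaz:Place")); // [town.1, town.2, river.7]
    }
}
```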

Rob

On Wed, Jun 17, 2009 at 12:38 AM, Justin Deoliveira<jdeolive@anonymised.com> wrote:

------------------------------------------------------------------------------
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables unlimited
royalty-free distribution of the report engine for externally facing
server and web deployment.
http://p.sf.net/sfu/businessobjects
_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

Rob Atkinson wrote:

Thinking out aloud...

what if we have the same feature type being delivered by multiple
stores - for example a common gazetteer view of the different feature
types. Or sometimes we have several sub-types, and we want to provide
a single view using a supertype.

I may be wrong, but I seem to remember namespaces are there to ensure WFS instance-level uniqueness of names. Also, if I am right, the resource/publishing split is gonna allow for multiple virtual instances served by a single geoserver instance (aka map, instance, geoserver... not really named yet), hence giving a different context/WFS entry point where you can have the same type name being served?
(Note the high level of uncertainty in the answer is honest, no irony :)
So http://here/wfs1?...&typeName=ABC and http://here/wfs2?...&typeName=ABC is how you could handle that. Hope that makes sense.
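
A small sketch of that idea, assuming each entry point keeps its own table of type names so that a name only has to be unique within one entry point. `EntryPoints` and the resource ids are hypothetical and only mirror the wfs1/wfs2 URLs above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: each WFS entry point owns its own table of type names.
class EntryPoints {
    // entry point (e.g. "wfs1") -> (typeName -> backing resource id)
    private final Map<String, Map<String, String>> published = new HashMap<>();

    // Publishes a resource under typeName; fails only on a clash inside the same entry point.
    boolean publish(String entryPoint, String typeName, String resourceId) {
        Map<String, String> names = published.computeIfAbsent(entryPoint, k -> new HashMap<>());
        return names.putIfAbsent(typeName, resourceId) == null;
    }

    // Resolves a request such as /wfs1?typeName=ABC to its backing resource, or null.
    String resolve(String entryPoint, String typeName) {
        Map<String, String> names = published.get(entryPoint);
        return names == null ? null : names.get(typeName);
    }

    public static void main(String[] args) {
        EntryPoints entryPoints = new EntryPoints();
        System.out.println(entryPoints.publish("wfs1", "ABC", "store1/abc")); // true
        System.out.println(entryPoints.publish("wfs2", "ABC", "store2/abc")); // true, other entry point
        System.out.println(entryPoints.publish("wfs1", "ABC", "store3/abc")); // false, clash in wfs1
        System.out.println(entryPoints.resolve("wfs2", "ABC"));               // store2/abc
    }
}
```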

Cheers,
Gabriel

This situation does occur in practice from my experience with defined
schemas being used.

We'd need a dispatcher to route queries to the right data store, but
it would be a shame for the resource/publishing split to bake in the
assumption that one FeatureType = 1 datastore. Namespaces only exist
in the external contract. It's only the fact that they may be "known"
to the client that gives them meaning - otherwise you might as well
generate random unique names for everything.

Rob

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Andrea Aime wrote:

But isn't the namespace a publishing property, and
so something that should be set at the layer
level?

Some feature types get their namespaces glued on at publishing time. Some have their namespace as an intrinsic property; for example, app-schema feature types.

You think you have problems: a DataAccess can provide feature types with qualified names in different namespaces. See DataAccess: "List<Name> getNames()". Given that a workspace and thus data store are named by the namespace prefix, if you want multiple namespaces from a single DataAccess, you are stuffed. For example, I can create a single DataAccess that returns a gsml:MappedFeature and a sa:SamplingPoint. In which workspace can this live? The workaround is to Not Do That Then, and instead proliferate DataAccesses, one per feature type. The catalog implementation prevents a legitimate use of the DataAccess API. See GEOS-3042.
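
The mismatch can be sketched in a few lines, assuming a store fits a workspace only when all the names it exposes share one namespace. `javax.xml.namespace.QName` stands in here for the GeoTools `Name` returned by `getNames()`, `NamespaceCheck` is a hypothetical helper, and the URIs are placeholders because the thread does not give the real gsml and sa namespace URIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.namespace.QName;

// Sketch only: QName stands in for the GeoTools Name type.
class NamespaceCheck {
    // Distinct namespace URIs among the names a single data access exposes.
    static Set<String> namespacesOf(List<QName> names) {
        Set<String> uris = new TreeSet<>();
        for (QName name : names) {
            uris.add(name.getNamespaceURI());
        }
        return uris;
    }

    // Under "one workspace = one namespace", a store fits only if it exposes at most one namespace.
    static boolean fitsSingleWorkspace(List<QName> names) {
        return namespacesOf(names).size() <= 1;
    }

    public static void main(String[] args) {
        // Placeholder URIs, assumed for the example.
        List<QName> names = Arrays.asList(
                new QName("urn:example:gsml", "MappedFeature"),
                new QName("urn:example:sa", "SamplingPoint"));
        System.out.println(namespacesOf(names));        // [urn:example:gsml, urn:example:sa]
        System.out.println(fitsSingleWorkspace(names)); // false
    }
}
```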

--
Ben Caradoc-Davies <Ben.Caradoc-Davies@anonymised.com>
Software Engineer, CSIRO Exploration and Mining
Australian Resources Research Centre
26 Dick Perry Ave, Kensington WA 6151, Australia

Rob Atkinson wrote:

what if we have the same feature type being delivered by multiple
stores

This, in my understanding, is one of the goals of the resource/publishing split. I hope one day that we can bind (for example) each workspace to a different WFS service URL in the same GeoServer instance, rather than just have workspace = namespace as we have now. To meet the WFS spec, qnames would have to be unique within each WFS, but not necessarily within the GeoServer instance. Workspaces could be used to deliver the same feature type name, encoded according to different profiles, or backed by different data.

Perhaps profile should not be a container (like workspace), but a property set on each feature type?

--
Ben Caradoc-Davies <Ben.Caradoc-Davies@anonymised.com>
Software Engineer, CSIRO Exploration and Mining
Australian Resources Research Centre
26 Dick Perry Ave, Kensington WA 6151, Australia

Ben Caradoc-Davies wrote:

Andrea Aime wrote:

But isn't the namespace a publishing property, and
so something that should be set at the layer
level?

Some feature types get their namespaces glued on at publishing time. Some have their namespace as an intrinsic property; for example, app-schema feature types.

You think you have problems: a DataAccess can provide feature types with qualified names in different namespaces. See DataAccess: "List<Name> getNames()". Given that a workspace and thus data store are named by the namespace prefix, if you want multiple namespaces from a single DataAccess, you are stuffed. For example, I can create a single DataAccess that returns a gsml:MappedFeature and a sa:SamplingPoint. In which workspace can this live? The workaround is to Not Do That Then, and instead proliferate DataAccesses, one per feature type. The catalog implementation prevents a legitimate use of the DataAccess API. See GEOS-3042.

It does now, but it doesn't need to once the resource/publish split becomes real.
To augment the app-schema case, there's also the cascaded WFS case. If you're cascading feature type prefix:Road from a WFS and you need to rename it to anotherprefix:Road, then you're not just cascading.
But once the resource/publish split is in place it should be heaven.
First, namespace (as per the ones registered in geoserver) stops being a first-class citizen and becomes a means to ensure name uniqueness _only_ for those DataAccesses whose feature types do not have an inherent namespace. That is, they ensure name uniqueness at the datastore level; the namespace is passed as a DataAccess factory parameter to ensure name uniqueness at the WFS entry point level.
Second, there's gonna be an even better separation of concerns. Namespace is meaningless outside WFS. WCS and WMS do not really need namespaces to refer to a coverage or a WMS layer. They will just refer to a resource, which refers to a data store, which refers to a workspace. So no name clash. You'll just need to be sure to _publish_ them with unique "layer" names. The benefits are limited only by imagination. Say you want to publish the same resource twice in WMS, with different default styles, different CRSs, or using different "definition filters" (a common case being a single roads type where you want to publish roads and highways as different layers).
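
The roads/highways case might look like this in a toy model, assuming a layer is just a named pointer to a resource plus its publishing options. `WmsLayers`, `Layer`, the style names and the filter text are hypothetical and are not GeoServer's LayerInfo.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical publishing model: several layers may point at the same resource;
// only the published layer name has to be unique.
class WmsLayers {
    static final class Layer {
        final String resource;         // name of the resource being published
        final String defaultStyle;     // default style for this layer
        final String definitionFilter; // optional filter text, null means "all features"

        Layer(String resource, String defaultStyle, String definitionFilter) {
            this.resource = resource;
            this.defaultStyle = defaultStyle;
            this.definitionFilter = definitionFilter;
        }
    }

    private final Map<String, Layer> byName = new LinkedHashMap<>();

    // Returns false when the layer name is already taken.
    boolean publish(String layerName, Layer layer) {
        return byName.putIfAbsent(layerName, layer) == null;
    }

    Layer get(String layerName) {
        return byName.get(layerName);
    }

    public static void main(String[] args) {
        WmsLayers wms = new WmsLayers();
        wms.publish("roads", new Layer("roads", "road_line", null));
        wms.publish("highways", new Layer("roads", "highway_line", "type = 'highway'"));
        System.out.println(wms.get("highways").resource); // roads
    }
}
```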

But I have to admit after all this ranting, the original issue remains confusing. The reason is that there's not a perfect separation of concerns between ResourceInfo and LayerInfo yet. In my mind, ResourceInfo should be as close to the physical resource metadata as possible. In the case of feature types, that means only the information that can be obtained from the FeatureType, and hence plainly informative in the UI: qualified name, native CRS, native bounds, etc.
LayerInfo should be all about publishing a resource: published CRS, alias, lat/lon bbox, etc.
- Where does namespace fit? Both places. In ResourceInfo it should be immutable, even null. Then we should get rid of the namespace datastore parameter. It is a band-aid after all, no real API for it, just a convention we injected into geotools for the good of geoserver pre-2.0 :). But the empty namespace is a perfectly valid one in GeoTools. And GeoServer could use that knowledge to infer whether a LayerInfo referring to a ResourceInfo _needs_ to use an overriding namespace (if the resource does not natively provide one), or _can_ use an overriding namespace (just a convenience publishing option).
- Are layer names gonna be qualified? For WFS, yes. For WCS/WMS, no need. The lookup mechanism from a non-qualified WMS layer name to its underlying resource is guaranteed since the layer has a direct reference to the resource it's publishing. When you publish the layer just make sure its name is unique "profile/map" wise (aka WMS entry point wise).
- What happens if I publish a layer with a different name than its resource? You're aliasing the resource.
- What happens if I change the published namespace for a resource? You're aliasing the resource.
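
The rules in the list above can be sketched as a single function for the WFS case, assuming a resource carries an optional native namespace and a layer carries an optional overriding namespace and alias. `PublishedName.resolve` and the `urn:example:*` URIs are hypothetical names for illustration.

```java
// Hypothetical sketch of how a layer could derive its published WFS name from its resource.
class PublishedName {
    // nativeNs may be null or empty when the resource has no inherent namespace;
    // overrideNs and alias are optional publishing settings on the layer.
    static String resolve(String nativeNs, String nativeName, String overrideNs, String alias) {
        boolean hasNative = nativeNs != null && !nativeNs.isEmpty();
        boolean hasOverride = overrideNs != null && !overrideNs.isEmpty();
        if (!hasNative && !hasOverride) {
            // No native namespace: the layer *needs* an overriding one.
            throw new IllegalStateException("No namespace available for " + nativeName);
        }
        // With a native namespace, the layer *can* still override it (aliasing the resource).
        String ns = hasOverride ? overrideNs : nativeNs;
        String local = alias != null ? alias : nativeName;
        return "{" + ns + "}" + local;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "roads", "urn:example:ws", null));
        // {urn:example:ws}roads
        System.out.println(resolve("urn:example:gsml", "MappedFeature", null, null));
        // {urn:example:gsml}MappedFeature
        System.out.println(resolve("urn:example:gsml", "MappedFeature", "urn:example:other", "Mapped"));
        // {urn:example:other}Mapped
    }
}
```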

This is how I see the resource/publish split is gonna give the benefits with the least hassle. I may be missing important topics though. Comments welcome.

Gabriel

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Well, for a start it's great to see the issue being debated before
finalisation of code :)

I'm not over the intricacies of the beast to be honest, just providing
a "sanity check".

One thing I'd point out - namespaces do not exist to disambiguate the
feature type on the server - they exist to disambiguate it once it
leaves the safety of the server - in the outside world. If the
feature type names do not mean anything - just call them all foo:1
foo:2 foo:3 - there is simply no need to have disambiguation around
arbitrary names.

The current linkage between WMS layer name and feature type name is a
convenient default - but is purely arbitrary and should be able to be
overridden. E.g. what about a feature type road, and WMS layers road250K
road100K road25K, in a layer group road_anyscale?

IMHO what we're talking about is not the information model - it's UI
workflow optimisation for simple cases. It's the bleed-through of this
concern to the resource design that caused such confusion.

So - what is a DataAccess, semantically? Is it a feature type, or is it
an adaptor to allow common optimisation of access to a set of feature
types, or is it a connection - the right to access a set of resources?
I've always had a problem with the declaration of namespaces and
database connection parameters in the same file ;)

Perhaps someone can throw up a formal model of this configuration
metadata - I find text descriptions too prone to misinterpretation.

Rob

On Wed, Jun 17, 2009 at 3:15 PM, Gabriel Roldan<groldan@anonymised.com> wrote:
