[Geoserver-devel] proposal for new data directory structure

Hi all,

As part of recent hacking moving towards 2.0, I have been working on a new data directory for 2.x that is better suited to our configuration. Here is the proposal. Feedback welcome.

http://geoserver.org/display/GEOS/GSIP+34+-+New+data+directory+structure+for+2.x

-Justin

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Generally speaking looks good. A few details:
- nothing states how the granular saving occurs (though we know, for
   example, that XStream is used to persist the XML). Citing the
   event subsystem would shed some light on the machinery (also,
   when are the events triggered?)
- the proposal should say how the old configuration gets converted
   to the new one (automatically, interactively, in place or in a
   different directory?)
- I guess that to get that output some of the objects have been
   "massaged" by setting up XStream aliases in order to get
   better looking output?
   The wfs.xml "gml" section looks confusing, I cannot really make
   out what that is...
- in featureType.xml, do we actually need the list of attributes?
   We should be able to build the list by looking at the native
   schema and the schema.xml files. (any mapping information
   will be stored in a layer configuration, and is anyway out
   of scope given the current state of the art, right?)
- what about the freemarker templates?

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Thanks for the feedback Andrea, a few comments inline.

Andrea Aime wrote:

Generally speaking looks good. A few details:
- nothing states how the granular saving occurs (though we know, for
  example, that XStream is used to persist the XML). Citing the
  event subsystem would shed some light on the machinery (also,
  when are the events triggered?)
- the proposal should say how the old configuration gets converted
  to the new one (automatically, interactively, in place or in a
  different directory?)

Updated. I added two sections near the end of the proposal which should provide a bit more info.

- I guess that to get that output some of the objects have been
  "massaged" by setting up XStream aliases in order to get
  better looking output?
  The wfs.xml "gml" section looks confusing, I cannot really make
  out what that is...

Agreed, it is confusing. The WFS info class holds onto a map of GMLInfo, keyed by version. This is indeed something I have not "massaged" yet. That said, I am not too worried about making it look pretty, because people should not be editing these files directly now that we have a REST configuration API, although I realize that we don't have one in place for service configuration yet.

So, will opening a ticket to make it look nice do for now?
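
To make the "map keyed by version" point a bit more concrete, here is a rough Python sketch of why a generically serialized map reads awkwardly and what an aliased form might look like instead. It is only an illustration: it does not use XStream, and the element names (entry, gmlInfo), the version keys (V_10, V_11) and the srsNameStyle values are assumptions made up for the example, not the actual contents of wfs.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the WFS configuration: GML settings keyed by version.
gml_by_version = {
    "V_10": {"srsNameStyle": "XML"},
    "V_11": {"srsNameStyle": "URN"},
}


def generic_map_xml(settings):
    """Serialize the map the generic way: one <entry> per key/value pair."""
    root = ET.Element("gml")
    for version, info in settings.items():
        entry = ET.SubElement(root, "entry")
        ET.SubElement(entry, "version").text = version
        gml = ET.SubElement(entry, "gml")
        for name, value in info.items():
            ET.SubElement(gml, name).text = value
    return ET.tostring(root, encoding="unicode")


def aliased_map_xml(settings):
    """Serialize with the key folded into an attribute of each value element."""
    root = ET.Element("gml")
    for version, info in settings.items():
        gml = ET.SubElement(root, "gmlInfo", version=version)
        for name, value in info.items():
            ET.SubElement(gml, name).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(generic_map_xml(gml_by_version))
    print(aliased_map_xml(gml_by_version))
```

The second form carries the same information with one less level of nesting, which is roughly what "massaging" the output would buy.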

- in featureType.xml, do we actually need the list of attributes?
  We should be able to build the list by looking at the native
  schema and the schema.xml files. (any mapping information
  will be stored in a layer configuration, and is anyway out
  of scope given the current state of the art, right?)

Technically we could omit them, yes, since we check for a schema.xsd override on startup. That said, persisting them might be a good idea with regard to the future. With the attributes persisted in the XML file, a user could edit that file directly (yes, I know I just discouraged this :)) rather than hand-writing XML Schema, which is verbose. Also, if we ever wanted to provide a simple user interface for attribute editing (one that has nothing to do with XML Schema), that would require us to persist them. Regardless, if you feel strongly, we can make sure they are not persisted.
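
The two options can be sketched in a few lines of Python. This is a hedged illustration only: resolve_attributes and its arguments are hypothetical names, and the rule "a persisted list wins, otherwise fall back to the native schema" is an assumption about how the two sources could be combined, not a description of the current code.

```python
def resolve_attributes(persisted, native_schema):
    """Return the attribute names to use for a feature type.

    persisted: names read from featureType.xml, or None/empty if not persisted.
    native_schema: names reported by the underlying data store.
    """
    if persisted:
        # An explicit list in the configuration takes precedence.
        return list(persisted)
    # Nothing persisted: derive the list from the native schema.
    return list(native_schema)


if __name__ == "__main__":
    native = ["the_geom", "STATE_NAME", "PERSONS"]
    print(resolve_attributes(None, native))
    print(resolve_attributes(["the_geom", "STATE_NAME"], native))
```

If the attributes are never persisted, the first branch simply goes away, which is the simplification Andrea is suggesting.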

- what about the freemarker templates?

Added to the proposal. Works the same as 1.x.


--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Hi,

Thanks for the proposal. I think I'm getting lost, though, so excuse me if I've got it all wrong:

I don't quite see how data and publishing are separated when layer.xml lives inside workspaces/<ws>/<ds>/<ft>/

Something like the following would make more sense to me, at least with the idea I have of how things are supposed to be:

...
workspaces/...
namespaces/...
styles/...
layergroups/...
templates/...
maps/
  map1/
    layer1.xml
    layer2.xml
    layer3.xml
  map2/
    layer1.xml
    ...

If the proposed structure is the way it is because that better reflects "how things actually are now", doesn't that mean another data directory change proposal will be needed when we finally support more than one "map" or "virtual instance" or whatever it ends up being called?

To be honest, I would be more comfortable if we found a way to start supporting the concept of "map" right now, since implementation-wise we can just as easily have a single(ton) map.
But that seems to be a topic for a separate discussion, because I already had some with Andrea in person, I think. Still, my point is that I don't quite see the reason to have a new catalog design that accounts for the separation of data and publishing if we don't use it, even if for the time being there's only one such "map".

My 2c.-

Gabriel


------------------------------------------------------------------------------
Apps built with the Adobe(R) Flex(R) framework and Flex Builder(TM) are
powering Web 2.0 with engaging, cross-platform capabilities. Quickly and
easily build your RIAs with Flex Builder, the Eclipse(TM)based development
software that enables intelligent coding and step-through debugging.
Download the free 60 day trial. http://p.sf.net/sfu/www-adobe-com
_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

Maybe there is still a way to accommodate both.
During the discussions about maps, the point was made that not everyone
needs them, and that we should support the extra complexity but
not force it onto users who have no need for it.
The idea was to have a default map that lists all of the layers in
the configuration, using a default layer configuration. The same
default can be used as a template if you need to provide the same
layer, or group of layers, in multiple "map" instances (treating
it like a pointer).
So we could say the proposed structure already represents the default
map, and that the configured maps are only the explicit
ones, beyond the default one.
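
A minimal Python sketch of that idea, assuming layers are identified by plain names and that an explicit map is just a list of references to existing layers (build_maps and the "default" key are hypothetical names chosen for the example):

```python
def build_maps(all_layers, explicit_maps=None):
    """Return every map as a dict of map name -> list of layer names.

    The "default" map always lists all configured layers; explicit maps
    only point at layers that already exist in the configuration.
    """
    maps = {"default": list(all_layers)}
    for name, layers in (explicit_maps or {}).items():
        unknown = [layer for layer in layers if layer not in all_layers]
        if unknown:
            raise ValueError(f"map {name!r} references unknown layers: {unknown}")
        maps[name] = list(layers)
    return maps


if __name__ == "__main__":
    layers = ["roads", "rivers", "states"]
    # A user who does not care about maps only ever sees the default one.
    print(build_maps(layers))
    # A user who does can add explicit maps on top of it.
    print(build_maps(layers, {"hydro": ["rivers"]}))
```

Users with simple setups never have to declare a map at all, which keeps the extra complexity optional.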

Anyway, I'm not against the separation. As we said, the data directory
layout is not intended for direct manipulation; I'm more
concerned about a smooth upgrade path and a simple setup for
simple usages than about a particular data directory layout.

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

I thought we were going to cover up the distinction between coverage store and data store? If so, I think we should do the same in the directory structure and use subclassing as needed.

How are modules / plugins expected to add to this configuration? New files in the directories, or do we provide some hooks to put it straight into the XML files? If we do the latter, then core modules should use these hooks too, to ensure that they're actually usable and maintained.

-Arne

--
Arne Kepp
OpenGeo - http://opengeo.org
Expert service straight from the developers

groldan@anonymised.com wrote:

Hi,

Thanks for the proposal. I think I'm getting lost, though, so excuse me if I've got it all wrong:

I don't quite see how data and publishing are separated when layer.xml lives inside workspaces/<ws>/<ds>/<ft>/

Something like the following would make more sense to me, at least with the idea I have of how things are supposed to be:

...
workspaces/...
namespaces/...
styles/...
layergroups/...
templates/...
maps/
  map1/
    layer1.xml
    layer2.xml
    layer3.xml
  map2/
    layer1.xml
    ...

If the proposed structure is the way it is because that better reflects "how things actually are now", doesn't that mean another data directory change proposal will be needed when we finally support more than one "map" or "virtual instance" or whatever it ends up being called?

You bring up a good point, and indeed the map structure you list will be needed once we have the publishing split. The plan as I saw it was to introduce that change when the time comes, and for now keep layer.xml next to the feature type / coverage configuration.

Since the change to the data directory structure will be additive, I do not see it as a big deal. We have added directories to the data dir structure as needed before: templates, security, palettes, etc.

That said, we could adopt this structure now. I would not be against that if people think that introducing it now is the better way to go.

To be honest, I would be more comfortable if we found a way to start supporting the concept of "map" right now, since implementation-wise we can just as easily have a single(ton) map.
But that seems to be a topic for a separate discussion, because I already had some with Andrea in person, I think. Still, my point is that I don't quite see the reason to have a new catalog design that accounts for the separation of data and publishing if we don't use it, even if for the time being there's only one such "map".

We could do this, but imo it increases scope in the short term, and it does not really buy us much since the changes required to the data directory can be done pretty cleanly once we have the map data structure in place.


--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Anyway, I'm not against the separation. As we said, the data directory
layout is not intended for direct manipulation; I'm more
concerned about a smooth upgrade path and a simple setup for
simple usages than about a particular data directory layout.

I am not against modeling the separation in the data directory as of now. But I am against adding the notion of a map into the backend picture right now. It increases scope and we have bigger fish to fry at the moment imho.

As I see it the upgrade path will be pretty smooth:

1. adopt data dir structure proposed here
2. implement maps in the backend
3. add the maps directory to the data directory structure
4. on start provide an "import facility" which creates a default map as described above
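
Step 4 could look something like the following Python sketch. Everything specific in it is an assumption for illustration: the function name import_default_map, the maps/default/ directory, and the ".ref" pointer files are made up; only the workspaces/<ws>/<ds>/<ft>/layer.xml layout comes from the discussion above.

```python
from pathlib import Path


def import_default_map(data_dir):
    """Create maps/default/ with one pointer file per existing layer.xml.

    Returns the pointer files written. Does nothing (returns an empty list)
    if a maps/ directory already exists, so it is safe to run on every start.
    """
    data_dir = Path(data_dir)
    maps_dir = data_dir / "maps"
    if maps_dir.exists():
        return []

    default_map = maps_dir / "default"
    default_map.mkdir(parents=True)

    workspaces = data_dir / "workspaces"
    written = []
    # Expected layout: workspaces/<ws>/<ds>/<ft>/layer.xml
    for layer_file in sorted(workspaces.glob("*/*/*/layer.xml")):
        ws, _ds, ft = layer_file.relative_to(workspaces).parts[:3]
        pointer = default_map / f"{ws}.{ft}.ref"
        # The pointer only records where the real layer configuration lives.
        pointer.write_text(layer_file.relative_to(data_dir).as_posix() + "\n")
        written.append(pointer)
    return written
```

Because the existing layer.xml files are left in place and only referenced, the change stays additive, which is the point of doing the upgrade in this order.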


--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Arne Kepp wrote:

I thought we were going to cover up the distinction between coverage store and data store? If so, I think we should do the same in the directory structure and use subclassing as needed.

We could do that, yeah: instead of having coveragestore.xml and datastore.xml we could just make it store.xml. Not sure it gains us much. While the code reading the configuration might be made a bit simpler, it makes it harder to figure out what is what when you are looking at it. Six of one, half a dozen of the other, I guess :). Anyone else have an opinion?
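
The trade-off can be seen in a small Python sketch that handles both conventions. It is only an illustration under stated assumptions: store_kind is a hypothetical helper, and the idea that a unified store.xml would reveal its kind through its root element name is an assumption, not something the proposal specifies.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def store_kind(store_dir):
    """Return "datastore" or "coveragestore" for a store directory."""
    store_dir = Path(store_dir)

    # Separate file names: the kind is visible from a directory listing alone.
    for kind in ("datastore", "coveragestore"):
        if (store_dir / f"{kind}.xml").is_file():
            return kind

    # Single store.xml: the file has to be opened and parsed to find out.
    unified = store_dir / "store.xml"
    if unified.is_file():
        return ET.parse(str(unified)).getroot().tag.lower()

    raise FileNotFoundError(f"no store configuration found in {store_dir}")
```

With separate names the first loop is all a reader (human or code) needs; with store.xml the lookup is uniform but the kind is hidden inside the file.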

How are modules / plugins expected to add to this configuration? New files in the directories, or do we provide some hooks to put it straight into the XML files? If we do the latter, then core modules should use these hooks too, to ensure that they're actually usable and maintained.

Good question. For things like new services it should be pretty straightforward, and there is already an extension point in place for doing so. Basically, each service gets its own .xml file. For things like output formats, the metadata map of a service seems like a logical way to go. There is an extension point (GeoServerInitializer) which modules can use to initialize themselves after the configuration has been read. For new types of resources and entities in the catalog, it gets a bit tricky. While the catalog interfaces are written in a way that makes them extensible, most of the code accessing them, and the persistence code, does not assume that. So to be able to have pluggable resources some work would need to be done. As for how they are stored, I guess the same way data stores and coverage stores, feature types and coverages are stored.
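
As a rough illustration of the "each service gets its own .xml file" idea, here is a Python sketch of a loader registry. It does not reflect the actual GeoServer extension point; SERVICE_LOADERS, register_service and load_services are hypothetical names, and the loaders just wrap the raw text instead of parsing a real configuration.

```python
from pathlib import Path

# Hypothetical registry: service name -> function that loads that service's file.
SERVICE_LOADERS = {}


def register_service(name):
    """Decorator a core module or plugin would use to contribute a loader."""
    def decorator(loader):
        SERVICE_LOADERS[name] = loader
        return loader
    return decorator


@register_service("wfs")
def load_wfs(text):
    # A real loader would parse the XML; this one only records it.
    return {"service": "wfs", "raw": text}


def load_services(data_dir):
    """Load <name>.xml for every registered service, skipping missing files."""
    loaded = {}
    for name, loader in SERVICE_LOADERS.items():
        path = Path(data_dir) / f"{name}.xml"
        if path.is_file():
            loaded[name] = loader(path.read_text())
    return loaded
```

Core services and plugin services go through the same registration path here, which is the property Arne is asking for: a hook that core modules use too is a hook that stays usable.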


--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.