[Geoserver-devel] External data dir mess (and solutions too)

Hi,
today I was trying out external data dirs and I can tell you
it's a mess.

http://docs.codehaus.org/display/GEOSDOC/GeoServer+Data+Directory
suggests two ways to work with an external data dir.

The first one, with the /data subdirectory, simply does not work
because FeatureTypes won't be found. GeoserverDataDir looks only in
the root of the data dir, and not in the /data subfolder.
I've created http://jira.codehaus.org/browse/GEOS-803 and fixed it
by adding the <data_dir>/data path in the search lookup if an external
data dir is used.

The second one requires people to keep shapefiles in data/. I fixed this
as well: if the file is not found in <data_dir>/data/<path>, it'll be
looked up in <data_dir>/<path>.
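
To make the lookup order concrete, here is a rough sketch of the fallback described above. It is not the actual GEOS-803 patch: the class DataDirLookup and its methods are made-up names for illustration, and only the search order (<data_dir>/data/<path> first, then <data_dir>/<path>) comes from the description.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration; this is not GeoServer's GeoserverDataDir.
final class DataDirLookup {

    // Candidate locations in search order:
    // <data_dir>/data/<path> first, then <data_dir>/<path>.
    static List<Path> candidates(Path dataDir, String relativePath) {
        return Arrays.asList(
                dataDir.resolve("data").resolve(relativePath),
                dataDir.resolve(relativePath));
    }

    // Returns the first candidate that exists on disk, if any.
    static Optional<Path> find(Path dataDir, String relativePath) {
        return candidates(dataDir, relativePath).stream()
                .filter(Files::exists)
                .findFirst();
    }

    public static void main(String[] args) {
        // The data dir comes from the first argument, defaulting to the current dir.
        Path dataDir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(find(dataDir, "roads.shp"));
    }
}
```

Keeping the candidate list separate from the existence check makes the search order easy to test without touching the file system.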

What do you think?
Cheers
Andrea

We definitely need users to be allowed to store data in the data_directory, that is the whole point. Especially if they have *huge* data sets and don't want to move them into the geoserver data directory, but want to put the data directory where the large data sets are.
I like your solutions. I definitely think that the users should be able to place their data anywhere in the data directory, not just under /data.

Was it the datastore finder that was causing the issues?

Brent Owens
(The Open Planning Project)

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

Ok, this is all a bit of a mess for a couple reasons. 1) I never fully got across my vision of how this should work, and thus it was never fully implemented. And 2) trying to be backwards compatible.

So things have gone through many name changes and fixes and weird documentation changes.

The 'data_dir' is a bit of a misnomer, I think, and it's now been renamed 'conf' in most of our stuff. So 'data directory' used to refer to all the data GeoServer needs to start: the catalog.xml and services.xml, the styles, the validation xml files, etc. And there was a featureTypes directory, which would often have data in it. We figured a nicer thing is to just stick all your data in an actual 'data' directory, and to keep your featureTypes in a featureTypes directory. But for backwards compatibility we also let people stick their featureTypes directory in the data/ directory, so that the data would get picked up, i.e. so they could upgrade by just moving their old featureTypes directory over.

So we probably should rename 'GeoServer Data Directory' to GeoServer Config Directory, and start referring to it as that.

The other confusing thing with this stuff is the WEB-INF stuff. This was originally done to hide passwords (stored in catalog.xml) from being exposed. I think we should just drop the WEB-INF stuff, and make it so you can just store a full data directory under WEB-INF, and make that the recommended way for users.

But I think we're going to have to wait for 2.0 for my ideal vision. And then we can more clearly define what exactly a config directory looks like, and hopefully even have a way of defining everything in a single file if people want. And indeed persist to a bunch of different formats.

But since we're not there, your fixes look good. They give people more flexibility to define what they want. There's a danger in going too overboard with that, but I think we can just give people a recommended path, not tell them there are other ways of doing things, but then be forgiving if they misconfigure things.

Chris

--
Chris Holmes
The Open Planning Project
http://topp.openplans.org

While we're at it, I found that:

relative path names only worked against $DD/data/*
and featureTypes are sometimes only picked up when in $DD/featureType, but other times seem to be found OK without duplicating - might be a config UI vs initialisation difference.

Rob

Rob Atkinson wrote:

While we're at it, I found that:

relative path names only worked against $DD/data/*
and featureTypes are sometimes only picked up when in $DD/featureType, but other times seem to be found OK without duplicating - might be a config UI vs initialisation difference.

Well, this was by design. And the times it works in other ways are due to trying for backwards compatibility.

Maybe it was a bad design, but the idea to me was

$DD/data/ is the directory where your data goes.

$DD/featureTypes/ is the directory where your featureType meta information goes.

Basically our old way of having data in featureTypes directory was kinda silly. It had no real advantages - it was done in the thought that there would be lots of information in each featureType directory, but that hasn't really come to pass. I think we should maybe just drop the featureType directory and put all the info.xml stuff in a single file.

Then data/ is where you can refer to things relatively. And ideally when you drop it in there we have a directory datastore so you don't even have to create a new datastore, just a new featureType.

Chris

Fair enough

I strongly suggest this stuff is properly refactored _after_ we have a coherent idea of what a featureType is, and not before.

I'm all for separation of concerns, but we need to get them right.

The reason why relative stuff was part of FeatureType, is that many (possibly most) "real" feature types will have relationships to other feature types - so we needed to exploit relative paths to XSD fragments. One of the most common relationships is inheritance (derivation by extension in the case of XML schema) - but also containment, topological relationships etc are usually necessary.
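
As a small illustration of the relative-path point, the sketch below resolves a relative reference to another schema fragment against the location of the schema that contains it. The directory and file names are invented for the example and do not describe the real data directory layout. It uses only java.net.URI from the standard library.

```java
import java.net.URI;

// Hypothetical example; the paths below are invented, not GeoServer's layout.
final class SchemaPaths {

    // Resolves a relative reference (for instance the location of a base
    // type's schema fragment) against the document that contains it.
    static URI resolve(URI containingDocument, String relativeRef) {
        return containingDocument.resolve(relativeRef);
    }

    public static void main(String[] args) {
        URI road = URI.create("file:///dd/featureTypes/Road/road.xsd");
        // A derived type pointing at the fragment that defines its base type.
        System.out.println(resolve(road, "../Base/base.xsd").getPath());
    }
}
```

Because the reference is relative, the whole set of related fragments can be moved together without editing any of them.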

I'll pull together a vision of how such a Geoserver 2 might work, from an ease-of-use standpoint.

Rob

Rob Atkinson wrote:

While we're at it, I found that:

relative path names only worked against $DD/data/*
and featureTypes are sometimes only picked up when in $DD/featureType, but other times seem to be found OK without duplicating - might be a config UI vs initialisation difference.

Well, if my patch works as I expect, you can have feature types
in both $DD/data/featureType and in $DD/featureType now.
It's just a way to make both ways described in the external data dir
guide work (http://docs.codehaus.org/display/GEOSDOC/GeoServer+Data+Directory).
A better long term solution is needed (see my other mail).
Cheers
Andrea

Rob Atkinson wrote:

Fair enough

I strongly suggest this stuff is properly refactored _after_ we have a coherent idea of what a featureType is, and not before.

Can you elaborate on this? Is the idea of feature type included
in GML not enough?

I'm all for separation of concerns, but we need to get them right.

The reason why relative stuff was part of FeatureType, is that many (possibly most) "real" feature types will have relationships to other feature types - so we needed to exploit relative paths to XSD fragments. One of the most common relationships is inheritance (derivation by extension in the case of XML schema) - but also containment, topological relationships etc are usually necessary.

Well, here I agree with Chris, I'd like to see everything stored in
a single XML document where you do have stores, feature types, styles
and whatnot, in a coherent and verifiable form, using id references
from one element to the other.
With the current situation you have to jump from one place to the other
to figure out what the heck is going on, and it's way too easy to
break something because you don't change all the relevant references.
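
To show what "verifiable" could mean here, below is a minimal sketch of a reference check over an in-memory configuration. The class ConfigReferences and the FeatureTypeEntry fields are hypothetical names, not existing GeoServer classes. The idea is only that every feature type points at a store and a style by id, and that dangling ids are reported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model; none of these names are GeoServer classes.
final class ConfigReferences {

    // A feature type entry that points at a store and a style by id.
    static final class FeatureTypeEntry {
        final String id;
        final String storeId;
        final String styleId;

        FeatureTypeEntry(String id, String storeId, String styleId) {
            this.id = id;
            this.storeId = storeId;
            this.styleId = styleId;
        }
    }

    // Returns one message per reference that points at an unknown id.
    // An empty list means every reference resolves.
    static List<String> danglingReferences(Set<String> storeIds,
                                           Set<String> styleIds,
                                           List<FeatureTypeEntry> featureTypes) {
        List<String> problems = new ArrayList<>();
        for (FeatureTypeEntry ft : featureTypes) {
            if (!storeIds.contains(ft.storeId)) {
                problems.add(ft.id + ": unknown store '" + ft.storeId + "'");
            }
            if (!styleIds.contains(ft.styleId)) {
                problems.add(ft.id + ": unknown style '" + ft.styleId + "'");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Set<String> stores = new HashSet<>(Arrays.asList("shapes"));
        Set<String> styles = new HashSet<>(Arrays.asList("line"));
        List<FeatureTypeEntry> types = Arrays.asList(
                new FeatureTypeEntry("roads", "shapes", "line"),
                new FeatureTypeEntry("rivers", "postgis", "line"));
        // Prints: rivers: unknown store 'postgis'
        danglingReferences(stores, styles, types).forEach(System.out::println);
    }
}
```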

Cheers
Andrea

Andrea Aime wrote:

Rob Atkinson wrote:

Fair enough

I strongly suggest this stuff is properly refactored _after_ we have a coherent idea of what a featureType is, and not before.

Can you elaborate on this? Is the idea of feature type included
in GML not enough?

A GML schema is an XML Schema based representation of a "conceptual schema" (right out of the ISO specs).

This is fine, but it implies that the XML Schema representation is potentially "lossy": some stuff doesn't translate to schemas easily.

The real issue is that FeatureTypes are defined by the contract between the service and the user, not the contract between the service and the persistence layer designer.

In other words, the requirements for interoperability drive the application schema ( FeatureTypes ), and the service _implements_ a FeatureType. Thus, the configuration should follow this pattern, even if a "temporary Feature Type" is auto-created from the persistence layer, logically the config should keep it separate as if it had been created by an external specification process.

Styles are more often akin to the contract over Feature Types than the persistence layer, and should be linked. Styles may potentially be re-usable across many related feature types, or applied to a subset of features, so simple encapsulation doesn't seem a sensible way of managing them. The current approach is reasonable IMHO.

Stores and FeatureTypes are mixed up at the moment. Conceptually a store manages a connection _and_ a persistent object, from which multiple FeatureTypes may be mapped to the underlying persistence layer. I think there is scope to refactor this better.
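
One possible way to picture that separation is sketched below, taking the persistent object to be something like a table or a file. All the class names are hypothetical and are not GeoServer or GeoTools API. The sketch only shows a connection, the persistent objects behind it, and several feature types mapped onto one of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model for illustration; not GeoServer's or GeoTools' API.
final class StoreModel {

    // Connection settings only, e.g. a directory or a database.
    static final class Connection {
        final String id;
        Connection(String id) { this.id = id; }
    }

    // Something behind a connection that actually holds the data.
    static final class PersistentObject {
        final Connection connection;
        final String name;
        PersistentObject(Connection connection, String name) {
            this.connection = connection;
            this.name = name;
        }
    }

    // A published feature type mapped onto one persistent object.
    static final class FeatureTypeMapping {
        final String featureTypeName;
        final PersistentObject source;
        FeatureTypeMapping(String featureTypeName, PersistentObject source) {
            this.featureTypeName = featureTypeName;
            this.source = source;
        }
    }

    // Groups feature type names by the persistent object they are mapped from.
    static Map<String, List<String>> bySource(List<FeatureTypeMapping> mappings) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (FeatureTypeMapping m : mappings) {
            String key = m.source.connection.id + "/" + m.source.name;
            result.computeIfAbsent(key, k -> new ArrayList<>()).add(m.featureTypeName);
        }
        return result;
    }

    public static void main(String[] args) {
        PersistentObject table = new PersistentObject(new Connection("pg"), "roads_table");
        // Prints: {pg/roads_table=[Road, Route]}
        System.out.println(bySource(Arrays.asList(
                new FeatureTypeMapping("Road", table),
                new FeatureTypeMapping("Route", table))));
    }
}
```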

A single XML document is a bad idea IMHO, although one could be autogenerated from linked fragments if required to see the entire configuration. Tools could deal with the reusable fragments as easily, or more easily, than a monolithic document whose content and structure changes with every functionality change.

Rob Atkinson wrote:

A GML schema is an XML Schema based representation of a "conceptual schema" (right out of the ISO specs).

This is fine, but it implies that the XML Schema representation is potentially "lossy": some stuff doesn't translate to schemas easily.

The real issue is that FeatureTypes are defined by the contract between the service and the user, not the contract between the service and the persistence layer designer.
In other words, the requirements for interoperability drive the application schema ( FeatureTypes ), and the service _implements_ a FeatureType. Thus, the configuration should follow this pattern, even if a "temporary Feature Type" is auto-created from the persistence layer, logically the config should keep it separate as if it had been created by an external specification process.

Well, that's just a matter of having a layer of mapping, such as the
complex data store, properly configured and sitting on top of
the basic data stores.

Stores and FeatureTypes are mixed up at the moment. Conceptually a store manages a connection _and_ a persistent object, from which multiple FeatureTypes may be mapped to the underlying persistence layer. I think there is scope to refactor this better.

Sorry, the persistent object is?

A single XML document is a bad idea IMHO, although one could be autogenerated from linked fragments if required to see the entire configuration. Tools could deal with the reusable fragments as easily, or more easily, than a monolithic document whose content and structure changes with every functionality change.

Which tools? I'm quite XML ignorant, but our current approach leads
to inconsistency. Plus, XML is used just for persistence, what geoserver
deals with is a big in memory model, and each time it stores it, it's
stored in its completeness, we do not change only a few XML files,
so we are really treating the configuration as monolithic, having
split files does not change anything, just allows for more confusion.

A monolithic document is good as long as we keep the configuration
model completely in memory, and can be generated fast enough with
persistence libraries such as XStream.

If you are dealing with a model big enough that in-memory handling
does not cut it, XML is bad anyway; a DBMS should be used instead to have
consistency assurances and decent performance (even an in-process
one like h2 or hypersonic, but using cached tables, not in-memory ones).

Cheers
Andrea

The real issue is that FeatureTypes are defined by the contract between the service and the user, not the contract between the service and the persistence layer designer.
In other words, the requirements for interoperability drive the application schema ( FeatureTypes ), and the service _implements_ a FeatureType. Thus, the configuration should follow this pattern, even if a "temporary Feature Type" is auto-created from the persistence layer, logically the config should keep it separate as if it had been created by an external specification process.

Well, that's just a matter of having a layer of mapping, such as the
complex data store, properly configured and sitting on top of
the basic data stores.

True, but this also suggests where to put the user friendliness..

Stores and FeatureTypes are mixed up at the moment. Conceptually a store manages a connection _and_ a persistent object, from which multiple FeatureTypes may be mapped to the underlying persistence layer. I think there is scope to refactor this better.

Sorry, the persistent object is?

shapefile, db table etc

A single XML document is a bad idea IMHO, although one could be autogenerated from linked fragments if required to see the entire configuration. Tools could deal with the reusable fragments as easily, or more easily, than a monolithic document whose content and structure changes with every functionality change.

Which tools? I'm quite XML ignorant, but our current approach leads
to inconsistency.

Ahh - I agree that the monolithic nature of the current approach is problematic. But moving to a monolithic config document is maybe the wrong way to fix this.

Plus, XML is used just for persistence, what geoserver
deals with is a big in memory model, and each time it stores it, it's
stored in its completeness,

so, if part of it was imported, this model is a problem, isn't it? Maybe this is the bit to think about: can configurations be loaded and unloaded in a modular fashion? This in turn buys benefits for configuration testing at the very least.

we do not change only a few XML files,
so we are really treating the configuration as monolithic, having
split files does not change anything, just allows for more confusion.

A monolithic document is good as long as we keep the configuration
model completely in memory, and can be generated fast enough with
persistence libraries such as XStream.

OK - but then you can't easily hand off to individual modules the responsibility to marshal and unmarshal configurations.

I'd imagine a few config wizards to fix the problem:

Datastore connection manager (smart enough to handle pooling etc)

FeatureType install (from external catalog, or build your own wizard from a persistent object)

Service Profile - use provided machine readable service profile to set up service metadata and supported SRS defaults etc. (May also install a set of FeatureTypes - eg Transport server has Road, Junction, Route, Rail, etc)

Data object to FeatureType mapping wizard.

If you are dealing with a model so big that in-memory handling
does not cut it, XML is a bad fit anyway; a DBMS should be used instead to get consistency guarantees and decent performance (even an in-process
one like H2 or Hypersonic, but using cached tables, not in-memory ones).

It's not the in-memory size in question - it's whether you can cope with all the interdependencies between modules encapsulated within a single object.

A wizard type approach that joins the artefacts into a loadable subset of configuration elements, and enforces the "foreign key" relationships, would be worthwhile, because it's hard to configure at the moment, primarily because all the sources of errors are swallowed. You are right, it needs improvement!

Cheers
Andrea

Rob Atkinson wrote:

...

Well, that's just a matter of having a layer of mapping, such as the
complex data store, properly configured and sitting on top of
the basic data stores.

True, but this also suggests where to put the user friendliness..

Yes, we need a nice feature type mapping interface... which is pretty
much a waste of time to do with our current UI layer. We need a new
UI layer before starting to talk about fancy new and easy to use
configuration screens :-(

Stores and FeatureTypes are mixed up at the moment. Conceptually a store manages a connection _and_ a persistent object, from which multiple FeatureTypes may be mapped to the underlying persistence layer. I think there is scope to refactor this better.

Sorry, the persistent object is?

shapefile, db table etc

Sorry, I lost you. Why is it a problem having the datastore be both
a persistent object (shapefile) and a connection to it (IO streams and the like)?

Plus, XML is used just for persistence, what geoserver
deals with is a big in memory model, and each time it stores it, it's
stored in its completeness,

so, if part of it was imported, this model is a problem, isn't it?

Maybe this is the bit to think about: can configurations be loaded and unloaded in a modular fashion? This in turn buys benefits for configuration testing at the very least.

we do not change only a few XML files,
so we are really treating the configuration as monolithic, having
split files does not change anything, just allows for more confusion.

Well, XStream dumps a tree of objects. If you want to dump a subset, just build a tree with the subset you want to dump and dump it.
When you load it, just merge it with the existing in-memory model, dealing with possible conflicts. Having stuff split into files does not
help: the conflict management part has to be done anyway. You can't
just take the module you want to import, add it beside the other
existing XML files and hope nothing goes wrong.
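
To make the "merge and deal with conflicts" step concrete, here is a minimal sketch in plain Java. It does not use XStream and does not reflect any existing GeoServer class: the Catalog type, the name-to-store map and the merge() helper are all hypothetical, and the policy shown (keep the live entry, report the clash) is just one possible choice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ConfigMerge {
    /** Hypothetical in-memory catalog: feature type name -> data store id. */
    public static final class Catalog {
        public final Map<String, String> featureTypes = new LinkedHashMap<>();
    }

    /**
     * Merges an imported subset into the live catalog. Entries with a new
     * name are added. Entries whose name already exists with a different
     * definition are left untouched in the live catalog and their names are
     * returned, so the caller decides how to resolve them.
     */
    public static List<String> merge(Catalog live, Catalog imported) {
        List<String> conflicts = new ArrayList<>();
        for (Map.Entry<String, String> e : imported.featureTypes.entrySet()) {
            String existing = live.featureTypes.get(e.getKey());
            if (existing == null) {
                live.featureTypes.put(e.getKey(), e.getValue());
            } else if (!existing.equals(e.getValue())) {
                conflicts.add(e.getKey());
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        Catalog live = new Catalog();
        live.featureTypes.put("roads", "postgis_main");

        Catalog subset = new Catalog();
        subset.featureTypes.put("roads", "shapefile_dir");  // clashes with live
        subset.featureTypes.put("rivers", "shapefile_dir"); // new entry

        System.out.println("conflicts: " + merge(live, subset));
        System.out.println("merged:    " + live.featureTypes);
    }
}
```

The same check is needed whether the subset arrives as one file or as many, which is the point being made above.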

A monolithic document is good as long as we keep the configuration
model completely in memory, and can be generated fast enough with
persistence libraries such as XStream.

OK - but then you can't easily hand off to individual modules the responsibility to marshal and unmarshal configurations.

XStream is based on bean reflection. So I don't need any extra help from
modules, just nice compliant javabeans.
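
As a rough illustration of why reflection removes the need for per-module marshalling code, the sketch below dumps any JavaBean to an XML-like string using the JDK's own java.beans.Introspector. This is a stand-in, not XStream's API or implementation; DataStoreConfig and toXml() are hypothetical names, and the output is not escaped or otherwise production-grade XML.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;

public class BeanDump {
    /** Hypothetical configuration bean; not a real GeoServer class. */
    public static class DataStoreConfig {
        private String id;
        private boolean enabled;

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
        public boolean isEnabled() { return enabled; }
        public void setEnabled(boolean enabled) { this.enabled = enabled; }
    }

    /**
     * Writes every readable bean property as a child element. The bean
     * contributes no marshalling code of its own: getters are discovered
     * through introspection. Values are written with String.valueOf
     * semantics and are NOT XML-escaped (sketch only).
     */
    public static String toXml(Object bean) {
        try {
            // Stop at Object so the synthetic "class" property is skipped.
            BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
            String tag = bean.getClass().getSimpleName();
            StringBuilder sb = new StringBuilder();
            sb.append('<').append(tag).append('>');
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                Method getter = pd.getReadMethod();
                if (getter == null) {
                    continue; // write-only property, nothing to dump
                }
                Object value = getter.invoke(bean);
                sb.append('<').append(pd.getName()).append('>')
                  .append(value)
                  .append("</").append(pd.getName()).append('>');
            }
            sb.append("</").append(tag).append('>');
            return sb.toString();
        } catch (IntrospectionException | ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot dump " + bean.getClass(), e);
        }
    }

    public static void main(String[] args) {
        DataStoreConfig config = new DataStoreConfig();
        config.setId("roads_store");
        config.setEnabled(true);
        System.out.println(toXml(config));
    }
}
```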

Datastore connection manager (smart enough to handle pooling etc)

FeatureType install (from external catalog, or build your own wizard from a persistent object)

Service Profile - use provided machine readable service profile to set up service metadata and supported SRS defaults etc. (May also install a set of FeatureTypes - eg Transport server has Road, Junction, Route, Rail, etc)

Data object to FeatureType mapping wizard.

Well, when we decide to redo the UI (and it's a matter of "when", not a matter of "if"), your feedback and design proposal will be much appreciated. I'm tempted to ask about it right now, but I fear the UI
reimplementation delay would make it old when the time to use it comes...

If you are dealing with a model so big that in-memory handling
does not cut it, XML is a bad fit anyway; a DBMS should be used instead to get consistency guarantees and decent performance (even an in-process
one like H2 or Hypersonic, but using cached tables, not in-memory ones).

It's not the in-memory size in question - it's whether you can cope with all the interdependencies between modules encapsulated within a single object.

This is something you have to handle in the in-memory object model anyway. And yes, you have to do it by hand, nothing helps you there...
only a database with foreign keys, or an XML with id references, may prevent the configuration from becoming inconsistent.
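
For what "by hand" might look like, here is a small sketch of a referential check over an in-memory model, playing the role a foreign key would play in a database. The names (danglingReferences, the store id set, the feature-type-to-store map) are made up for the example and do not correspond to the actual GeoServer configuration classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReferenceCheck {
    /**
     * Returns one "featureType -> storeId" message for every feature type
     * that references a store id not present in storeIds. An empty list
     * means the model is consistent with respect to this one relationship.
     */
    public static List<String> danglingReferences(
            Set<String> storeIds, Map<String, String> featureTypeToStore) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> e : featureTypeToStore.entrySet()) {
            if (!storeIds.contains(e.getValue())) {
                problems.add(e.getKey() + " -> " + e.getValue());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Set<String> stores = Set.of("postgis_main");
        Map<String, String> types = Map.of(
                "roads", "postgis_main",
                "rivers", "missing_store"); // points at a store that does not exist
        System.out.println(danglingReferences(stores, types));
    }
}
```

A check like this would have to run on every load and every edit, and each relationship in the model needs its own.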

A wizard type approach that joins the artefacts into a loadable subset of configuration elements, and enforces the "foreign key" relationships, would be worthwhile, because it's hard to configure at the moment, primarily because all the sources of errors are swallowed. You are right, it needs improvement!

I don't see how a wizard (which to me is just a UI pattern) can deal
with foreign keys...

Cheers
Andrea