R: [Geoserver-users] R: [Geoserver-devel] Ingestion Engine proposal

---------------------------------------------------------------------------
> I think the configuration part in GeoServer is not its best one,
> because it intermixes three concerns that should instead be
> separated:
> - the config data itself
> - the config data persistence
> - the servlet
>
> The configuration data, for me, is the MEMORY IMAGE that a given running
> instance of GeoServer has at any given instant in time, and that has
> nothing to do with how that information was generated, be it from an
> XML file, from a DB, etc.
> So each configuration mechanism will persist its own data in the form
> it likes best; there's no need to save and load it from a single source.
>
> What I'm saying is that the configuration data must be separated from
> its persistence.
> Now GeoServer uses several files and directories to store its
> configuration, and it's only able to load the configuration from those
> files.
This isn't true. It's also able to 'load' its configuration from the
web admin tool. We are actually completely independent of the xml
files; they are just the only way we have to _persist_. But the
structures are all there to support different ways of loading and
storing the configuration. This was a side effect of me being anal
about separation when building the web admin tool. All loading is done
through DTO objects. The xml loader creates DTO objects, as does the
web admin tool. If you want a datastore loader, you just have to make
it create the appropriate DTOs. You just have to write some code to
make the initial geoserver start up get its DTOs from some other
source. It shouldn't be that hard to code; you could probably just pass
the choice in as a web.xml param, default to xml, but also allow the
others. You may have to add one slight layer of abstraction, but the
core is there: loading the memory image of geoserver is completely
abstracted, it just needs to get into DTOs.
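
To make the "one slight layer of abstraction" concrete, here is a minimal sketch of a pluggable DTO source chosen by name, defaulting to xml. Every name in it (DataStoreDto, DtoSource, XmlDtoSource, DataStoreDtoSource, DtoSources) is hypothetical and invented for illustration; none of it is GeoServer's actual API, and the loaders return canned values instead of reading real files or stores.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a GeoServer DTO: plain data, no behaviour.
final class DataStoreDto {
    final String id;
    final Map<String, String> connectionParams;

    DataStoreDto(String id, Map<String, String> connectionParams) {
        this.id = id;
        this.connectionParams = connectionParams;
    }
}

// Hypothetical abstraction: anything able to produce DTOs at startup.
interface DtoSource {
    Map<String, DataStoreDto> load();
}

// Stands in for the existing XML loader; a real one would parse catalog.xml.
final class XmlDtoSource implements DtoSource {
    @Override
    public Map<String, DataStoreDto> load() {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("url", "file:data/roads.shp"); // illustrative value only
        Map<String, DataStoreDto> dtos = new LinkedHashMap<>();
        dtos.put("roads", new DataStoreDto("roads", params));
        return dtos;
    }
}

// Stands in for an alternative loader, e.g. one reading from a DataStore.
final class DataStoreDtoSource implements DtoSource {
    @Override
    public Map<String, DataStoreDto> load() {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("dbtype", "postgis"); // illustrative value only
        Map<String, DataStoreDto> dtos = new LinkedHashMap<>();
        dtos.put("rivers", new DataStoreDto("rivers", params));
        return dtos;
    }
}

final class DtoSources {
    // A missing name falls back to "xml", mirroring "default to xml".
    static DtoSource forName(String name) {
        if (name == null || name.equals("xml")) {
            return new XmlDtoSource();
        }
        if (name.equals("datastore")) {
            return new DataStoreDtoSource();
        }
        throw new IllegalArgumentException("Unknown DTO source: " + name);
    }
}

final class DtoSourceDemo {
    public static void main(String[] args) {
        // In a servlet this value would come from a web.xml init parameter.
        String configured = args.length > 0 ? args[0] : null;
        Map<String, DataStoreDto> memoryImage = DtoSources.forName(configured).load();
        System.out.println("Loaded data stores: " + memoryImage.keySet());
    }
}
```

The point of the sketch is only that the startup code depends on the interface, so a new loader can be added without touching the code that installs the DTOs into the memory image.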

I should have pointed it out more precisely. This is what I actually did.
We have a mechanism that creates DTOs using our Meta engine and uses those
to replace the memory image that GeoServer loaded from its config files.
I didn't even try to mix the two, because we didn't need to,
but maybe it wouldn't be that hard...
But we're not persisting DTOs, we're building them on-the-fly
using our own info.

> Instead the catalog.xml and all the others should be only one of the
> many different persistence forms for GeoServer's configuration data.
> That is, the catalog.xml file and the pieces of Java code able to
> manage it should be no different from the ingestion engine or from
> whatever other mechanism one can invent.
Agreed. But I think we already have a decent start on this structure.
Granted it could use a nice rewrite, but I think you can take a smaller
incremental step with the ingestion engine - it just writes out DTOs
and says 'load'. That can then be adjusted with the web admin tool.
And if you want you can use the current structure and write a
different persistence mechanism.

Yes, maybe I didn't like the idea of persisting the DTOs and then reloading
them, but this is because I have another means of persisting the information
needed to create DTOs; otherwise I'd probably do the same.

>
> The memory image of the configuration data is kept by GeoServer inside
> the ServletContext, so that it's globally accessible from any servlet
> running inside a given GeoServer instance and from any HTTP
> request/response being serviced by that servlet. So you can modify that
> memory image on-the-fly using a servlet filter, without any need to
> store it using the same persistence mechanism that GeoServer uses now.
Or you can pass it a DTO object. I admit that DTOs may be one
level of abstraction too far, and one could use the memory image directly,
but the structure is there to do what you want.

Yes sure, I modify the memory image using DTOs, not directly,
I just bypassed the persistence of DTOs.

>
> Anyway, if one preferred to persist all the config info inside the
> same store, I wouldn't use files and directories for that, I'd use
> DataStores.
> GeoServer uses GeoTools to access data, and GeoTools uses DataStores
> for that purpose. Configuration data is no different from other data,
> so I believe the right place to store it is inside DataStores.
> There's a GMLDataStore (although it's read-only at the moment), so
> one can use it to store config info inside XML files if one likes.
Cool, you can code this up and we can all try it out. Just have it
create and save DTOs. If it ends up better and more flexible, we can
adopt it. This is what Dave was getting at with a 'new' web admin
tool, we can still make use of the old one. And you can do the same
with persisting to data stores. And if it fully does the xml files
right, then we can switch it all over.

If only I could find the time...
Anyway I want to point out that there's no need to write out exactly
GeoServer's DTOs. You have to write out all the information you need
to create a DTO, but you can do that in whatever format you like.
Then you have to be able to create a DTO from your own info and send
that DTO to the memory image of the running GeoServer. This is what I did.
The DataStore connection params are inside JBoss's XML .SAR files,
and the metadata about FeatureTypes comes from MetaStores, and together
they're sufficient to build GeoServer DTOs, so we have no need to persist
them using the GeoServer config files.
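
As a rough illustration of that flow (build a DTO from your own info, then hand it to the running instance's memory image), here is a minimal sketch. FeatureTypeDto, MemoryImage and OnTheFlyDtoBuilder are hypothetical names made up for this example, not GeoServer classes, and the metadata keys "name" and "title" are assumptions; a real setup would read them from a MetaStore and take the store ID from the deployment descriptor.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical DTO for one feature type; not GeoServer's real class.
final class FeatureTypeDto {
    final String name;
    final String dataStoreId;
    final String title;

    FeatureTypeDto(String name, String dataStoreId, String title) {
        this.name = name;
        this.dataStoreId = dataStoreId;
        this.title = title;
    }
}

// Hypothetical stand-in for the configuration memory image.
final class MemoryImage {
    private final Map<String, FeatureTypeDto> featureTypes = new ConcurrentHashMap<>();

    // Adds the feature type, or replaces an existing one with the same name.
    void apply(FeatureTypeDto dto) {
        featureTypes.put(dto.name, dto);
    }

    FeatureTypeDto get(String name) {
        return featureTypes.get(name);
    }
}

final class OnTheFlyDtoBuilder {
    // Combines a data store ID (from connection config) with a metadata
    // record (from a MetaStore) into a DTO; nothing is written to disk.
    static FeatureTypeDto build(String dataStoreId, Map<String, String> metadata) {
        String name = metadata.get("name");
        if (name == null) {
            throw new IllegalArgumentException("metadata must contain a 'name' entry");
        }
        // Assumed fallback for this sketch: use the name when no title is given.
        String title = metadata.containsKey("title") ? metadata.get("title") : name;
        return new FeatureTypeDto(name, dataStoreId, title);
    }
}

final class OnTheFlyDemo {
    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("name", "roads");
        metadata.put("title", "Road network");

        MemoryImage image = new MemoryImage();
        image.apply(OnTheFlyDtoBuilder.build("postgis_main", metadata));
        System.out.println(image.get("roads").title);
    }
}
```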

>
> I think that the main concern is not how one persists the config info,
> but the config info itself, that is the Catalog.
> Now GeoServer uses FeatureTypeInfo and related classes to construct
> its memory image of the configuration. Are those classes good enough
> to support coverages too? Are they flexible/extensible enough
> to support future services? Are they "compatible" with classes
> used by other systems (like uDig for example)?
> I feel this is what we should think about; the means by which these
> catalog classes are persisted should be irrelevant.
>
> One distinction must be made here. I think that there are two
> different levels of config info. One is about DataStores, that is how
> to connect to each source of data. The other is about FeatureTypes,
> that is which data is available, how it is structured, how it must be
> validated, who has the rights to read or modify it, etc.
> DataStore config info can be in simple XML files or whatever, because
> you only have to say how to connect to that source of data, so the
> connection params are basically all that's needed.
> FeatureType config info is a different story, or rather it is not
> config info at all: it is metadata and it belongs to the Catalog.
> In our vision, and in what we have implemented so far, to add new
> data to GeoServer you simply have to add an XML file with the
> connection params of the DataStore. The metadata for each FeatureType
> contained in that DataStore is read from the DataStore itself. That
> is, for each DataStore you want to add to GeoServer you MUST also add
> what we call a MetaStore, that is a persistent form of the metadata
> for the FeatureTypes contained in the DataStore itself.
> Together with the MetaStore you MUST specify a Loader able to load
> that persisted metadata into a memory image, that is a Catalog. We
> have a few Loaders implemented: one is able to load metadata from
> FeatureSources (aka DB tables) with a specific structure, another one
> is able to infer a minimum set of metadata from the DataStore itself,
> so that you can also add a DataStore for which you don't actually
> have a proper MetaStore.
> So we have this catalog in memory and we can use it to configure
> GeoServer on-the-fly using a servlet filter (actually it is a Tomcat
> Valve) that directly "talks" to GeoServer's catalog, building DTOs on
> the fly.
>
> But there's more to it...
>
> We're also more and more convinced that some of the things GeoServer
> is able to do now should be moved to GeoTools. Validation and
> Transactionality should be in GeoTools, and probably even the
> GetFeature operation should be in GeoTools.
> The central point of it all is basically being able to operate
> (querying, reading, writing, validating, etc.) against a set of
> DataStores instead of against each one separately, and this
> capability should be in GeoTools, not in GeoServer.
I agree with all of this. And this is easily one of the primary goals
of the geoserver 2.0 rewrite. Get the catalog and meta information in
geotools.

Yes, sure, this is a long term goal, and we all have more immediate needs,
so I won't force anyone, myself included, to wait...

>
> If things were like that, much of the config info now used by
> GeoServer, the part regarding FeatureType metadata, would go into the
> GeoTools catalog configuration.
>
> GeoServer will then have only the config info relevant to each OGC
> service it exposes (WFS, WMS, WCS, etc.) and it will only need
> references to the metadata configured inside the GeoTools catalog.
>
> So each service plugged into GeoServer will have its own configuration
> mechanism for its own configuration info. GeoServer will only need a
> system to configure the plugins; data will instead be configured
> inside the GeoTools catalog.
And that's basically the other big goal, geoserver as the plug-in
machine.

Another distant goal...

>
> ---------------------------------------------------------------------------
>
> ...that was a very long dissertation, I'm sorry...
> And implementing it has a much bigger impact than what you're
> proposing, so it may take a while to do (even if we have already
> implemented a certain part of it).
> Also at this very moment we have other more urgent matters to look
> after, so I'm afraid I won't be able to work heavily on this for a
> while. I'm very sorry about this, because I'd like to see others using
> what we've done so far; I'd have to find time to make it general
> enough and to publish it...
Yeah, I think we're pretty much agreed on these things as long term
goals, what I'm calling geoserver 2.0. For the immediate future I think
just focusing on DTOs and the current mechanisms we have in place will
make for a decent ingestion engine, and we can take the lessons from
there and apply them to the next config design, since with 2.0 we can
even redo the format of the config files if we'd like. I'd always like
the XML option, and the datastore persistence idea is interesting -
being able to persist to any datastore would be nice. We'll have to play
with it and see.

I hope to have clarified what I was saying about DTOs, persistence, etc.
And I hope to be able soon enough to publish a working demo, or something...

best regards,

Chris

   Bye Paolo

>
> Bye
> Paolo Rizzi
>
>
>
> > -----Original Message-----
> > From: Alessio Fabiani [mailto:alessio.fabiani@anonymised.com]
> > Sent: Tuesday, 19 July 2005 11:30
> > To: geoserver-devel@lists.sourceforge.net;
> > geoserver-users@lists.sourceforge.net
> > Subject: [Geoserver-devel] Ingestion Engine proposal
> >
> >
> > Hi all,
> > I will explain in this email our proposal for a GeoServer
> > Ingestion Engine.
> >
> > The Ingestion Engine we would like to implement for GeoServer should
> > be configured as a PlugIn that an Administrator can plug into
> > GeoServer and use as an alternative to the web interface to manage
> > the configuration files, i.e. the "catalog.xml", which is where the
> > NameSpaces, DataStores and CoverageFormats parameters are stored, and
> > the different "info.xml" files associated with each GeoServer feature
> > and coverage, which are where all the information relating to the
> > FeatureType or GridCoverage is stored.
> >
> > In order to achieve this objective, we do not want to modify the
> > current GeoServer configuration concept. At the moment, every time
> > an Administrator wants to add a new FeatureType or Coverage to
> > GeoServer he has to follow several steps:
> >
> > Step 1: Defining the parameters and the ID for a new DataStore or
> > Format. In the new release of the GeoServer-WCS experiment we have
> > renamed DataStore to FeatureStore and Format to CoverageStore because
> > they are theoretically the same thing for Vector and Gridded data
> > respectively. GeoServer stores all this information in the
> > catalog.xml.
> >
> > Step 2: Creating a new FeatureType or GridCoverage starting from the
> > Store created in Step 1. GeoServer creates a new directory named
> > after the Store ID and the Feature/Coverage name and stores inside it
> > an info.xml file containing all the metadata associated with the
> > latter.
> >
> > Notice that GeoServer makes the configuration files persistent only
> > after the Administrator performs a Save action by clicking the
> > associated button.
> >
> > The Ingestion Engine we have in mind should be able to perform
> > Step 1, Step 2 and Save configuration automatically.
> >
> > We have two main objectives to achieve:
> >
> > 1. Building something that can be plugged into and unplugged from
> > GeoServer
> > 2. Building something that allows GeoServer to automatically modify
> > the configuration by performing the above steps, without removing
> > the current GeoServer configuration management system
> >
> > To achieve the first objective we are thinking of building a Servlet
> > with its own classes that the administrator can add/remove, configure
> > and enable/disable through the GeoServer web.xml. This Servlet will
> > work on a time-based schedule by simply checking the file system
> > structure for changes.
> > To achieve the second objective the Servlet will simply automate the
> > Administrator's steps for each change.
> >
> > How the servlet works:
> > First of all we do not want to force users to maintain a predefined
> > file system structure. We are thinking of a system that mainly leaves
> > unaltered the file system structure manually created by the
> > Administrator using the web interface, but creates and maintains its
> > own structure, compatible with the first one, for the automatically
> > managed subdirectories.
> >
> > Suppose that the Administrator wants to create a new subdirectory
> > automatically managed by the GeoServer Ingestion Engine for a set of
> > files belonging to a particular Store.
> > What he has to do is create this subdirectory and place inside it a
> > particular xml file which describes the Store type, the common
> > parameters and the metadata that the Ingestion Engine will use. An
> > external tool, which we want to create too, can be used to create
> > this configuration file. The Ingestion Servlet will scan the
> > directory and every time it encounters a new compatible file it will
> > create a new subdirectory, where this file (and all related ones)
> > will be moved and the corresponding info.xml will be created. For
> > FeatureStores like postgis those files can be just xml files
> > containing the parameters, named like the final FeatureType. If the
> > Administrator deletes one or more of those subdirectories, the
> > Ingestion Servlet will remove the Features/Coverages (and the
> > associated Store) from the GeoServer configuration. Notice that the
> > Administrator can also remove those features/coverages manually by
> > using the web interface.
> >
> > Attached there are two images that show how the Ingestion Engine
> > should work on an "auto-managed" subdirectory.
> >
> > Moreover notice that by adding a little more metadata we can even
> > handle WMS nested layers. We don't need to reflect the exact File
> > System tree structure; we can even build a virtual WMS layer tree
> > structure by handling some metadata. I will explain in detail in the
> > next email.
> >
>
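
The scheduled scan described in the proposal above boils down to comparing two snapshots of the auto-managed directory and acting on the difference. A minimal sketch of that comparison follows; DirectoryDiff is a hypothetical helper invented here, the snapshots are plain sets of names rather than real file system listings, and what "register" and "remove" actually do to the GeoServer configuration is left out.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: the difference between two scans of a directory.
final class DirectoryDiff {
    final Set<String> added;   // entries seen now but not in the last scan
    final Set<String> removed; // entries seen in the last scan but not now

    private DirectoryDiff(Set<String> added, Set<String> removed) {
        this.added = added;
        this.removed = removed;
    }

    static DirectoryDiff between(Set<String> previous, Set<String> current) {
        Set<String> added = new TreeSet<>(current);
        added.removeAll(previous);
        Set<String> removed = new TreeSet<>(previous);
        removed.removeAll(current);
        return new DirectoryDiff(added, removed);
    }
}

final class IngestionScanDemo {
    public static void main(String[] args) {
        // Illustrative snapshots; a real servlet would list the directory
        // on each scheduled run and keep the previous listing in memory.
        Set<String> lastScan = new TreeSet<>(Arrays.asList("roads", "rivers"));
        Set<String> thisScan = new TreeSet<>(Arrays.asList("rivers", "lakes"));

        DirectoryDiff diff = DirectoryDiff.between(lastScan, thisScan);
        System.out.println("to register: " + diff.added);
        System.out.println("to remove:   " + diff.removed);
    }
}
```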
>


P. Rizzi Ag.Mobilità Ambiente wrote:

> I think the configuration part in GeoServer is not its best one,
> because it intermixes three concerns that should instead be
> separated:
> - the config data itself
> - the config data persistence
> - the servlet

If you read the design docs you will see that there is a strong separation between the three:
1) The Running Servlets can be configured programmatically
2) Persistence of configuration is handled separately
3) Web user interface is also completely separate.

Between all these bits we have DataTransferObjects, which are used to represent a configuration, allowing the parts to stay separate.

Docs:
- http://vwfs.refractions.net/docs/WebConfigImplementationReport.pdf
- http://vwfs.refractions.net/docs/GeoserverConfigDesign.pdf
- http://vwfs.refractions.net/docs/Final%20Report.pdf

Cheers,
Jody