[Geoserver-devel] GSIP 106 - Managed File API

Thanks for waiting - the proposal is now ready:

http://geoserver.org/display/GEOS/GSIP+106+-+Managed+File+API

The proposal consists of a “ResourceStore” API for streaming based access to resources and a plan to transition the codebase.

I am quite happy with how the interfaces have turned out. You can evaluate for yourself here:
- https://github.com/boundlessgeo/geoserver/tree/resource_store

This branch includes:
a) A default FileSystemResourceStore implementation to try out the ResourceStore / Resource interfaces
b) Integration with GeoServerResourceLoader (this has some QA checks in place to assist during migration)

Kevin is working on a JDBCResourceStore implementation for the jdbc config extension (to confirm the API will work using database blobs).

Jody Garnett

Thanks, Jody, a fine piece of work. +1 from me.

Does this API support managed files whose names can only be determined from the content of other managed files? For example, app-schema datastore.xml files reference mapping files which themselves can refer to other mapping files. Would the Resource.list method allow the management of such files? I suspect this is similar situation to icons referenced by SLDs.

What is the workflow for migrating an existing app-schema configuration into a JDBC config? If we did have to manually edit a file, how should we get it out and back in again?

Kind regards,
Ben.

On 25/02/14 08:22, Jody Garnett wrote:

Thanks for waiting - the proposal is now ready:

http://geoserver.org/display/GEOS/GSIP+106+-+Managed+File+API

The proposal consists of a "ResourceStore" API for streaming based
access to resources and a plan to transition the codebase.

I am quite happy with how the interfaces have turned out. You can
evaluate for yourself here:
- https://github.com/boundlessgeo/geoserver/tree/resource_store

This branch includes:
a) A default FileSystemResourceStore
<https://github.com/boundlessgeo/geoserver/blob/resource_store/src/platform/src/main/java/org/geoserver/platform/resource/FileSystemResourceStore.java>
implementation to try out the ResourceStore
<https://github.com/boundlessgeo/geoserver/blob/resource_store/src/platform/src/main/java/org/geoserver/platform/resource/ResourceStore.java>
/ Resource
<https://github.com/boundlessgeo/geoserver/blob/resource_store/src/platform/src/main/java/org/geoserver/platform/resource/Resource.java>
interfaces
b) Integration with GeoServerResourceLoader
<https://github.com/boundlessgeo/geoserver/blob/resource_store/src/platform/src/main/java/org/geoserver/platform/GeoServerResourceLoader.java>
(this has some QA checks in place to assist during migration)

Kevin is working on a JDBCResourceStore implementation for the jdbc
config extension (to confirm the API will work using database blobs).
--
Jody Garnett

_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

--
Ben Caradoc-Davies <Ben.Caradoc-Davies@anonymised.com>
Software Engineer
CSIRO Earth Science and Resource Engineering
Australian Resources Research Centre

Thanks Ben, I am especially interested in your feedback as app-schema is one of the modules that relies heavily on configuration files. I am interested to ensure we can correctly handle this case.

Additional comments inline.

On Tue, Feb 25, 2014 at 3:55 PM, Ben Caradoc-Davies <Ben.Caradoc-Davies@anonymised.com> wrote:

Thanks, Jody, a fine piece of work. +1 from me.

Does this API support managed files whose names can only be determined from the content of other managed files? For example, app-schema datastore.xml files reference mapping files which themselves can refer to other mapping files. Would the Resource.list method allow the management of such files? I suspect this is similar situation to icons referenced by SLDs.

I have not decided on anything tricky/interesting for this case. In the SLD case referring to the “style” folder will be enough to ensure all the contents are unpacked. I imagine something similar will work for app-schema?

I put some thought into the SLD case and considered scanning the file for icon and file references (active case) or rewriting the URLs and registering a URL handler (passive case). It did not seem worth the complexity.

What is the workflow for migrating an existing app-schema configuration into a JDBC config? If we did have to manually edit a file, how should we get it out and back in again?

Currently JDBC Config “ingests” the various catalog xml files and punts them into the database. Something similar is intended for JDBCConfigResourceStore with two differences:

  • Need to scan the directory during “load” to see if any local files have changed (this covers the case for config files that are not available via the UI or via REST)
  • Unpack a modified resource into the data directory if file based access is needed.
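
As a rough illustration of the directory scan in the first bullet, the sketch below walks a data directory and reports files that are new or newer than the timestamp recorded when they were last ingested. It is not JDBC Config code: the class name `ChangedFileScanner` and the map of recorded timestamps keyed by relative path are assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

/** Hypothetical helper: finds files that are new or modified since they were last ingested. */
class ChangedFileScanner {

    /**
     * @param dataDir  root of the data directory to scan
     * @param ingested relative path (using '/') mapped to the last-modified millis recorded at ingest
     * @return sorted relative paths of files that are unknown or newer than recorded
     */
    static List<String> findChanged(Path dataDir, Map<String, Long> ingested) {
        List<String> changed = new ArrayList<>();
        try (Stream<Path> files = Files.walk(dataDir)) {
            files.filter(Files::isRegularFile).forEach(file -> {
                // Normalise separators so keys look the same on every platform
                String key = dataDir.relativize(file).toString().replace('\\', '/');
                long modified;
                try {
                    modified = Files.getLastModifiedTime(file).toMillis();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                Long recorded = ingested.get(key);
                if (recorded == null || modified > recorded) {
                    changed.add(key);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(changed);
        return changed;
    }
}
```

The paths returned by `findChanged` would be the candidates to re-ingest into the database during "load".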

We may also be able to deploy a file watcher, as is done for the template files; if so, we will need to add notification events of some sort.

Jody

On Tue, Feb 25, 2014 at 1:22 AM, Jody Garnett <jody.garnett@anonymised.com> wrote:

Thanks for waiting - the proposal is now ready:

http://geoserver.org/display/GEOS/GSIP+106+-+Managed+File+API

The proposal consists of a "ResourceStore" API for streaming based access
to resources and a plan to transition the codebase.

The proposal looks good, but I'd need a bit more time to look at the patch

Cheers
Andrea

--
== Our support, Your Success! Visit http://opensdi.geo-solutions.it for
more information ==

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it

-------------------------------------------------------

If it helps, Andrea, I would be happy to go over the branch with you on IRC or Skype. Just give me a shout.

Jody Garnett

On Tue, Feb 25, 2014 at 1:22 AM, Jody Garnett <jody.garnett@anonymised.com> wrote:

Thanks for waiting - the proposal is now ready:

http://geoserver.org/display/GEOS/GSIP+106+-+Managed+File+API

The proposal consists of a "ResourceStore" API for streaming based access
to resources and a plan to transition the codebase.

I am quite happy with how the interfaces have turned out. You can evaluate
for yourself here:
- https://github.com/boundlessgeo/geoserver/tree/resource_store

I did dig a bit deeper. Here are some observations.

Your question in the proposal, "Is it worth scanning SLD files to determine
icons used?": yes, I believe it is, but do not expect it to always work.
People can use dynamic symbolizers and embed CQL expressions in the paths.
In that case there might be a need for some way to tell GeoServer which
directories/files should be managed manually, so that a cluster using
JDBCConfig produces proper maps on all nodes:
http://blog.geoserver.org/2008/12/08/dynamic-symbolizers-part-1/

About Ian's feedback, "Consider implementing Atomic-File-Write (use file
lock, write to separate file and rename into place)": good idea, but mind
that while some NFS implementations handle locking well, others do not.
I'd suggest rolling a LockManager interface like GWC has, and allowing
other implementations that are not based on NIO locks to be plugged in
(e.g., a Hazelcast-based one).
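
The combination suggested here (a pluggable lock plus write-to-temp-then-rename) could look roughly like the sketch below. The `LockManager` interface, `InMemoryLockManager` and `AtomicFileWriter` are names invented for this example; they are not the GWC interface or anything in the branch. The rename uses the standard `Files.move` with `StandardCopyOption.ATOMIC_MOVE`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical pluggable lock abstraction (not the actual GWC interface). */
interface LockManager {
    /** Blocks until the named lock is held; the returned handle releases it. */
    Lock acquire(String key);

    interface Lock {
        void release();
    }
}

/** In-process implementation; a clustered one (e.g. Hazelcast-based) could be swapped in. */
class InMemoryLockManager implements LockManager {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    @Override
    public Lock acquire(String key) {
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        return lock::unlock;
    }
}

/** Hypothetical writer: holds the lock, writes a temporary sibling file, renames it into place. */
class AtomicFileWriter {
    private final LockManager lockManager;

    AtomicFileWriter(LockManager lockManager) {
        this.lockManager = lockManager;
    }

    void write(Path target, byte[] content) throws IOException {
        Path absolute = target.toAbsolutePath();
        LockManager.Lock lock = lockManager.acquire(absolute.toString());
        try {
            // Same directory as the target, so the rename stays on one file system
            Path tmp = Files.createTempFile(
                    absolute.getParent(), absolute.getFileName().toString(), ".tmp");
            try {
                Files.write(tmp, content);
                // Throws AtomicMoveNotSupportedException if the file system cannot do this;
                // whether an existing target is replaced is implementation specific.
                Files.move(tmp, absolute, StandardCopyOption.ATOMIC_MOVE);
            } finally {
                Files.deleteIfExists(tmp); // no-op after a successful move
            }
        } finally {
            lock.release();
        }
    }
}
```

Keeping the lock behind an interface means the file-writing code does not care whether the lock is a JVM lock, an NIO file lock, or something cluster-wide.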

About http://geoserver.org/display/GEOS/ResourceStore+Design, "solutions
not based on xstream" is not quite correct; it is more about solutions that
do not store their configuration in catalog and service objects.
For example, the ogr output format module uses xstream for persistence, but
in a custom file that's outside of the core loader classes.

About the GeoServerResourceLoader changes, is this enum an implementation
debugging aid, or are we going to have it long term?

    /** Mode used during transition to Resource use to verify functionality */
    private enum Compatibility {
        /** Supplied ResourceStore used for file access */
        RESOURCE,
        /** Use search locations to locate file */
        SEARCH,
        /** File and Resource Logic compared, exception if inconsistent. */
        STRICT
    };

Also, it seems GeoServerResourceLoader does not really have the
ResourceStore injected as the doc says; I assume that still needs to be
developed, right? (By the looks of it the branch is a proof of concept, but
I wanted to make sure.)

I haven't checked the rest of the code in detail, but overall it seems
good. +1 on the proposal.

Btw, you are going to create some support/replacement for the property file
watchers, yes? :)
Think about someone changing the control flow config on a node: your code
notices thanks to the resource API, the file gets saved in the database, and
then, as you said, the slave will get the file checked out automatically
when trying to access it. One thing needs to be maintained, though: the
watchers make sure we don't hit the file system too often, no more than once
a second, because it is a major drag if we do it 100 times a second or more.
So we'll need something that also avoids checking the database 100 times a
second (as that might be even slower than checking the last modified time on
the file system).
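
A minimal sketch of the throttling idea, assuming the last-modified lookup (file system or database) can be expressed as a function returning a timestamp. `ThrottledWatcher` and its constructor parameters are hypothetical names for this example and not the existing property file watcher; the clock is passed in only so the behaviour can be tested without sleeping.

```java
import java.util.function.LongSupplier;

/** Hypothetical watcher that polls an expensive last-modified source at most once per interval. */
class ThrottledWatcher {
    private final LongSupplier lastModifiedSource; // e.g. a file system or database lookup
    private final LongSupplier clock;              // current time in milliseconds
    private final long intervalMillis;

    private boolean polledOnce;
    private long lastPoll;
    private long lastSeenModified;

    ThrottledWatcher(LongSupplier lastModifiedSource, LongSupplier clock, long intervalMillis) {
        this.lastModifiedSource = lastModifiedSource;
        this.clock = clock;
        this.intervalMillis = intervalMillis;
    }

    /**
     * Returns true if the timestamp differs from the one seen at the previous poll.
     * Calls made within the interval return false without touching the source.
     * The first call only records a baseline.
     */
    synchronized boolean isModified() {
        long now = clock.getAsLong();
        if (polledOnce && now - lastPoll < intervalMillis) {
            return false; // too soon: skip the expensive lookup
        }
        lastPoll = now;
        long modified = lastModifiedSource.getAsLong();
        boolean changed = polledOnce && modified != lastSeenModified;
        polledOnce = true;
        lastSeenModified = modified;
        return changed;
    }
}
```

With an interval of 1000 ms, a request path that calls `isModified()` a hundred times a second triggers at most one lookup per second, whichever backend sits behind the lookup.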

Cheers
Andrea

-------------------------------------------------------

Thanks Andrea.

I will add an additional point to the discussion. In reviewing GeoServerDataDirectory I am seeing some regressions: test cases that depend on an unforeseen execution path in the original code.

Notes: http://geoserver.org/display/GEOS/Resource+API+Transition+Plan

Additional comments inline …

I was aware of dynamic symbolisers and had considered registering a URL handler to “trap” relative icon references. But I like the idea of modules advertising directories that need to be unpacked, previously I had just considered asking for a list of files.

Thanks for the suggestion, I will review GWC LockManager.

I have updated the text to refer to catalog and service objects.

I was finding it handy to be able to check what the original code would have found during the transition. However it is strictly a debugging aid during the transition. When I have successfully migrated the codebase to use Resource I will remove this crutch.
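
To illustrate how such a transition-time check might work, here is a sketch in which the two lookups are passed in as plain functions. Only the three mode names come from the enum quoted in the thread; the class `CompatibilityCheck`, the `find` method and its parameters are assumptions for this example, and the real GeoServerResourceLoader code may be organised quite differently.

```java
import java.io.File;
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical illustration of a transition-time consistency check. */
class CompatibilityCheck {
    enum Compatibility { RESOURCE, SEARCH, STRICT }

    static File find(
            String path,
            Compatibility mode,
            Function<String, File> resourceLookup, // stands in for the ResourceStore
            Function<String, File> searchLookup) { // stands in for the original search logic
        switch (mode) {
            case RESOURCE:
                return resourceLookup.apply(path);
            case SEARCH:
                return searchLookup.apply(path);
            case STRICT:
                // Run both code paths and fail loudly if they disagree
                File viaResource = resourceLookup.apply(path);
                File viaSearch = searchLookup.apply(path);
                if (!Objects.equals(viaResource, viaSearch)) {
                    throw new IllegalStateException(
                            "Inconsistent lookup for " + path + ": "
                                    + viaResource + " vs " + viaSearch);
                }
                return viaResource;
            default:
                throw new AssertionError(mode);
        }
    }
}
```

Running the test suite in the strict mode would surface any place where the new lookup and the original search logic disagree, which is the kind of regression described above.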

That is correct; I was just focused on API viability for now (since I wanted to see how much work was going to be required).

Thanks.

I understand.

On Thu, Feb 27, 2014 at 11:35 PM, Jody Garnett <jody.garnett@anonymised.com> wrote:

I was aware of dynamic symbolisers and had considered registering a URL
handler to "trap" relative icon references. But I like the idea of modules
advertising directories that need to be unpacked, previously I had just
considered asking for a list of files.

Mind, in this case it might be the "administrator" that has to advertise
the list.
For the case of icons we can probably get away by grabbing any non xml/sld
file contained in the styles directory,
or its sub-folders.
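
The "grab any non xml/sld file" rule could be sketched as below, using only the standard `java.nio.file` API. The class name `StyleAssets` and the decision to match on the lower-cased file extension are assumptions made for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: lists icon-like files under a styles directory. */
class StyleAssets {

    /** Returns every regular file under stylesDir (recursively) that is not .xml or .sld. */
    static List<Path> find(Path stylesDir) throws IOException {
        try (Stream<Path> files = Files.walk(stylesDir)) {
            return files.filter(Files::isRegularFile)
                    .filter(p -> {
                        String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
                        return !name.endsWith(".xml") && !name.endsWith(".sld");
                    })
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Files referenced through dynamic symbolizers that live outside the styles directory would still need the administrator-supplied list mentioned above.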

Cheers
Andrea

-------------------------------------------------------