[Geoserver-devel] clustering / db config

Hello,

In order to facilitate clustering, we are planning to get rid of ALL remaining file-based configurations in geoserver and move them to a db, as was already done for the catalog with jdbcconfig. It is not entirely clear yet what "all" entails, but I'm doing some research and investigating different possible approaches, and will then come up with a realistic scope.

Currently we have the interface ResourceStore which forms the basis for retrieving configurations and data from the data directory. This interface assumes a directory structure. I think the least intrusive approach would be to emulate a directory structure in a relational database, with the files as blobs, and then create a different implementation of ResourceStore to access it. I'll call this plan (A).
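To make plan (A) a bit more concrete, here is a minimal sketch of emulating a directory tree on top of a flat path-to-blob table. Everything in it is an assumption made for illustration: `PathStore` is a hypothetical, cut-down interface and not GeoServer's actual ResourceStore API, the table layout in the comment is invented, and a sorted in-memory map stands in for the database so the example runs without a JDBC driver.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical, cut-down store interface; NOT GeoServer's ResourceStore. */
interface PathStore {
    InputStream read(String path) throws IOException;

    OutputStream write(String path);

    List<String> list(String dir);
}

/**
 * Emulates directories on top of a flat (path -> blob) table.
 * The TreeMap stands in for an assumed table such as
 *   resources(path VARCHAR PRIMARY KEY, content BLOB)
 */
class BlobPathStore implements PathStore {
    private final TreeMap<String, byte[]> table = new TreeMap<>();

    @Override
    public InputStream read(String path) throws IOException {
        byte[] blob = table.get(path);
        if (blob == null) {
            throw new FileNotFoundException(path);
        }
        return new ByteArrayInputStream(blob);
    }

    @Override
    public OutputStream write(String path) {
        // Buffer the bytes and store them as a single blob on close().
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                super.close();
                table.put(path, toByteArray());
            }
        };
    }

    @Override
    public List<String> list(String dir) {
        // A "directory" is just a shared path prefix; its direct children are
        // the distinct first segments that follow that prefix.
        String prefix = dir.isEmpty() ? "" : dir + "/";
        List<String> children = new ArrayList<>();
        for (String path : table.tailMap(prefix, true).keySet()) {
            if (!path.startsWith(prefix)) {
                break; // keys are sorted, so the prefix range has ended
            }
            String rest = path.substring(prefix.length());
            int slash = rest.indexOf('/');
            String child = slash < 0 ? rest : rest.substring(0, slash);
            if (!child.isEmpty() && !children.contains(child)) {
                children.add(child);
            }
        }
        return children;
    }
}

class BlobStoreDemo {
    public static void main(String[] args) throws IOException {
        BlobPathStore store = new BlobPathStore();
        // The paths below are made-up examples.
        try (OutputStream out = store.write("styles/roads.sld")) {
            out.write("<sld/>".getBytes(StandardCharsets.UTF_8));
        }
        try (OutputStream out = store.write("security/config.xml")) {
            out.write("<config/>".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println(store.list(""));
        System.out.println(store.list("styles"));
    }
}
```

In a real database-backed version the prefix scan would presumably become a `LIKE 'dir/%'` query or a parent-id column, and locking and change notification would need separate thought.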

Alternatively, we could create a whole new configuration API which allows a more structural and efficient usage of the relational database to store and retrieve configuration settings. I'll call this plan (B). This plan just seems way too big of a change to me.

Issues with plan (A):
* Although the Resource interface offers an inputstream and outputstream through its API, most code does not use these functions, but rather accesses the file system directly with FileInputStream and FileOutputStream. The ResourceLoader is often used for only one thing: getting paths from the data directory. This is the case for core code such as the security system, GWC, and extensions such as CSW and XSLT (and probably a lot more than I have found so far). Still, these modifications would be far fewer than would be needed in plan (B).

* The data directory is not only used for settings but also for temporary files (for example WPS). It would perhaps be strange/undesirable/problematic if the database was used for this.

* What about configurations read from geotools? For example, the application schema extension is implemented in geotools, and the location of the app-schema specific config files is specified as a path from the store configuration. If there was a solution for that, perhaps an interesting side effect would be that even the data itself could be stored this way. For example, one could upload a shapefile through rest which would then be stored in the db.
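The first issue above (code bypassing the stream API and opening files directly) can be illustrated with a small sketch. It assumes a hypothetical `StreamResource` interface with `in()` and `out()` methods, which only mirrors the idea of stream access and is not the actual GeoServer Resource interface; `MemoryResource` and `FileResource` are likewise invented for the example. The point is that code written against streams does not care whether the bytes come from a file or from a blob.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical minimal resource abstraction offering only stream access. */
interface StreamResource {
    InputStream in() throws IOException;

    OutputStream out() throws IOException;
}

/** Configuration code written against streams instead of java.io.File. */
class PropertiesIO {
    // The file-bound style this replaces would look like:
    //   new FileInputStream(new File(dataDir, "some/config.properties"))
    static Properties load(StreamResource res) throws IOException {
        Properties props = new Properties();
        try (InputStream in = res.in()) {
            props.load(in);
        }
        return props;
    }

    static void save(StreamResource res, Properties props) throws IOException {
        try (OutputStream out = res.out()) {
            props.store(out, null);
        }
    }
}

/** In-memory resource standing in for a database blob. */
class MemoryResource implements StreamResource {
    private byte[] content = new byte[0];

    @Override
    public InputStream in() {
        return new ByteArrayInputStream(content);
    }

    @Override
    public OutputStream out() {
        // Replace the stored bytes once the caller closes the stream.
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                super.close();
                content = toByteArray();
            }
        };
    }
}

/** File-backed resource, for comparison with the in-memory one. */
class FileResource implements StreamResource {
    private final Path file;

    FileResource(Path file) {
        this.file = file;
    }

    @Override
    public InputStream in() throws IOException {
        return Files.newInputStream(file);
    }

    @Override
    public OutputStream out() throws IOException {
        return Files.newOutputStream(file);
    }
}

class StreamAccessDemo {
    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.setProperty("enabled", "true");

        // The same load/save code works for both backends.
        StreamResource blob = new MemoryResource();
        PropertiesIO.save(blob, props);
        System.out.println(PropertiesIO.load(blob).getProperty("enabled"));

        Path tmp = Files.createTempFile("demo", ".properties");
        StreamResource file = new FileResource(tmp);
        PropertiesIO.save(file, props);
        System.out.println(PropertiesIO.load(file).getProperty("enabled"));
        Files.delete(tmp);
    }
}
```

Code that genuinely needs a `java.io.File` (for example to hand a path to a third-party library) would not fit this pattern and would need a different answer, such as copying the blob to a local cache file.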

Either way this seems like a huge change that will need a lot of discussion. Perhaps some people have a completely different and better idea. Any suggestions, opinions ... ?

Regards
Niels

Yep, that is the design; see the original proposal and talk to Kevin, who started a JDBCResourceStore that he should be able to share with you.

On a related note, we can still hunt down some file use and change it to use inputstream for direct blob access.

On Wed, Jul 8, 2015 at 12:56 PM Niels Charlier <niels@anonymised.com> wrote:



Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel


Jody Garnett

Ah okay, I didn't know a JDBCResourceStore was already in development! I was only aware of the API in place that allows it.

I will need to talk to @Kevin about this then.

Cheers
Niels

On 09-07-15 03:05, Jody Garnett wrote:
