On Wed, Jul 12, 2017 at 12:19 AM, Jonathan Meyer <jon@anonymised.com> wrote:
As background, I'm aware of various efforts / documentation on how to
coordinate GeoServer configuration between multiple instances:
http://docs.geoserver.org/latest/en/user/community/jms-cluster/index.html
https://boundlessgeo.com/2013/04/geoserver-in-a-clustered-configuration-part-1/
https://2016.foss4g-na.org/sites/default/files/slides/High%20Performance%20Geoserver%20Clusters_0.pdf
In developing a GeoServer package for Apache Mesos via DC/OS, I went down
a similar path to Derek Kern as identified in his 2016 FOSS4G-NA talk
(linked above) - mounted network storage to share GeoServer data
configuration across multiple machines.
The overall approach of the presentation is a bit... old? It represents
the "state of the art" as of 2010; as Jody remarked, other
avenues have been considered since then. To be fair, that approach is still
simple and viable if you have a master-driven configuration with a low
change rate (though the separate front-end GWC is something I have
not seen in a while), and with the speedups to
file system catalog loading in 2.11 it's viable even if you have a
large-ish number of layers.
At the same time, a shared data dir plus reload is the only approach that
you can take if you restrict yourself to supported modules
(both JMS clustering and jdbcconfig are community modules, and thus
unsupported).
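A minimal sketch of the "shared data dir plus reload" idea, assuming every
node exposes the GeoServer REST reload endpoint (/rest/reload) and that all
nodes mount the same data directory. The node URLs, the credentials and the
fingerprinting scheme are hypothetical choices for illustration, not
something the presentation prescribes:

```python
import hashlib
import os
import urllib.request
from base64 import b64encode


def data_dir_fingerprint(root):
    """Hash the relative path, size and mtime of every file under root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # keep the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            stat = os.stat(path)
            rel = os.path.relpath(path, root)
            digest.update(f"{rel}|{stat.st_size}|{stat.st_mtime_ns}\n".encode())
    return digest.hexdigest()


def build_reload_request(base_url, user, password):
    """Build (but do not send) a POST to the REST reload endpoint."""
    token = b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        base_url.rstrip("/") + "/rest/reload",
        data=b"",
        method="POST",
        headers={"Authorization": "Basic " + token},
    )


def reload_if_changed(root, last_fingerprint, nodes, user, password,
                      send=urllib.request.urlopen):
    """Ask every node to reload if the data dir changed; return the new fingerprint."""
    current = data_dir_fingerprint(root)
    if current != last_fingerprint:
        for base_url in nodes:
            send(build_reload_request(base_url, user, password))
    return current
```

The external coordination service would call reload_if_changed in a polling
loop, feeding the returned fingerprint back in on the next round. The send
parameter is only there so the HTTP call can be swapped out when trying the
logic without a running cluster.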
While this solution is functional, it requires consistently mounted data
across the cluster, as well as an external
coordination service to monitor the configuration directory and force
instances to reload from disk. My preferred approach would be to either directly
coordinate between GeoServers or use a cluster native coordination system
(such as Zookeeper) for configuration. I have considered looking into using
the GeoServer backup/restore plugin that was recently developed to push
configuration to all other GeoServer instances within a cluster.
The backup/restore module has been developed for "full/slow" backup/restore
operations, not for on-the-fly change notification.
Something based on ZooKeeper would be interesting. I'd also like to play
with/develop a distributed in-memory configuration based on
Hazelcast (or something similar) to see how it works. Nowadays the
jdbcconfig module takes a significant performance hit
due to the many queries it issues against the config db, slowing down
each OGC request (Niels showed interest in
improving that, but I haven't heard about it since).
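To illustrate why an in-memory configuration avoids the per-request
queries, here is a toy sketch. All class names are hypothetical, the
"store" is a plain dict standing in for the config db, and the change
notification is a direct method call rather than a real Hazelcast or
ZooKeeper channel:

```python
class SlowConfigStore:
    """Stand-in for the config database; counts the queries it serves."""

    def __init__(self, layers):
        self._layers = dict(layers)
        self.queries = 0

    def get(self, name):
        self.queries += 1
        return self._layers.get(name)

    def put(self, name, value):
        self._layers[name] = value


class InMemoryConfig:
    """Per-node read-through cache, invalidated by change notifications."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def get(self, name):
        if name not in self._cache:
            self._cache[name] = self._store.get(name)  # only the first read hits the store
        return self._cache[name]

    def on_change(self, name):
        self._cache.pop(name, None)  # next read goes back to the store


class Cluster:
    """Hypothetical broadcast: write to the store, then notify every node."""

    def __init__(self, store, node_count):
        self.store = store
        self.nodes = [InMemoryConfig(store) for _ in range(node_count)]

    def update(self, name, value):
        self.store.put(name, value)
        for node in self.nodes:
            node.on_change(name)
```

Repeated reads on a node cost a single store query until a change is
broadcast, which is roughly the behaviour one would hope to get from a
distributed in-memory map.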
Ideally, I'd like to see something easier to set up than JMS clustering,
with performance comparable to in-memory config storage,
and not requiring changes to the database when the configuration object
properties change, or when the queries towards the catalog change
(something that jdbcconfig nowadays requires, making it hard to upgrade
[1]).
That said, the configuration needs to be stored somewhere (to support a full
cluster restart at the very least). As Jody said,
there are indirections in the code nowadays to allow storage on something
other than a file system, and there is a community (unsupported)
module allowing storage on a relational database, to be used alongside
jdbcconfig.
Does anyone else have experience or opinions in this domain? I'm just
brainstorming and would love to discuss this in more detail.
Been playing with all options above, yep.
Regards,
Andrea Aime
[1]: This is one annoying issue in jdbcconfig imho.
Basically, jdbcconfig stores XML blobs and maps out interesting attributes
in a separate table for indexing searches.
So, if a new property pops up that needs to be searchable for whatever
reason, one has to go and change the
jdbcconfig mappings to map it out. Failing to do so will make jdbcconfig go
and de-serialize the XML blobs from the db
every time a search based on the unmapped property happens.
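The blob-plus-index layout can be sketched as follows. The table names, the
MAPPED set and the XML shape are hypothetical and much simpler than the real
jdbcconfig schema; the point is only the difference between a mapped and an
unmapped property:

```python
import sqlite3
import xml.etree.ElementTree as ET

MAPPED = {"name"}  # properties copied out of the blob into the index table


def open_store():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE object (id INTEGER PRIMARY KEY, blob TEXT)")
    db.execute("CREATE TABLE object_property (oid INTEGER, name TEXT, value TEXT)")
    db.execute("CREATE INDEX prop_idx ON object_property (name, value)")
    return db


def save(db, xml_blob):
    """Store the blob and index only the mapped properties."""
    oid = db.execute("INSERT INTO object (blob) VALUES (?)", (xml_blob,)).lastrowid
    for child in ET.fromstring(xml_blob):
        if child.tag in MAPPED:
            db.execute("INSERT INTO object_property VALUES (?, ?, ?)",
                       (oid, child.tag, child.text))
    return oid


def search(db, prop, value):
    """Return (matching ids, number of blobs parsed)."""
    if prop in MAPPED:
        # Fast path: answered from the index table, no blob is touched.
        rows = db.execute(
            "SELECT oid FROM object_property WHERE name = ? AND value = ? ORDER BY oid",
            (prop, value)).fetchall()
        return [row[0] for row in rows], 0
    # Slow path: every stored blob has to be de-serialized and inspected.
    ids, parsed = [], 0
    for oid, blob in db.execute("SELECT id, blob FROM object ORDER BY id"):
        parsed += 1
        if ET.fromstring(blob).findtext(prop) == value:
            ids.append(oid)
    return ids, parsed
```

Searching on "name" parses no blobs at all, while searching on an unmapped
property such as "srs" parses every stored object on every call.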
Another issue happens if the code querying the catalog starts issuing
queries against properties that
are already in the stored XML, but have not been mapped out to be indexed.
There is no tooling to add the mapping and extract the values from the XML
blobs. The only approach I've found is to re-import from a file system
based data dir... which is not possible once
you have been using jdbcconfig for a while and it has gotten out of sync
with the fs based data dir.
Hopefully one has used a DBMS with XML/XPath extraction support, so that
mass extraction queries can be set up
to re-align the db.
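As a rough illustration of such a re-alignment, the sketch below backfills
one newly mapped property from the stored blobs. It does the extraction
client-side in Python instead of with in-database XPath (SQLite has none),
and the two-table schema is a hypothetical simplification, not the real
jdbcconfig one:

```python
import sqlite3
import xml.etree.ElementTree as ET


def create_schema(db):
    db.execute("CREATE TABLE object (id INTEGER PRIMARY KEY, blob TEXT)")
    db.execute("CREATE TABLE object_property (oid INTEGER, name TEXT, value TEXT)")


def backfill_property(db, prop):
    """Extract prop from every stored blob into the index table; return rows added."""
    # Drop stale rows first so the backfill can be re-run safely.
    db.execute("DELETE FROM object_property WHERE name = ?", (prop,))
    rows = db.execute("SELECT id, blob FROM object").fetchall()
    added = 0
    for oid, blob in rows:
        value = ET.fromstring(blob).findtext(prop)
        if value is not None:
            db.execute("INSERT INTO object_property VALUES (?, ?, ?)",
                       (oid, prop, value))
            added += 1
    db.commit()
    return added
```

With a DBMS that supports XPath, the loop would collapse into a single
INSERT ... SELECT statement running entirely inside the database.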
==
Ing. Andrea Aime
@geowolf
Technical Lead
GeoSolutions S.A.S.
Via di Montramito 3/A
55054 Massarosa (LU)
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549
http://www.geo-solutions.it
http://twitter.com/geosolutions_it