[Geoserver-devel] GeoServer clustering: Hitting "Reload" on the master raises exceptions on the slaves

Folks,

I have configured GeoServer Active Clustering in a master/slave configuration with embedded brokers.

The GeoServer version is 2.10.2, with Clustering and CSW extensions added. (The clustering extension was downloaded from:
http://ares.boundlessgeo.com/geoserver/2.10.x/community-latest/geoserver-2.10-SNAPSHOT-jms-cluster-plugin.zip.)

Changing the catalog on the master causes the updates to be mirrored faithfully on the slaves; however, when the "Reload" button is hit on the master, this exception is raised on the slaves:

[2017-03-15 00:20:16.631Z] WARN [client.JMSContainer] - Setup of JMS message listener invoker failed for destination 'queue://Consumer.b41808d10cec.VirtualTopic.>' - trying to recover. Cause: java.lang.IllegalArgumentException: org.geoserver.wfs.WFSInfoImpl is not an interface
[2017-03-15 00:20:16.679Z] ERROR [geoserver.cluster] - class org.geoserver.cluster.impl.handlers.configuration.JMSServiceHandler is unable to synchronize the incoming event: org.geoserver.cluster.impl.events.configuration.JMSServiceModifyEvent@anonymised.com
[2017-03-15 00:20:16.680Z] WARN [client.JMSContainer] - Execution of JMS message listener failed, and no ErrorHandler has been set.
javax.jms.JMSException: java.lang.IllegalArgumentException: org.geoserver.csw.CSWInfoImpl is not an interface
         at org.geoserver.cluster.client.JMSQueueListener.onMessage(JMSQueueListener.java:149)
         at org.springframework.jms.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:721)
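
For context, the "is not an interface" wording matches the message that the JDK's java.lang.reflect.Proxy produces when it is asked to build a dynamic proxy for a concrete class rather than an interface. Whether that is what actually happens inside the clustering handlers is an assumption; the log above does not confirm it. A minimal sketch, in which ServiceInfo and ServiceInfoImpl are hypothetical stand-ins and not the GeoServer classes:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyNotAnInterfaceDemo {

    // Hypothetical stand-ins, NOT the real GeoServer types.
    public interface ServiceInfo {
        String getName();
    }

    public static class ServiceInfoImpl implements ServiceInfo {
        @Override
        public String getName() {
            return "wfs";
        }
    }

    /**
     * Tries to create a dynamic proxy for the given type. Returns null when
     * the proxy could be created, otherwise the exception message.
     */
    public static String tryProxy(Class<?> type) {
        // A do-nothing handler is enough: no proxy method is ever invoked here.
        InvocationHandler handler = (proxy, method, methodArgs) -> null;
        try {
            Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Proxying the interface works; proxying the implementation class is rejected.
        System.out.println("interface:      " + tryProxy(ServiceInfo.class));
        System.out.println("implementation: " + tryProxy(ServiceInfoImpl.class));
    }
}
```

If the same mechanism is at play on the slaves, it would suggest that the event carries the implementation class where an interface is expected, but that remains a guess.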

Is this a bug (possibly one that has already been fixed), or am I doing something wrong?

Cheers,

Luca Morandini
Data Architect - AURIN project
Melbourne eResearch Group
Department of Computing and Information Systems
Room 3.08. Level 3, Doug McDonell Building, Parkville Campus
University of Melbourne
Tel. +61 03 903 58 380
Skype: lmorandini
LinkedIn: https://www.linkedin.com/in/lmorandini

On Wed, Mar 15, 2017 at 5:50 AM, Luca Morandini <lmorandini@anonymised.com> wrote:

> Changing the catalog on the master causes the updates to be mirrored faithfully on
> the slaves; however, when the "Reload" button is hit on the master, this exception
> is raised on the slaves:

I don't know about the specific error, but the documentation clearly
indicates that a reload should never be used along with the JMS clustering
extension:

http://docs.geoserver.org/latest/en/user/community/jms-cluster/index.html#things-to-avoid

I'm a bit surprised you did not end up ruining the slave's configuration...

Cheers
Andrea

--

GeoServer Professional Services from the experts! Visit
http://goo.gl/it488V for more information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via di Montramito 3/A
55054 Massarosa (LU)
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it

*NOTICE PURSUANT TO D.Lgs. 196/2003*

The information contained in this e-mail message and/or in the attached
file(s) is to be considered strictly confidential. Its use is permitted
exclusively to the addressee of the message, for the purposes indicated in
the message itself. Should you receive this message without being the
addressee, we kindly ask you to notify us by e-mail and to destroy the
message by deleting it from your system. Keeping the message, disclosing it
even in part, distributing it to other parties, copying it, or using it for
other purposes constitutes conduct contrary to the principles laid down by
D.Lgs. 196/2003.

The information in this message and/or attachments, is intended solely for
the attention and use of the named addressee(s) and may be confidential or
proprietary in nature or covered by the provisions of privacy act
(Legislative Decree June, 30 2003, no.196 - Italy's New Data Protection
Code). Any use not in accord with its purpose, any disclosure, reproduction,
copying, distribution, or either dissemination, either whole or partial, is
strictly forbidden except previous formal approval of the named
addressee(s). If you are not the intended recipient, please contact
immediately the sender by telephone, fax or e-mail and delete the
information in this message that has been received in error. The sender
does not give any warranty or accept liability as the content, accuracy or
completeness of sent messages and accepts no responsibility for changes
made after they were sent or for other risks which arise as a result of
e-mail transmission, viruses, etc.

-------------------------------------------------------

On 15/03/17 18:27, Andrea Aime wrote:

> On Wed, Mar 15, 2017 at 5:50 AM, Luca Morandini <lmorandini@anonymised.com> wrote:
>
>> Changing the catalog on the master causes the updates to be mirrored faithfully on
>> the slaves; however, when the "Reload" button is hit on the master, this exception
>> is raised on the slaves:
>
> I don't know about the specific error, but the documentation clearly indicates
> that a reload should never be used along with the JMS clustering extension:

Well, it says that hitting "Reload" triggers a complete rebuild of all resources on the slaves, which I assumed would be fine when adding empty instances to the cluster.

> I'm a bit surprised you did not end up ruining the slave's configuration...

It was empty (separate data dirs), not much to ruin there.

So, when adding a new instance, shall I stop accepting catalog changes, copy the data dir to the new instance, connect the new (slave) instance to the master, and then restart accepting catalog changes?
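
The copy step of that procedure could be sketched as follows. This is only an illustration under stated assumptions: both directory paths are hypothetical and are passed on the command line, and catalog changes are assumed to have been paused on the master by some other means for the duration of the copy.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

public class DataDirCopy {

    /** Recursively copies source into target, overwriting files that already exist. */
    public static void copyTree(Path source, Path target) throws IOException {
        // Files.walk visits a directory before its entries, so every parent
        // directory exists in the target by the time its files are copied.
        try (Stream<Path> paths = Files.walk(source)) {
            paths.forEach(p -> {
                Path dest = target.resolve(source.relativize(p).toString());
                try {
                    if (Files.isDirectory(p)) {
                        Files.createDirectories(dest);
                    } else {
                        Files.copy(p, dest, StandardCopyOption.REPLACE_EXISTING);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: DataDirCopy <master data dir> <new slave data dir>");
            return;
        }
        // Assumption: nothing writes to the master data dir while this runs.
        copyTree(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```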

Cheers,

Luca Morandini

On Wed, Mar 15, 2017 at 8:48 AM, Luca Morandini <lmorandini@anonymised.com> wrote:

>> I don't know about the specific error, but the documentation clearly indicates
>> that a reload should never be used along with the JMS clustering extension:
>
> Well, it says that hitting "Reload" would trigger the complete rebuild of all
> resources on the slaves, which would be fine when adding empty instances to the
> cluster, I supposed.

When it says "things to avoid" it really means it: I have had data dirs busted
by doing that by accident (in a training setup, I was not aware that clustering
was already on while doing the basic parts and hitting reload willy-nilly...)

>> I'm a bit surprised you did not end up ruining the slave's configuration...
>
> It was empty (separate data dirs), not much to ruin there.
>
> So, when adding a new instance, shall I stop accepting catalog changes, copy the
> data dir to the new instance, connect the new (slave) instance to the master, and
> then restart accepting catalog changes?

I'm not sure, but yes, that seems like a legit approach. For an elastic scaling
approach I'd rather go for a shared data directory where only the master writes,
and remember to keep the logs separate.

Cheers
Andrea


-------------------------------------------------------

Hi,

On 15 March 2017 at 08:47, Andrea Aime <andrea.aime@anonymised.com> wrote:

> When it says "things to avoid" it really means it, I had data dirs busted by doing
> that by accident (training setup, I was not aware the clustering was already on
> while doing the basic parts and hitting reload willy-nilly...)

I also interpreted that documentation as being "it will take ages" rather than "it will break things".

Presumably stopping & starting the master instance will effectively trigger a complete rebuild of all slave resources too during catalog startup? Or is reload special somehow?

Thanks,

Rob :)

On Wed, Mar 15, 2017 at 12:15 PM, Robert Coup <robert.coup@anonymised.com> wrote:

>> When it says "things to avoid" it really means it, I had data dirs busted by doing
>> that by accident (training setup, I was not aware the clustering was already on
>> while doing the basic parts and hitting reload willy-nilly...)
>
> I also interpreted that documentation as being "it will take ages" rather
> than "it will break things".

Pull requests to reword that welcomed :-p

Cheers
Andrea


-------------------------------------------------------

On 15 March 2017 at 11:29, Andrea Aime <andrea.aime@anonymised.com> wrote:

> Pull requests to reword that welcomed :-p

I can.

The current wording is a bit bolder than I remember:

> NEVER RELOAD THE GEOSERVER CATALOG ON A MASTER: Each master instance should never call the catalog reload since this propagates the creation of all the resources, styles, etc to all the connected slaves.

Follow-up question:

With a master-writes and shared data-directory/DB approach… restarting the master seems fine. But reloading will cause issues? Making some assumptions:

  1. reload doesn’t pass a “destroy” message across the cluster, so on a slave everything is created on top of the previous objects?
  2. normal webapp startup suppresses create messages across the cluster?
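
To make assumption 1 concrete, here is a toy model of it. Nothing in it is GeoServer code: ToySlave, onCreate and replayCreates are hypothetical names, and treating a repeated "create" as a collision is itself an assumption about how a slave might react. The sketch only shows why replaying "create" events without a preceding "destroy" would be harmless for an empty slave but not for one that is already in sync.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReloadReplayToyModel {

    /** Toy slave: remembers objects by name and records creates that collide. */
    public static class ToySlave {
        private final Map<String, String> objects = new LinkedHashMap<>();
        private final List<String> collisions = new ArrayList<>();

        /** Handles a "create" event; an already known name counts as a collision. */
        public void onCreate(String name, String definition) {
            if (objects.containsKey(name)) {
                collisions.add(name);
            } else {
                objects.put(name, definition);
            }
        }

        public List<String> collisions() {
            return collisions;
        }
    }

    /** Replays one "create" event per master object, with no "destroy" sent first. */
    public static void replayCreates(Map<String, String> master, ToySlave slave) {
        master.forEach(slave::onCreate);
    }

    public static void main(String[] args) {
        Map<String, String> master = new LinkedHashMap<>();
        master.put("wfs", "service");
        master.put("roads", "layer");

        // An empty slave accepts every create.
        ToySlave emptySlave = new ToySlave();
        replayCreates(master, emptySlave);
        System.out.println("empty slave collisions:  " + emptySlave.collisions());

        // A slave that is already in sync sees every create a second time.
        ToySlave syncedSlave = new ToySlave();
        replayCreates(master, syncedSlave); // initial sync
        replayCreates(master, syncedSlave); // replay caused by a later "reload"
        System.out.println("synced slave collisions: " + syncedSlave.collisions());
    }
}
```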

But if you have slaves that can write to a shared data-directory/DB, seems like any catalog edits will make you have a bad day?

Thanks,

Rob :)