[Geoserver-devel] Sharing GUI and authentication state across a GeoServer cluster

Hi,
we are considering building a module to help with using a clustered GeoServer installation
from an administration and security handling point of view.

The issues we’re trying to handle:

  • Make sure the user HTTP session is shared across the cluster; CAS, for example,
    uses it to store information
  • Likewise, Wicket makes some use of the session, so HTTP session clustering is
    needed to share GUI interactions across a load-balanced cluster
  • Wicket-wise we also need a shared page store, otherwise long interactions using
    non-bookmarkable pages and Ajax might fail to have their state properly shared across
    the board (page versions and the like)

Theoretically, a ResourcePool refresh might also be needed if resources are updated
on one node, but this might or might not be necessary depending on how configuration
clustering is done.

Now, implementation-wise we were thinking of using Hazelcast, mostly because it
makes development easy, it allows for auto-discovery of nodes using multicast, and because
there are examples of how to do the above things already available on the net
with simple googling :wink:
Not all environments allow for multicast, so we’ll probably need, either now or in the future,
to also allow for configuring the list of IPs that form the cluster (to use normal TCP
messages instead of multicast).

What do you think?

Cheers
Andrea

==
Meet us at GEO Business 2014! in London! Visit http://goo.gl/fES3aK
for more information.

Ing. Andrea Aime

@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


Hi,

I’ve got experience with using Hazelcast for in-memory clusters. I’ve got nothing bad to say about it; it has worked like a charm. But you do need to be more diligent when making changes to the object classes you will be storing in the shared memory structure. It’s especially important to note in changelogs whether the new version is compatible with the previous one. It might be worth keeping the configuration classes serialization-compatible between micro releases. This would allow for a rolling upgrade of a GeoServer cluster within a minor release. The new GeoServer release cycle should make this a relatively easy goal to meet.
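
As a rough illustration of the compatibility point above, here is a minimal sketch using plain Java serialization. The class names (LayerSettings, SerializationCheck) are hypothetical and not taken from GeoServer, and the sketch assumes the shared objects are stored with standard Java serialization rather than a custom serializer. It pins serialVersionUID and runs a write/read round trip, the kind of check that could be run against objects written by the previous release.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a configuration object kept in the shared memory structure.
class LayerSettings implements Serializable {
    // Pinned explicitly: without it the JVM derives the id from the class shape,
    // so adding a field in a micro release would make old and new nodes reject
    // each other's serialized objects with an InvalidClassException.
    private static final long serialVersionUID = 1L;

    final String name;
    final boolean enabled;

    LayerSettings(String name, boolean enabled) {
        this.name = name;
        this.enabled = enabled;
    }
}

class SerializationCheck {
    // Serializes and deserializes a value, as would happen when one node writes
    // it to the shared structure and another node reads it.
    static Object roundTrip(Serializable value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Value is not safely serializable", e);
        }
    }

    public static void main(String[] args) {
        LayerSettings copy = (LayerSettings) roundTrip(new LayerSettings("roads", true));
        System.out.println(copy.name + " enabled=" + copy.enabled);
    }
}
```

With the id pinned, a field added in a later micro release simply reads as its default value on nodes that receive older data, which is what makes a rolling upgrade plausible.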

As you said, multicast is not widely supported. There will also be people who want to distribute their cluster over separate networks, for example by building a cluster in AWS with servers in multiple availability zones. You will probably run into big issues getting multicast working reliably in any public cloud setting. Due to the multicast issues, I’ve used direct TCP communications. In my case, the nodes find each other by using a common directory service which provides node locations. One possible option for discovery might be to have the load balancer tell each GeoServer who the others are. To make this generic, you could point each GeoServer to a URI which contains the cluster configuration (mainly information about where the other nodes are).
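
The URI idea could look roughly like the sketch below. Everything about the format is an assumption made for illustration: one "host" or "host:port" entry per line, "#" for comments, hostnames or IPv4 addresses only, and the hypothetical class name ClusterMembers. The fallback port 5701 is Hazelcast's default member port.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: reads the list of cluster members from a shared location.
class ClusterMembers {
    static final int DEFAULT_PORT = 5701; // Hazelcast's default member port

    // Parses "host" or "host:port" (hostnames and IPv4 only, by assumption).
    static InetSocketAddress parseMember(String entry) {
        String trimmed = entry.trim();
        int colon = trimmed.lastIndexOf(':');
        if (colon < 0) {
            return InetSocketAddress.createUnresolved(trimmed, DEFAULT_PORT);
        }
        int port = Integer.parseInt(trimmed.substring(colon + 1));
        return InetSocketAddress.createUnresolved(trimmed.substring(0, colon), port);
    }

    // Skips blank lines and lines starting with '#'.
    static List<InetSocketAddress> parse(List<String> lines) {
        List<InetSocketAddress> members = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                members.add(parseMember(trimmed));
            }
        }
        return members;
    }

    // Fetches the member list from any URI the JVM can open (file:, http:, ...).
    static List<InetSocketAddress> load(URI location) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                location.toURL().openStream(), StandardCharsets.UTF_8))) {
            List<String> lines = new ArrayList<>();
            for (String line = reader.readLine(); line != null; line = reader.readLine()) {
                lines.add(line);
            }
            return parse(lines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("# cluster nodes", "10.0.0.1", "gs2.example.org:5702");
        for (InetSocketAddress member : parse(sample)) {
            System.out.println(member.getHostString() + ":" + member.getPort());
        }
    }
}
```

The resulting addresses would then be handed to whatever TCP join configuration the clustering library offers; that wiring is left out here on purpose.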

Also note that you need to figure out how to configure the shared password for the memory structure. You don’t want people running production servers with a default password. As clusters are not so common, you might want to force the administrator to input a password to enable the clustering.
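
One way to force the administrator's hand, sketched under assumptions: the class name ClusterPasswordPolicy, the minimum length, and the list of rejected values are all made up for illustration. "dev-pass" is included because it is, as far as I know, the stock group password in Hazelcast 3.x.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical policy: clustering stays disabled until a non-default password is set.
class ClusterPasswordPolicy {
    static final int MIN_LENGTH = 8; // assumed threshold, not a recommendation

    // Assumed list of well-known defaults that must never reach production.
    private static final Set<String> REJECTED = new HashSet<>(
            Arrays.asList("dev-pass", "password", "changeme", "geoserver"));

    static boolean isAcceptable(String password) {
        return password != null
                && password.length() >= MIN_LENGTH
                && !REJECTED.contains(password.toLowerCase(Locale.ROOT));
    }

    // Call before joining the cluster; fails fast instead of falling back to a default.
    static void requireAcceptable(String password) {
        if (!isAcceptable(password)) {
            throw new IllegalStateException(
                    "Clustering is disabled: set a non-default cluster password of at least "
                            + MIN_LENGTH + " characters");
        }
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("dev-pass"));
        System.out.println(isAcceptable("correct horse battery"));
    }
}
```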

One more thing: the clustering requires the server to listen on a separate TCP port. At least when I was working on this, you couldn’t leverage the servlet container to handle clustering traffic. This will complicate the network specs of the GeoServer nodes a bit.

Sampo

Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

Sampo Savolainen
R&D Director, Spatineo Oy
sampo.savolainen@anonymised.com
+358-407555649
Linnankoskenkatu 16 A 17, 00250 Helsinki, Finland
www.spatineo.com, twitter.com/#!/spatineo
www.linkedin.com/company/spatineo-inc

This message may contain privileged and/or confidential information. If you
have received this e-mail in error or are not the intended recipient, you
may not use, copy, disseminate, or distribute it; do not open any
attachments, delete it immediately from your system and notify the sender
promptly by e-mail that you have done so.

Hi Sampo,
valuable information, thank you. I’ve cc’ed the list for everybody’s benefit.

Cheers
Andrea

Hi,

I agree with Sampo about the discovery bit. My team was designing a similar distributed service system with .NET and WCF several years ago.

We implemented a “Broker” that was configured to maintain a registry of available servers. As each server came online, part of its start-up process sent a message to zero or more brokers announcing that it was alive and which service capabilities it offered. Client systems would be configured with zero or more brokers as well. When a client required a cloud service, a “Who can service a request for Service X?” question was sent to the first configured broker. The reply would contain the URL of the endpoint that could service the request.
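
The broker pattern described above can be sketched in a few lines. This is an in-memory illustration in Java, not the original .NET/WCF implementation; the class and method names (Broker, register, unregister, whoCanService) are invented for the example, and a real broker would sit behind a network endpoint.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical in-memory broker: a registry of which endpoint offers which service.
class Broker {
    private final Map<String, List<String>> endpointsByService = new ConcurrentHashMap<>();

    // Called by a server during start-up to announce itself and its capabilities.
    void register(String endpointUrl, String... services) {
        for (String service : services) {
            endpointsByService
                    .computeIfAbsent(service, key -> new CopyOnWriteArrayList<>())
                    .add(endpointUrl);
        }
    }

    // Called when a server shuts down or is detected as gone.
    void unregister(String endpointUrl) {
        for (List<String> endpoints : endpointsByService.values()) {
            endpoints.remove(endpointUrl);
        }
    }

    // Answers "Who can service a request for Service X?" with the first known endpoint.
    Optional<String> whoCanService(String service) {
        return endpointsByService
                .getOrDefault(service, Collections.emptyList())
                .stream()
                .findFirst();
    }

    public static void main(String[] args) {
        Broker broker = new Broker();
        broker.register("https://node1.example.org/geoserver", "WMS", "WFS");
        System.out.println(broker.whoCanService("WMS").orElse("nobody"));
        System.out.println(broker.whoCanService("WPS").orElse("nobody"));
    }
}
```

Returning the first registered endpoint is the simplest possible choice; a real broker would likely also need health checks and some form of load distribution.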

All of our services were configured for standard http(s) as we had port use restrictions in place. Perhaps a restricted REST endpoint could be put in place in a GeoServer cluster to manage the cluster traffic?

Chris Snider

Senior Software Engineer

Intelligent Software Solutions, Inc.

Direct (719) 452-7257
