What you need is:
1. A common configuration
2. Something up front to distribute the load across the computers in the cluster
The recent "catalog proposal" is step one: you want all the GeoServer instances to read their configuration from a shared database. The second step is why we sign up for this Java Enterprise Edition idea: the details of load balancing are left up to your application server. Our job ends with providing a GeoServer WAR with enough configuration options declared so you can set it up; for example, we would need to let you provide the name of the database holding the shared configuration.
Jody
Hi, is there any chance of converting GeoServer to run with parallel processing?
I have no clue what is needed to turn GeoServer into a grid-aware application.
Thanks in advance
Facundo.-
PS: sorry for my English... it is worse than my Spanish...
------------------------------------------------------------------------
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
------------------------------------------------------------------------
I have no clue what is needed to turn GeoServer into a grid-aware application.
We have never looked into grid computing so far, and clustering, whilst
already possible, is clumsy.
The problem lies, as Jody noted, in the configuration: it's a set of
xml files sitting on a disk, and fully loaded by GeoServer in memory
on startup for performance reasons.
This means you can share the configuration using a network disk, but
you have to kick all of the GeoServer instances (force them to
reload the config) each time the files are modified. People often
write a small network script to force each GeoServer instance to reload.
Now that I think about it, it would be possible to write a simple
polling thread in GeoServer that looks at the last modification
date of the configuration files and reloads them if they happen
to have been modified since the last reload. By putting the polling thread
out of the data serving path the overhead would be minimal (not
even measurable I think).
Once you have this, you just need to put a load balancer in front
of a GeoServer cluster and you're gold.
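The polling idea above could be sketched roughly as follows. This is only an illustration, not existing GeoServer code: the class name `ConfigPoller`, the `reloadAction` callback, and the polling period are all assumptions made up for the example, and the actual reload logic is left as a plain `Runnable`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of GeoServer): triggers a reload when a file's mtime changes. */
class ConfigPoller {
    private final Path configFile;
    private final Runnable reloadAction; // stand-in for whatever reloads the configuration
    private FileTime lastSeen;

    ConfigPoller(Path configFile, Runnable reloadAction) throws IOException {
        this.configFile = configFile;
        this.reloadAction = reloadAction;
        this.lastSeen = Files.getLastModifiedTime(configFile);
    }

    /** One polling step: returns true if the file changed and a reload was triggered. */
    synchronized boolean checkOnce() throws IOException {
        FileTime current = Files.getLastModifiedTime(configFile);
        if (current.compareTo(lastSeen) > 0) {
            lastSeen = current;
            reloadAction.run();
            return true;
        }
        return false;
    }

    /** Runs checkOnce() periodically on a daemon thread, off the data serving path. */
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "config-poller");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                checkOnce();
            } catch (IOException e) {
                // The file may be mid-write or the network disk unreachable; retry next tick.
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Each instance in the cluster would run its own poller against the shared disk, so nobody has to kick the instances by hand. A real version would probably watch every configuration file rather than a single one.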
One thing I eventually want to check out is Hadoop (http://lucene.apache.org/hadoop/), to figure out whether we can use map reduce to make huge amounts of tiles on large clusters. But yeah, the parallel processing stuff is great R&D material, just not high enough on the priority list right now. I wonder if we could find a university student who could investigate it for their master's, since it would be a very cool project.
Chris
Andrea Aime wrote:
Facundo Garat wrote:
Cheers
Andrea
_______________________________________________
Geoserver-users mailing list
Geoserver-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/geoserver-users
Maybe, to overcome this problem, it would be nice to configure one GeoServer as the configuration master, so that the other nodes in the cluster could "ask" the master (as slaves) for the configuration.
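A minimal sketch of that master/slave idea, assuming the master tags its configuration with a version counter and each slave only takes a new copy when the counter has moved. Nothing here is existing GeoServer code: `ClusterConfigSketch`, `Master`, `Slave`, and `Snapshot` are hypothetical names, and the call from slave to master is an in-process method call standing in for what would be a network request.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch (not GeoServer code): slaves pull configuration from a master by version. */
class ClusterConfigSketch {

    /** Immutable configuration snapshot tagged with a version counter. */
    static final class Snapshot {
        final long version;
        final String xml;

        Snapshot(long version, String xml) {
            this.version = version;
            this.xml = xml;
        }
    }

    /** The node that owns the configuration. */
    static final class Master {
        private final AtomicReference<Snapshot> current =
                new AtomicReference<>(new Snapshot(0, ""));

        /** Called whenever the configuration is changed on the master. */
        void publish(String xml) {
            current.updateAndGet(s -> new Snapshot(s.version + 1, xml));
        }

        /** In a real cluster this would be served over the network. */
        Snapshot latest() {
            return current.get();
        }
    }

    /** A node that asks the master for its configuration. */
    static final class Slave {
        private final Master master;
        private Snapshot local = new Snapshot(-1, ""); // -1: nothing fetched yet

        Slave(Master master) {
            this.master = master;
        }

        /** Asks the master for its configuration; returns true if the local copy changed. */
        boolean sync() {
            Snapshot remote = master.latest();
            if (remote.version > local.version) {
                local = remote;
                return true;
            }
            return false;
        }

        String config() {
            return local.xml;
        }
    }
}
```

The version check keeps the exchange cheap when nothing has changed. How often the slaves ask, and what happens while the master is down, are left open here.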
Coding a load balancer directly into a master GeoServer could make things easier
if you're not familiar with Apache, but making a load balancer
that really delivers is no easy task (the code must be very carefully
crafted to keep the balancer from becoming a network bottleneck).
Or use some kind of replication for the configuration between the nodes in a cluster, as OSCache does for its cached files.