[Geoserver-users] GeoServer in Production environment

Hi,
I have a couple of questions about best practices or experience with running
GeoServer in a production environment.
First, what is the best way to make sure every container has the same
configuration and data? Just modifying one and copying the data over to the
other ones? Or is there any way to use the same data directory for all? (I
think I've seen a discussion about this at some point but couldn't find it
anymore).
Second, what are people's experiences with how many containers to use for
GeoServer for, let's say, 15, 50 or 100 users at a time?

Thanks,
Julian
--
View this message in context: http://www.nabble.com/GeoServer-in-Production-environment-tp25879755p25879755.html
Sent from the GeoServer - User mailing list archive at Nabble.com.

jberti wrote:

Hi,
I have a couple questions about best practices or experience with running
GeoServer in a production environment.
First, what is the best way to make sure every container has the same
configuration and data? Just modifying one and copying the data over to
the other ones? Or is there any way to use the same data directory for
all? (I think I've seen a discussion about this at some point but couldn't
find it anymore).
Second, what are the experiences of how many different containers to use
for GeoServer for lets say 15, 50 or 100 users at a time?

Julian,

I'm not sure what to answer on sharing the config - other than I'd suggest
making the changes on a test site, then copying them to the production site.
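
If each container keeps its own copy of the data directory, one way to confirm that a copy step left them identical is to compare a digest of each tree. The following is only a minimal sketch: the function names are made up for illustration, and it reads whole files into memory, so it suits configuration files better than large raster data.

```python
import hashlib
from pathlib import Path


def data_dir_digest(root):
    """Return one SHA-256 digest covering every file path and content under root."""
    root = Path(root)
    digest = hashlib.sha256()
    # Sort so the result does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def in_sync(*roots):
    """True when all given directory trees hold the same files with the same contents."""
    return len({data_dir_digest(r) for r in roots}) == 1
```

Calling in_sync() with the data directory paths of the test and production containers (whatever they are in your setup) returns False as soon as any file differs or is missing.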

Load-wise, it depends on what you are doing. If you are tiling (Google Maps
/ Virtual Earth) and don't hit many cached tiles, you can drag down the
server CPU-wise rather quickly (it's just working really hard to draw them
all). Same with WMS. With our rather complex shapes and 30ish merged vector
layers of data, we get a draw rate of about 5 tiles/second per CPU. Cached
tiles are almost free (just spool the file).
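
As a rough back-of-the-envelope sketch of what that draw rate implies: the default of 5 tiles/second per CPU is the figure quoted above for one particular dataset, while the function name, the request rate and the cache hit ratio are illustrative assumptions. It also treats cached tiles as costing nothing, which is only approximately true.

```python
import math


def render_cores_needed(requests_per_second, cache_hit_ratio,
                        tiles_per_second_per_core=5.0):
    """Estimate the CPU cores needed to draw the tiles that miss the cache."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    # Only uncached requests cost rendering time.
    uncached_rate = requests_per_second * (1.0 - cache_hit_ratio)
    return math.ceil(uncached_rate / tiles_per_second_per_core)
```

For example, 40 tile requests per second with no cache hits would keep about 8 cores busy at that rate, while a 75% hit ratio brings the estimate down to 2.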

On the other hand, if you mostly do WFS or KML data streaming, it uses
almost no CPU time at all. Bandwidth and/or database lookup speed are more
important here, IMHO.

We use GS on an 8-core server backed up by a 16-core database server. We
currently provide WFS/WMS/WCS and KML support from our live production data
using views. FME and our primary SaaS application share the app server as
well. Combined, we rarely touch more than 25% of the available CPU power on
the box. Network bandwidth and then data gather time are more of the
bottleneck for us.

Bryan
--
View this message in context: http://www.nabble.com/GeoServer-in-Production-environment-tp25879755p25882786.html
Sent from the GeoServer - User mailing list archive at Nabble.com.

Since the config is only rarely pulled from disk, you might consider a disk mount that allows for each container to access the very same files (e.g. NFS or a more modern equivalent). The 2.x series brings the possibility of an RDBMS-persisted config if you're willing to build that module. (I found it straightforward.)

As far as scaling goes, you'll have to consider load metrics (what does 50 users actually imply for demand? what kinds of requests are they making?), caching, hardware, backend, etc. Simplistic JEE horizontal scaling is a relatively expensive way to get results, and often flat-out useless. Bryan's remarks are right on target.

---
A. Soroka
Digital Research and Scholarship R & D
the University of Virginia Library

On Oct 13, 2009, at 7:31 PM, bryanhall wrote:

[snip]

------------------------------------------------------------------------------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
_______________________________________________
Geoserver-users mailing list
Geoserver-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-users

Sorry for bringing this topic up again. I went around and gathered more
information on our project to give you the specifics, as I noticed my post
didn't include many details about our setup. I think they might help in
spotting configuration issues or suggesting ideas.
We are running 2 GeoServers (V 1.7.6) in 2 Tomcat (5.5.23) containers which
are load balanced by Apache (2.2.10). Our datastore is Oracle 10g and the
delivery method we use is WMS (KML might be looked into in the future). The
configuration is running in a virtual environment on a Red Hat Server
4 32-bit OS with a 2.66 GHz processor assigned to it and 3 GB RAM.
As our data is dynamic in nature, we choose not to cache it.

If anybody has any ideas about a good way to improve this configuration,
I would greatly appreciate it.

We will try the way you mentioned to share the config.
Thanks again for the prior responses.

Thanks,
Julian

--
View this message in context: http://www.nabble.com/GeoServer-in-Production-environment-tp25879755p26065904.html
Sent from the GeoServer - User mailing list archive at Nabble.com.

On 26/10/09 20:24, jberti wrote:

As our data is dynamic in nature, we choose not to cache it.

So dynamic that it changes more often than, say, every 10 seconds? A 10-second cache is not much, but it may mean three requests that get answered without processing.
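
To make the short-lived cache idea concrete, here is a minimal in-process sketch. The TTLCache class and its method names are hypothetical and do not correspond to anything in GeoServer or GeoWebCache; a real deployment would more likely put a caching proxy or tile cache in front of the server.

```python
import time


class TTLCache:
    """Keep each rendered response for ttl_seconds, then render it again."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._entries = {}  # key -> (expiry time, value)

    def get_or_render(self, key, render):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: answered without processing
        value = render()  # expired or never seen: do the expensive work
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

With a 10-second TTL, identical requests arriving a few seconds apart trigger only one render; the next render happens once the entry has expired.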

Regards,

--------------------
    Luca Morandini
www.lucamorandini.it
--------------------

True, I totally forgot that you can set a limit on how long things are cached,
and threw caching out as an option because I always thought of it as
long-term caching.
So yes, we could cache most of our data as some gets renewed once a week and
others daily. Of course we would have to set it up that it only gets cached
outside of the refresh times.
Would Geowebcache be the way to go for this caching or not?

Thanks,
Julian

--
View this message in context: http://www.nabble.com/GeoServer-in-Production-environment-tp25879755p26067852.html
Sent from the GeoServer - User mailing list archive at Nabble.com.

On 26/10/09 22:20, jberti wrote:

So yes, we could cache most of our data as some gets renewed once a week and
others daily. Of course we would have to set it up that it only gets cached
outside of the refresh times.
Would Geowebcache be the way to go for this caching or not?

Simple caching can be accomplished by GeoServer layer configuration, but if you want pre-caching of all the tiles (provided you have a tiling client and a small number of different layers (or combination of layers)), GeoWebCache is the way to go.

Regards,

--------------------
    Luca Morandini
www.lucamorandini.it
--------------------

Luca Morandini wrote:

Simple caching can be accomplished by GeoServer layer configuration,

Just to clarify: That caching is a header sent to the client, so it's client-side caching only. If you have many different users, it will not reduce the load significantly.

Preseeding is one option; you can also just clear the cache when the data changes, and thereby probably get most of the benefit. Again, it depends a bit on whether you have many users viewing the same tiles or just a few users requesting many different areas. The next version of GWC will tell you cache hit rates, so you'll get the numbers easily.

-Arne

On 26/10/09 22:01, Arne Kepp wrote:

Luca Morandini wrote:

Simple caching can be accomplished by GeoServer layer configuration,

Just to clarify: That caching is a header sent to the client, so it's
client-side caching only. If you have many different users, it will not
reduce the load significantly.

I stand corrected... well, one can still accomplish simple, server-side, caching using Squid (or similar).

Going back to the per-layer client-side caching... since most of the requests involve two or more layers, it may be useful to point out in the documentation that the max-age is equal to the min{max-age of every layer}, provided all of them have the response-header cache directive set, otherwise none will be set.
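
The rule as described in this message can be sketched as follows. This is only an illustration of the behaviour stated above, not code taken from GeoServer; the function name is hypothetical and None stands for a layer without the cache directive set.

```python
from typing import Iterable, Optional


def combined_max_age(layer_max_ages: Iterable[Optional[int]]) -> Optional[int]:
    """Return the max-age for a multi-layer request, or None if no header is sent.

    Each item is a layer's max-age in seconds, or None when that layer
    has no response-header cache directive configured.
    """
    ages = list(layer_max_ages)
    # One layer without the directive means the response gets none at all.
    if not ages or any(age is None for age in ages):
        return None
    # Otherwise the shortest-lived layer decides.
    return min(ages)
```

So a request combining layers with max-age values of 3600, 600 and 10 seconds would be cacheable for 10 seconds, and adding a single layer without the setting would remove the header entirely.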

Shall I write a doc patch about this at http://docs.geoserver.org/trunk/en/user/webadmin/data/layers.html?highlight=cache ?

Regards,

--------------------
    Luca Morandini
www.lucamorandini.it
--------------------

Hello,
I was wondering if you could elaborate a bit more on the RDBMS-persisted
configuration. We are getting closer to putting GeoServer (2.0) into our
production environment. We currently have two distributed containers
(Tomcat via Apache) and find it difficult to keep the configurations of
the two in sync. It would be great to house that information in our RDBMS
(Oracle 10g) where it could be shared.
Any information you could provide would be great, even just where in the
API to start looking.

Thanks,
Peter

ajs6f@anonymised.com wrote:

The 2.x series brings
the possibility of an RDBMS-persisted config if you're willing to
build that module. (I found it straightforward.)
---
A. Soroka
Digital Research and Scholarship R & D
the University of Virginia Library

--
View this message in context: http://old.nabble.com/GeoServer-in-Production-environment-tp25879755p26443619.html
Sent from the GeoServer - User mailing list archive at Nabble.com.

I was able to build the module and get an instance running with the information here:

http://geoserver.org/display/GEOS/Hibernate+based+catalog

We are currently testing to see whether it is yet stable enough for production work.

---
A. Soroka
Digital Research and Scholarship R & D
the University of Virginia Library

On Nov 23, 2009, at 2:27 PM, GeoSpidey wrote:

[snip]
