Hi,
lately I've been working on the idea of controlling how many
requests of a given type can be performed in parallel.
This is driven by a number of concerns.
First off, I don't want GeoServer to start throwing OOM errors
like mad because it's trying to serve too many GetMap requests in parallel.
The WMS limits already allow an admin to control how much memory
a single WMS request is going to use, but while that prevents a single
request from eating all the memory, it still does not prevent
exhaustion from too many requests.
A second case I want to handle is something I've discovered while
playing with OpenLayers tiled demos. Just open the preview, switch
to tiled mode, make the map size quite large, and start zooming
very fast up and down, using the mouse scroller or just clicking
on the zoom bar very fast.
What happens is that Firefox immediately issues 6 requests in parallel
against GeoServer, then you change zoom, it drops the old requests
and makes another 6, and so on every time you change the current
zoom. The server side does not notice that Firefox dropped a request
until we try to write to the output stream, which
might happen after quite some time.
With some instrumentation in the Dispatcher I've observed 50+
requests rendering in parallel generated by a single client.
That is of course unacceptable: a single user can really bring a
small server to its knees that way. We want to make sure a single client
cannot make more than X requests in parallel, with the others
refused or queued*.
Yet another reason to control the incoming requests is pure
performance. It has been noticed during the FOSS4G benchmarks that
limiting the number of parallel GetMap requests that a server
is actually working on to something like 2*NumCpu helps throughput,
in some cases significantly so.
We can do that by configuring the web container's thread pool with a
certain maximum number of serving threads, but that has some significant
disadvantages:
- it affects all applications running in the container
- it affects the GUI. Try to limit the number of parallel requests
allowed to 4 and Firefox will be none too happy about it
(I've experienced issues loading the GUI pages).
- some requests can scale up much more because they are very light
so they should not be limited to 2*NumCpu threads. Think
capabilities, GetFeatureInfo or just GetFeature, which usually
is streaming and thus bandwidth limited as opposed to cpu limited
(if you're serving towards the internet).
Long story short, I've created a pluggable module, leveraging
DispatcherCallback, that allows controlling the flow of incoming
requests based on a single property file.
A single configuration file like this one:
# request timeout in seconds
timeout=10
# no more than 100 parallel requests total
ows.global=100
# no more than 16 getmap in parallel, total
ows.wms.getmap=16
# don't give the single user more than 6 requests total in parallel
# (this is what a browser will do by default)
user=6
can be used to make all of the problems cited above melt like snow
in the sun. It will make sure GetMap requests won't overwhelm the
server, that a single user cannot monopolize the server, and that
requests hanging in queue waiting to be executed for more than
10 seconds just get dropped.
It will still allow plenty of GetFeatureInfo to be executed in
parallel and won't affect GUI related threads at all.
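Just to illustrate how a file like the one above could be read, here is
a minimal sketch. The class and method names (FlowConfigSketch, parse,
limit) are made up for the example and are not the module's actual code;
treating a missing key as "no limit" is also an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper, for illustration only. */
class FlowConfigSketch {

    /** Parses property-file text such as "ows.wms.getmap=16". */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the configured limit, or -1 when the key is absent (assumed to mean "unlimited"). */
    static int limit(Properties props, String key) {
        String value = props.getProperty(key);
        return value == null ? -1 : Integer.parseInt(value.trim());
    }
}
```

With the sample file, limit(props, "ows.wms.getmap") would give 16, while
a key that is not listed, such as a GetFeatureInfo one, comes back as -1
and is left alone.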
The design is based on blocking queues and tokens; benchmarks
show that it does not significantly affect performance.
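For those who want a feel for the blocking queue plus tokens idea, here
is a rough sketch of the general technique. It is not the module's code:
TokenLimiter and its methods are hypothetical names, and returning null
on timeout is just one possible way to signal that a request should be
dropped.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical limiter: a request must hold a token while it runs. */
class TokenLimiter {
    private final BlockingQueue<Object> tokens;

    TokenLimiter(int maxParallel) {
        // Pre-fill the queue with one token per allowed parallel request
        tokens = new ArrayBlockingQueue<>(maxParallel);
        for (int i = 0; i < maxParallel; i++) {
            tokens.add(new Object());
        }
    }

    /** Waits up to timeoutSeconds for a token; returns null on timeout or interruption. */
    Object acquire(long timeoutSeconds) {
        try {
            return tokens.poll(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    /** Gives the token back once the request is done. */
    void release(Object token) {
        tokens.offer(token);
    }
}
```

A callback could grab a token when the operation is dispatched and
release it when the request finishes; a null result would mean the
request waited longer than the timeout and gets dropped.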
Soo... ok to commit? I am PSC and I could give myself the +1,
but that would not be too nice.
I'll put together a page describing the design and usage of the
module after committing it.
Cheers
Andrea
*: it would also be very nice to get notified that the client
dropped the connection, but I've so far found no easy way
to do that. But I'm still trying to work it out, in my spare time,
leveraging the comet support that some web containers have added
lately: http://n2.nabble.com/Checking-when-the-client-dropped-the-connection-td4198235.html#a4198235
--
Andrea Aime
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.