[Geoserver-devel] monitoring community module

Hi all,

I would like permission to add a monitoring community module to svn. I have done some work documenting what exactly the module does and a bit about its design here:

http://geoserver.org/display/GEOS/Monitoring

The code I would like to commit was originally a prototype developed by Andrea that I have been trying to move forward.

Thanks!

-Justin


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Forgot to reply all.

On Aug 14, 2010, at 10:28 AM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

Hi Andrea,

Thanks for the feedback, comments inline.

On Sat, Aug 14, 2010 at 2:06 AM, Andrea Aime <aaime@anonymised.com> wrote:

On Fri, Aug 13, 2010 at 5:02 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

Hi all,
I would like permission to add a monitoring community module to svn. I have
done some work documenting what exactly the module does and a bit about its
design here:
http://geoserver.org/display/GEOS/Monitoring
The code I would like to commit was originally a prototype developed by
Andrea that I have been trying to move forward.

Hi there,
the development sure is exciting, I’m pretty sure everybody with a
GeoServer in production would like to know where the requests come
from, which are the most requested layers, and who’s the damn user that
is using 99.5% of your computing resources :-p (especially the
latter!).

Some comments about the wiki page:

  • the “response” table has a “start time - The time stamp the response
    to the request started.”. Isn’t that already part of the request
    stats?

Yeah, when thinking about the model originally I thought it might be nice to track when the response started vs. when the request started… maybe to weed out requests that take a long time to process vs. those that take a long time to encode the response. But yeah, I’m not sure how useful that will be. I have not added that to the module yet; it is implemented as it originally was, with request start time and total time.

  • when you say “http interface” you mean a REST one (RESTful or REST like)?

Yes :) I dare not call it a “REST interface” and risk getting called out for misusing the term.

One thing I find is missing is how to deal with the statistics
gathering overhead.
During the workflow there are three distinct points that take quite
a bit of time to carry out:

  • reverse DNS lookup of the client IP (might even time out, which
    would take several seconds to happen)
  • saving/updating the request information on a database (especially if
    it’s sitting on another machine/is busy, which is likely the case in a
    cluster setup)
  • gathering the geo-location from the IP

All of these should be performed in a separate thread, other than the
request one, to make sure the server still has resources to respond.
Even doing them at the end of the request cycle would not cut it,
because that would keep the thread from responding to other requests
and could badly interact with the control-flow extension, making the
latter think the request is still in progress.

Yeah, this is one thing I have added to the original code. Basically the monitor maintains a ThreadPoolExecutor to be used for post-processing operations such as this. It seems to work well enough. What are your thoughts on a fixed-size thread pool vs. a dynamic/cached one?
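For the record, the trade-off between the two flavours boils down to something like this (a toy sketch, not the module code; class and method names are made up for illustration):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PostProcessPool {
    /** Submits n dummy post-processing tasks and waits for them all to finish. */
    public static int runTasks(ExecutorService pool, int n) throws InterruptedException {
        final AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            pool.execute(new Runnable() {
                public void run() { done.incrementAndGet(); }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Fixed pool: bounded resource usage, tasks queue up when all workers are busy.
        System.out.println(runTasks(Executors.newFixedThreadPool(4), 100));
        // Cached pool: grows on demand and reuses idle threads, but has no upper
        // bound, so a burst of slow tasks (e.g. DNS timeouts) can spawn many threads.
        System.out.println(runTasks(Executors.newCachedThreadPool(), 100));
    }
}
```

The fixed pool caps resource usage at the cost of queueing during bursts; the cached one absorbs bursts but can spawn an unbounded number of threads when tasks block on the network.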

A system with the request threads feeding into a queue and a thread
group reading from the queue and performing the long operations is
probably better (with more than one thread doing the long operations,
since those often involve network communications)

Another thing I have added and am experimenting with is doing all database transactions in a separate thread as well. However, in order to do this some serialization is required, since you don’t really want to mix up database transactions that should be executed serially. So I added (actually, I looked it up online and found someone with the same problem) a sort of pipelining executor service. Basically it is a map where the key is some id (in this case the request id) and the values are synchronized queues of tasks to perform. A single thread continually scans the map for jobs to execute; when it finds one it delegates it to a thread pool executor service, and when that task is complete the next job in the same queue can execute.

Anyways, this is something I implemented only for a specific monitor DAO (not the Hibernate one), so it is easily disabled.
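A stripped-down sketch of that pipelining idea (not the actual DAO code; the names here are made up for illustration):

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Runs tasks that share a key serially, while tasks for distinct keys run concurrently. */
public class KeyedSerialExecutor {
    private final ExecutorService delegate = Executors.newFixedThreadPool(4);
    private final Map<Object, Queue<Runnable>> queues = new HashMap<Object, Queue<Runnable>>();

    public synchronized void submit(Object key, Runnable task) {
        Queue<Runnable> q = queues.get(key);
        if (q == null) {
            // nothing in flight for this key: register it and start the task
            queues.put(key, new ArrayDeque<Runnable>());
            delegate.execute(wrap(key, task));
        } else {
            // a task for this key is already running: queue behind it
            q.add(task);
        }
    }

    private Runnable wrap(final Object key, final Runnable task) {
        return new Runnable() {
            public void run() {
                try {
                    task.run();
                } finally {
                    runNext(key); // chain to the next task for the same key
                }
            }
        };
    }

    private synchronized void runNext(Object key) {
        Runnable next = queues.get(key).poll();
        if (next == null) {
            queues.remove(key); // queue drained, forget the key
        } else {
            delegate.execute(wrap(key, next));
        }
    }

    /** Waits for all per-key queues to drain, then stops the worker pool. */
    public void shutdown() throws InterruptedException {
        while (true) {
            synchronized (this) {
                if (queues.isEmpty()) break;
            }
            Thread.sleep(10);
        }
        delegate.shutdown();
        delegate.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

Tasks submitted under the same request id come out in submission order, while unrelated requests still get the full concurrency of the underlying pool.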

My 2 cents

Cheers
Andrea

PS: oh, btw, +1 on adding the community module of course


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Justin Deoliveira wrote:

    Some comments about the wiki page:
    - the "response" table has a "start time - The time stamp the response
    to the request started.". Isn't that already part of the request
    stats?

Yeah, when thinking about the model originally I thought it might be nice to track when the response started vs. when the request started... maybe to weed out requests that take a long time to process vs. those that take a long time to encode the response. But yeah, I'm not sure how useful that will be. I have not added that to the module yet; it is implemented as it originally was, with request start time and total time.

Aah, yeah, it actually makes a lot of sense. Having the subdivision
between pre-processing time and response time is quite interesting.
It won't tell the full story, since with streaming we overlap network
communication with actual work, but I guess we'll find something
interesting by looking at those stats.

    - when you say "http interface" you mean a REST one (RESTful or
    REST like)?

Yes :) I dare not call it a "REST interface" and risk getting called out for misusing the term.

Why? That never happened to me! :-p
Ah, some people cannot tell technology and religion apart...

    One thing I find is missing is how to deal with the statistics
    gathering overhead.
    During the workflow there are three distinct points that take quite
    a bit of time to carry out:
    - reverse DNS lookup of the client IP (might even time out, which
    would take several seconds to happen)
    - saving/updating the request information on a database (especially if
    it's sitting on another machine/is busy, which is likely the case in a
    cluster setup)
    - gathering the geo-location from the IP

    All of these should be performed in a separate thread, other than the
    request one, to make sure the server still has resources to respond.
    Even doing them at the end of the request cycle would not cut it,
    because that would keep the thread from responding to other requests
    and could badly interact with the control-flow extension, making the
    latter think the request is still in progress.

Yeah, this is one thing I have added to the original code. Basically the monitor maintains a ThreadPoolExecutor to be used for post-processing operations such as this. It seems to work well enough. What are your thoughts on a fixed-size thread pool vs. a dynamic/cached one?

Ha, a difficult call. And while you're at it, put a concurrent queue
in the middle, between the threads that serve the requests and the
ones that save the updates.
When it comes to saving data to the database, I'd actually try to
slurp up all the transactions in the queue and build a bigger transaction
that is then sent to the dbms in one shot: batching allows us to reduce the
network delay overhead. And in case of high traffic it provides an occasion
to squash two updates related to the same request (in case the dbms is
slow responding they will pile up).
So let's say that for that we could use just one thread.
The geoip thing... can we batch it as well?
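The slurping could be as simple as this (a sketch; the method name and the queue contents are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BatchingConsumer {
    /**
     * Blocks for one pending update, then slurps everything else already
     * queued, so a slow dbms naturally leads to bigger (cheaper) batches.
     */
    public static List<String> takeBatch(BlockingQueue<String> queue) throws InterruptedException {
        List<String> batch = new ArrayList<String>();
        batch.add(queue.take());  // wait for at least one item
        queue.drainTo(batch);     // grab whatever piled up, without blocking
        return batch;             // send this to the dbms as one transaction
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> q = new LinkedBlockingQueue<String>();
        q.add("update-1");
        q.add("update-2");
        q.add("update-3");
        System.out.println(takeBatch(q).size()); // prints 3
    }
}
```

The nice property is the adaptive behaviour: under light load the batch degenerates to a single update, and the slower the dbms responds the more updates get coalesced per round trip.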

And then there is the blocking queue. Fixed size or unlimited? The first
will eventually stop the threads serving the requests, but the
latter may eventually result in an OOM if the requests are coming in
faster than the centralized consumer can process them...
I'd say a limited queue with a big-ish max size, 1000 or 10k.
The objects we queue there are small, right?
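A bounded queue along those lines might look like this (sketch only; whether to drop or block when the queue fills up is exactly the call to make here):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class StatsQueue {
    private final BlockingQueue<Object> queue;

    public StatsQueue(int capacity) {
        // bounded: a big-ish cap (say 10k) protects the server from an OOM
        // if the consumer falls behind
        queue = new ArrayBlockingQueue<Object>(capacity);
    }

    /**
     * Called from the request thread. offer() never blocks: when the queue
     * is full we drop the stat instead of stalling the request being served.
     * (Using put() instead would block the request thread, which is the
     * other policy option.)
     */
    public boolean record(Object stat) {
        return queue.offer(stat);
    }

    public int pending() {
        return queue.size();
    }
}
```

With offer() the worst case under overload is losing some monitoring records; with put() the worst case is request threads stalling on monitoring, which is arguably the tail wagging the dog.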

    A system with the request threads feeding into a queue and a thread
    group reading from the queue and performing the long operations is
    probably better (with more than one thread doing the long operations,
    since those often involve network communications)

Another thing I have added and am experimenting with is doing all database transactions in a separate thread as well. However, in order to do this some serialization is required, since you don't really want to mix up database transactions that should be executed serially. So I added (actually, I looked it up online and found someone with the same problem) a sort of pipelining executor service. Basically it is a map where the key is some id (in this case the request id) and the values are synchronized queues of tasks to perform. A single thread continually scans the map for jobs to execute; when it finds one it delegates it to a thread pool executor service, and when that task is complete the next job in the same queue can execute.

Anyways, this is something I implemented only for a specific monitor DAO (not the Hibernate one), so it is easily disabled.

Sounds good. I'd add batching to the mix; when inserting/updating
at a high rate, performing changes in batches does wonders.

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

+1.

On 13/08/10 23:02, Justin Deoliveira wrote:

Hi all,

I would like permission to add a monitoring community module to svn. I have done some work documenting what exactly the module does and a bit about its design here:

http://geoserver.org/display/GEOS/Monitoring

The code I would like to commit was originally a prototype developed by Andrea that I have been trying to move forward.

Thanks!

-Justin

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

--
Ben Caradoc-Davies <Ben.Caradoc-Davies@anonymised.com>
Software Engineering Team Leader
CSIRO Earth Science and Resource Engineering
Australian Resources Research Centre