[Geoserver-devel] Request IDs

Hi folks,

Just exploring the best way to go about this. We use a X-Request-ID header for tracing and logging requests through our application stack (from CDN/load balancers through to databases) and I’m exploring adding support this to GeoServer. The request ID values are UUIDs.

ows.Request has a protected UUID identifier property, which doesn’t seem to be used for much. So, I’m thinking:

  • Alter the Request class to make it settable

  • Set it in a custom early Dispatcher init() callback from the header if present – it would miss some of the request parsing but that’s not the end of the world.

  • Add it to the logging MDC context in the same custom callback

  • Use it in the logging output format from the MDC context so all log entries associated with that request are tagged with the ID

  • Alter the monitor extension so that the audit log has access to Request.identifier, so the audit log can contain the ID too

  • Get it through to the SQL session start/stop scripts so they can set it in postgres (via “SET application_name”) – maybe via EnvFunction a bit like GSUSER.

  • Postgres can log it for any SQL errors/slow-query traces/locks/etc

  1. Obvious flaws? Am I missing an easier approach?
  2. Would the changes to Request and the Monitor extension likely be accepted as a contribution?
  3. Is this behaviour that would be useful for others? If so, potentially a config option that enables extracting a value from a defined header and adding it to the logging context/environment early in Dispatcher itself.

Thanks,

Rob :slight_smile:

Hi Robert,
sorry for the late reply, busy times, could not answer right away and then lost it in the build
server failures storm that we have been into lately

···

On Tue, Mar 8, 2016 at 8:32 PM, Robert Coup <robert.coup@anonymised.com> wrote:

Hi folks,

Just exploring the best way to go about this. We use a X-Request-ID header for tracing and logging requests through our application stack (from CDN/load balancers through to databases) and I’m exploring adding support this to GeoServer. The request ID values are UUIDs.

ows.Request has a protected UUID identifier property, which doesn’t seem to be used for much. So, I’m thinking:

  • Alter the Request class to make it settable

Roger that… maybe with a system variable too? Like we would not want a malicious user to pass the right header in a server that’s not fronted by something else
setting the x-request-id

  • Set it in a custom early Dispatcher init() callback from the header if present – it would miss some of the request parsing but that’s not the end of the world.

Or the dispatcher itself could do it.

  • Add it to the logging MDC context in the same custom callback

MDC… ? Ah: https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/MDC.html

  • Use it in the logging output format from the MDC context so all log entries associated with that request are tagged with the ID

Yes please, being able to tell apart logs from different requests would be great

  • Alter the monitor extension so that the audit log has access to Request.identifier, so the audit log can contain the ID too

Yep

  • Get it through to the SQL session start/stop scripts so they can set it in postgres (via “SET application_name”) – maybe via EnvFunction a bit like GSUSER.

Ok, why not

  • Postgres can log it for any SQL errors/slow-query traces/locks/etc
  1. Obvious flaws? Am I missing an easier approach?

You should also create a ThreadLocalTransfer object so that the request id gets preserved for requests triggering action
inside a thread pool, e.g., asynchronous WPS requests

  1. Would the changes to Request and the Monitor extension likely be accepted as a contribution?

I’d say so, yes

  1. Is this behaviour that would be useful for others? If so, potentially a config option that enables extracting a value from a defined header and adding it to the logging context/environment early in Dispatcher itself.

I’d say it would be useful to everybody.

One small extra note, it could also be useful to have also have the request id in the response header, so that
if I have a problematic request in the client, I can go and look for the id and then grep the logs for just the log
lines pertaining to it.

Cheers
Andrea

==
GeoServer Professional Services from the experts! Visit
http://goo.gl/it488V for more information.

Ing. Andrea Aime

@geowolf
Technical Lead

GeoSolutions S.A.S.
Via di Montramito 3/A
55054 Massarosa (LU)
phone: +39 0584 962313

fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it

AVVERTENZE AI SENSI DEL D.Lgs. 196/2003

Le informazioni contenute in questo messaggio di posta elettronica e/o nel/i file/s allegato/i sono da considerarsi strettamente riservate. Il loro utilizzo è consentito esclusivamente al destinatario del messaggio, per le finalità indicate nel messaggio stesso. Qualora riceviate questo messaggio senza esserne il destinatario, Vi preghiamo cortesemente di darcene notizia via e-mail e di procedere alla distruzione del messaggio stesso, cancellandolo dal Vostro sistema. Conservare il messaggio stesso, divulgarlo anche in parte, distribuirlo ad altri soggetti, copiarlo, od utilizzarlo per finalità diverse, costituisce comportamento contrario ai principi dettati dal D.Lgs. 196/2003.

The information in this message and/or attachments, is intended solely for the attention and use of the named addressee(s) and may be confidential or proprietary in nature or covered by the provisions of privacy act (Legislative Decree June, 30 2003, no.196 - Italy’s New Data Protection Code).Any use not in accord with its purpose, any disclosure, reproduction, copying, distribution, or either dissemination, either whole or partial, is strictly forbidden except previous formal approval of the named addressee(s). If you are not the intended recipient, please contact immediately the sender by telephone, fax or e-mail and delete the information in this message that has been received in error. The sender does not give any warranty or accept liability as the content, accuracy or completeness of sent messages and accepts no responsibility for changes made after they were sent or for other risks which arise as a result of e-mail transmission, viruses, etc.


Hi Andrea,

···

On 18 March 2016 at 10:22, Andrea Aime <andrea.aime@anonymised.com1268…> wrote:

Hi Robert,
sorry for the late reply, busy times, could not answer right away and then lost it in the build
server failures storm that we have been into lately

No worries, I completely understand :slight_smile:

Yes.

Well, if it’s generally useful I could simplify the approach a bit:

  • Add a global config option with a header to look at for inbound requests (ie. X-Request-ID or whatever). For Geoserver instances without a reverse proxy generating UUIDs sitting in front, they would just create a UUID for each request like precent.
  • Always add the request ID to the logging/monitor/sql contexts (and pass it through to WPS/etc as discussed below), and have the Dispatcher set it up initially.
  • Always return the request ID as a response header (either using the configured header name, or defaulting to GeoServer-Request-ID).
    Always returning a header would add 60 bytes (using the above name) to every response… how do people feel about that?

Personally I think OGC service responses are verbose enough that it’d be unlikely to make any difference to users, and the ability to trace requests and debug issues quickly and effectively massively trumps that overhead. But most admins won’t use it most of the time, so I’m interested in others views. We could always make it disable-able if only a few admins might want to optimise and are prepared to run with a custom applicationContext config.

Presumably changing the default behaviour and adding a new config setting would need a GSIP?

Yeah… I’m a little fuzzy on exactly which logging system GeoServer uses (I don’t spend enough time in Java), but the docs says it uses java.util.logging but has log4j underneath (so MDC is available)? slf4j looks like it wraps MDC for java.util.logging (http://www.slf4j.org/api/org/slf4j/MDC.html) and that seems to be in the POMs too… so I think I should be able to get it working somehow ¯_(ツ)_/¯

Sounds good, I hadn’t thought about WPS. From my quick reading of the docs/code:

  • &request=Execute will start processing, and creates an ExecutionID
  • If the RequestID from the Execute request is attached to the background processing via ThreadLocalTransfer & potentially via the Hazelcast clustering, logs/etc can keep being associated with the initial Execute user request if that’s useful
  • For Async requests any Dismiss/Status/Result requests require the ExecutionID to be passed in. They’ll each have their own RequestIDs as well.
  • WPS ExecutionIDs should probably be put into the MDC/monitor/SQL/etc contexts too if users want to track/log/use them for similar reasons.

Thinking further, it would also be good to pass request IDs to any upstream cascaded WFS/WMS servers too, continuing the goal that a single end-user request can be traced/followed as far as possible. Though I haven’t looked at how yet :slight_smile:

Cheers,

Rob :slight_smile:

  • Alter the Request class to make it settable

Roger that… maybe with a system variable too? Like we would not want a malicious user to pass the right header in a server that’s not fronted by something else
setting the x-request-id

  • Set it in a custom early Dispatcher init() callback from the header if present – it would miss some of the request parsing but that’s not the end of the world.

Or the dispatcher itself could do it.

One small extra note, it could also be useful to have also have the request id in the response header, so that
if I have a problematic request in the client, I can go and look for the id and then grep the logs for just the log
lines pertaining to it.

  • Add it to the logging MDC context in the same custom callback

MDC… ? Ah: https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/MDC.html

  1. Obvious flaws? Am I missing an easier approach?

You should also create a ThreadLocalTransfer object so that the request id gets preserved for requests triggering action
inside a thread pool, e.g., asynchronous WPS requests

On Fri, Mar 18, 2016 at 12:53 PM, Robert Coup <robert.coup@anonymised.com>
wrote:

One small extra note, it could also be useful to have also have the

request id in the response header, so that
if I have a problematic request in the client, I can go and look for the
id and then grep the logs for just the log
lines pertaining to it.

Well, if it's generally useful I could simplify the approach a bit:

   - Add a global config option with a header to look at for inbound
   requests (ie. X-Request-ID or whatever). For Geoserver instances without a
   reverse proxy generating UUIDs sitting in front, they would just create a
   UUID for each request like precent.
   - *Always* add the request ID to the logging/monitor/sql contexts (and
   pass it through to WPS/etc as discussed below), and have the Dispatcher set
   it up initially.
   - *Always* return the request ID as a response header (either using
   the configured header name, or defaulting to GeoServer-Request-ID).

Always returning a header would add 60 bytes (using the above name) to
every response... how do people feel about that?

I'm not particularly worried about it... just look at how many headers GWC
sets on returned tiles.

*Personally* I think OGC service responses are verbose enough that it'd
be unlikely to make any difference to users, and the ability to trace
requests and debug issues quickly and effectively massively trumps that
overhead. But most admins won't use it most of the time, so I'm interested
in others views. We could always make it disable-able if only a few admins
might want to optimise and are prepared to run with a custom
applicationContext config.

Admins rarely touch app context, but are normally happy to play with system
variables. GeoServerExtensions.getProperty allows to pick a variable from
env, system, and servlet context, which allows a range of options.

Presumably changing the default behaviour and adding a new config setting
would need a GSIP?

Hum... it's unclear to me if a GSIP is needed for this, or not, since no
programming interface or user visible change is performed but... I guess it
would not hurt either?
Personally I'm happy with some conversation on the ml, let's hear what the
others think.

   - Add it to the logging MDC context in the same custom callback

MDC... ? Ah:

https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/MDC.html

Yeah... I'm a little fuzzy on exactly which logging system GeoServer uses
(I don't spend enough time in Java), but the docs says it uses
java.util.logging but has log4j underneath (so MDC is available)? slf4j
looks like it wraps MDC for java.util.logging (
http://www.slf4j.org/api/org/slf4j/MDC.html) and that seems to be in the
POMs too... so I think I should be able to get it working somehow
¯\_(ツ)_/¯

slf4j is not really used, it's there because some libs we depend onto are
using it.
GeoServer by default uses a java logging to log4j bridge coming from
GeoTools, but there is a system var
one can set to just use java logging instead.
See:
https://github.com/geoserver/geoserver/blob/master/src/main/src/main/java/org/geoserver/logging/LoggingInitializer.java
https://github.com/geoserver/geoserver/blob/master/src/main/src/main/java/org/geoserver/logging/LoggingUtils.java
and:
https://github.com/geotools/geotools/tree/master/modules/library/metadata/src/main/java/org/geotools/util/logging

Sounds good, I hadn't thought about WPS. From my quick reading of the

docs/code:

   - &request=Execute will start processing, and creates an ExecutionID
   - If the RequestID from the Execute request is attached to the
   background processing via ThreadLocalTransfer & potentially via the
   Hazelcast clustering, logs/etc can keep being associated with the initial
   Execute user request if that's useful
   - For Async requests any Dismiss/Status/Result requests require the
   ExecutionID to be passed in. They'll each have their own RequestIDs as well.
   - WPS ExecutionIDs should probably be put into the MDC/monitor/SQL/etc
   contexts too if users want to track/log/use them for similar reasons.

Thinking further, it would also be good to pass request IDs to any
upstream cascaded WFS/WMS servers too, continuing the goal that a single
end-user request can be traced/followed as far as possible. Though I
haven't looked at *how* yet :slight_smile:

Yes, that's something I've seen doing in service oriented architectures, so
that a single id is passed around across services.
Doing so might not be so easy, but have a look at
https://github.com/geotools/geotools/blob/master/modules/library/main/src/main/java/org/geotools/data/ows/HTTPClient.java

its implementations and how they are used by the WMS client, and... maybe
by the wfs client too (don't really know what the
wfs-ng store is doing internally, you'll have to check)

Cheers
Andrea

--

GeoServer Professional Services from the experts! Visit
http://goo.gl/it488V for more information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via di Montramito 3/A
55054 Massarosa (LU)
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it

*AVVERTENZE AI SENSI DEL D.Lgs. 196/2003*

Le informazioni contenute in questo messaggio di posta elettronica e/o
nel/i file/s allegato/i sono da considerarsi strettamente riservate. Il
loro utilizzo è consentito esclusivamente al destinatario del messaggio,
per le finalità indicate nel messaggio stesso. Qualora riceviate questo
messaggio senza esserne il destinatario, Vi preghiamo cortesemente di
darcene notizia via e-mail e di procedere alla distruzione del messaggio
stesso, cancellandolo dal Vostro sistema. Conservare il messaggio stesso,
divulgarlo anche in parte, distribuirlo ad altri soggetti, copiarlo, od
utilizzarlo per finalità diverse, costituisce comportamento contrario ai
principi dettati dal D.Lgs. 196/2003.

The information in this message and/or attachments, is intended solely for
the attention and use of the named addressee(s) and may be confidential or
proprietary in nature or covered by the provisions of privacy act
(Legislative Decree June, 30 2003, no.196 - Italy's New Data Protection
Code).Any use not in accord with its purpose, any disclosure, reproduction,
copying, distribution, or either dissemination, either whole or partial, is
strictly forbidden except previous formal approval of the named
addressee(s). If you are not the intended recipient, please contact
immediately the sender by telephone, fax or e-mail and delete the
information in this message that has been received in error. The sender
does not give any warranty or accept liability as the content, accuracy or
completeness of sent messages and accepts no responsibility for changes
made after they were sent or for other risks which arise as a result of
e-mail transmission, viruses, etc.

-------------------------------------------------------

Hi,

For our application stack, we use Zipkin with a backing Cassandra store in a Docker environment. This also uses a custom httpclient that is integrated into the various levels of the app stack and service implementations. Zipkin adds a custom header that is tracked through the calls.

Chris Snider

Senior Software Engineer

Intelligent Software Solutions, Inc.

Description: Description: Description: cid:image001.png@...3926...

(attachments)

image001.png

···

On Fri, Mar 18, 2016 at 12:53 PM, Robert Coup <robert.coup@…1454…> wrote:

One small extra note, it could also be useful to have also have the request id in the response header, so that

if I have a problematic request in the client, I can go and look for the id and then grep the logs for just the log

lines pertaining to it.

Well, if it’s generally useful I could simplify the approach a bit:

  • Add a global config option with a header to look at for inbound requests (ie. X-Request-ID or whatever). For Geoserver instances without a reverse proxy generating UUIDs sitting in front, they would just create a UUID for each request like precent.
  • Always add the request ID to the logging/monitor/sql contexts (and pass it through to WPS/etc as discussed below), and have the Dispatcher set it up initially.
  • Always return the request ID as a response header (either using the configured header name, or defaulting to GeoServer-Request-ID).

Always returning a header would add 60 bytes (using the above name) to every response… how do people feel about that?

I’m not particularly worried about it… just look at how many headers GWC sets on returned tiles.

Personally I think OGC service responses are verbose enough that it’d be unlikely to make any difference to users, and the ability to trace requests and debug issues quickly and effectively massively trumps that overhead. But most admins won’t use it most of the time, so I’m interested in others views. We could always make it disable-able if only a few admins might want to optimise and are prepared to run with a custom applicationContext config.

Admins rarely touch app context, but are normally happy to play with system variables. GeoServerExtensions.getProperty allows to pick a variable from env, system, and servlet context, which allows a range of options.

Presumably changing the default behaviour and adding a new config setting would need a GSIP?

Hum… it’s unclear to me if a GSIP is needed for this, or not, since no programming interface or user visible change is performed but… I guess it would not hurt either?

Personally I’m happy with some conversation on the ml, let’s hear what the others think.

  • Add it to the logging MDC context in the same custom callback

MDC… ? Ah: https://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/MDC.html

Yeah… I’m a little fuzzy on exactly which logging system GeoServer uses (I don’t spend enough time in Java), but the docs says it uses java.util.logging but has log4j underneath (so MDC is available)? slf4j looks like it wraps MDC for java.util.logging (http://www.slf4j.org/api/org/slf4j/MDC.html) and that seems to be in the POMs too… so I think I should be able to get it working somehow ¯_(ツ)_/¯

slf4j is not really used, it’s there because some libs we depend onto are using it.

GeoServer by default uses a java logging to log4j bridge coming from GeoTools, but there is a system var

one can set to just use java logging instead.

See:

https://github.com/geoserver/geoserver/blob/master/src/main/src/main/java/org/geoserver/logging/LoggingInitializer.java

https://github.com/geoserver/geoserver/blob/master/src/main/src/main/java/org/geoserver/logging/LoggingUtils.java

and:

https://github.com/geotools/geotools/tree/master/modules/library/metadata/src/main/java/org/geotools/util/logging

Sounds good, I hadn’t thought about WPS. From my quick reading of the docs/code:

  • &request=Execute will start processing, and creates an ExecutionID
  • If the RequestID from the Execute request is attached to the background processing via ThreadLocalTransfer & potentially via the Hazelcast clustering, logs/etc can keep being associated with the initial Execute user request if that’s useful
  • For Async requests any Dismiss/Status/Result requests require the ExecutionID to be passed in. They’ll each have their own RequestIDs as well.
  • WPS ExecutionIDs should probably be put into the MDC/monitor/SQL/etc contexts too if users want to track/log/use them for similar reasons.

Thinking further, it would also be good to pass request IDs to any upstream cascaded WFS/WMS servers too, continuing the goal that a single end-user request can be traced/followed as far as possible. Though I haven’t looked at how yet :slight_smile:

Yes, that’s something I’ve seen doing in service oriented architectures, so that a single id is passed around across services.

Doing so might not be so easy, but have a look at

https://github.com/geotools/geotools/blob/master/modules/library/main/src/main/java/org/geotools/data/ows/HTTPClient.java

its implementations and how they are used by the WMS client, and… maybe by the wfs client too (don’t really know what the

wfs-ng store is doing internally, you’ll have to check)

Cheers

Andrea

==

GeoServer Professional Services from the experts! Visit

http://goo.gl/it488V for more information.

==

Ing. Andrea Aime

@geowolf

Technical Lead

GeoSolutions S.A.S.
Via di Montramito 3/A
55054 Massarosa (LU)

phone: +39 0584 962313

fax: +39 0584 1660272

mob: +39 339 8844549

http://www.geo-solutions.it

http://twitter.com/geosolutions_it

AVVERTENZE AI SENSI DEL D.Lgs. 196/2003

Le informazioni contenute in questo messaggio di posta elettronica e/o nel/i file/s allegato/i sono da considerarsi strettamente riservate. Il loro utilizzo è consentito esclusivamente al destinatario del messaggio, per le finalità indicate nel messaggio stesso. Qualora riceviate questo messaggio senza esserne il destinatario, Vi preghiamo cortesemente di darcene notizia via e-mail e di procedere alla distruzione del messaggio stesso, cancellandolo dal Vostro sistema. Conservare il messaggio stesso, divulgarlo anche in parte, distribuirlo ad altri soggetti, copiarlo, od utilizzarlo per finalità diverse, costituisce comportamento contrario ai principi dettati dal D.Lgs. 196/2003.

The information in this message and/or attachments, is intended solely for the attention and use of the named addressee(s) and may be confidential or proprietary in nature or covered by the provisions of privacy act (Legislative Decree June, 30 2003, no.196 - Italy’s New Data Protection Code).Any use not in accord with its purpose, any disclosure, reproduction, copying, distribution, or either dissemination, either whole or partial, is strictly forbidden except previous formal approval of the named addressee(s). If you are not the intended recipient, please contact immediately the sender by telephone, fax or e-mail and delete the information in this message that has been received in error. The sender does not give any warranty or accept liability as the content, accuracy or completeness of sent messages and accepts no responsibility for changes made after they were sent or for other risks which arise as a result of e-mail transmission, viruses, etc.


Hi Chris,

···

On 18 March 2016 at 14:35, Chris Snider <chris.snider@anonymised.com65…> wrote:

For our application stack, we use Zipkin with a backing Cassandra store in a Docker environment. This also uses a custom httpclient that is integrated into the various levels of the app stack and service implementations. Zipkin adds a custom header that is tracked through the calls.

Looks like Zipkin is a more advanced approach along the same lines. From what I can tell from a quick read it uses 3-5 headers?

X-B3-TraceId: overall trace id

X-B3-SpanId: id for current step/component in a graph of services

X-B3-ParentSpanId: SpanId from any parent component/step

X-B3-Flags & X-B3-Sampled: track “debug” requests and whether additional performance sampling is requested

Each ID is 64bits hex-encoded (ie. half the length of a UUID)

Does that sound about right?

I’m not overly keen to add such a complex implementation – would what’s currently be proposed be useful for you? Do your team find the ability to trace requests helpful?

https://brandur.org/request-ids discusses a similar pattern as I’ve proposed, though each step (eg. frontend, GeoServer, WPS task, cascaded WFS, database) would all generate their own UUIDs then append them together.

Cheers,

Rob :slight_smile:

Hi Robert,

For myself, I didn’t use the zipkin myself too much, although our testers and infrastructure teams used the traces heavily while doing system and integration testing. These traces helped find some performance issues between web front ends and various services tracing the amount of time each level of a request required. The extra level of span traces were particularly useful to one of the teams tracking down a slow-down with one of the backend services

Request tracing would be a boon to anyone trying to find issues with a system. Linking the full trace with an ID would make issue and performance tracking much easier.

Regarding the number of headers in zipkin, I didn’t work on that aspect so I can’t speak confidently on the number and type. I only had to change my Spring configuration to pull in the definitions defined by one of the other teams.

Chris Snider

Senior Software Engineer

Intelligent Software Solutions, Inc.

Description: Description: Description: cid:image001.png@...3926...

(attachments)

image001.png

···

Hi Chris,

On 18 March 2016 at 14:35, Chris Snider <chris.snider@…2565…> wrote:

For our application stack, we use Zipkin with a backing Cassandra store in a Docker environment. This also uses a custom httpclient that is integrated into the various levels of the app stack and service implementations. Zipkin adds a custom header that is tracked through the calls.

Looks like Zipkin is a more advanced approach along the same lines. From what I can tell from a quick read it uses 3-5 headers?

X-B3-TraceId: overall trace id

X-B3-SpanId: id for current step/component in a graph of services

X-B3-ParentSpanId: SpanId from any parent component/step

X-B3-Flags & X-B3-Sampled: track “debug” requests and whether additional performance sampling is requested

Each ID is 64bits hex-encoded (ie. half the length of a UUID)

Does that sound about right?

I’m not overly keen to add such a complex implementation – would what’s currently be proposed be useful for you? Do your team find the ability to trace requests helpful?

https://brandur.org/request-ids discusses a similar pattern as I’ve proposed, though each step (eg. frontend, GeoServer, WPS task, cascaded WFS, database) would all generate their own UUIDs then append them together.

Cheers,

Rob :slight_smile: