[Geoserver-devel] Some consideration on GeoServer logging

Hi,
this morning I took the time to analyze what's going on
with GeoServer logging.

On startup, installs the gt2 CommonHandler on
org.geotools, org.geoserver and org.vfny.geoserver
(the root packages of GeoTools and GeoServer, respectively).

Now, this alone is not sufficient, and by a long stretch,
because it does not alter the logging levels.
In GeoServer 1.6 we have the log configured by a log4j file,
which may look like the following:
https://svn.codehaus.org/geoserver/trunk/geoserver/web/src/main/java/GEOSERVER_DEVELOPER_LOGGING.properties

The above file is usually held in the data dir, where the users
can modify it, and configures the log4j handlers and the log4j
logging levels. Please note how different packages receive
different logging levels.

Now, the problem is how to transport the per package logging level settings from that property file to the java loggers and... here
that's where we are screwed.

Now, first of all, why do we need to change the java logging level?
Performance. Theoretically one could have java logging log everything
and have log4j deal with the loggability of each message at the
handler level... too bad this would literally make GeoServer 4 times
slower (see the FOSS4G presentation "MapServer vs GeoServer" to
see how much logging may impact rendering performance).

The current code in GeoServer.configureGeoServerLogging tries
the following:
* parse the log4j file
* find out every java logger by calling LogManager.getLogManager().getLoggerNames()
* for each of them, state a new logging level (using the
   logging level found in the log4j file for that package
   or for the nearest containing package)

This is very considerate, in that the code will touch only
the loggers for which we know about in the log4j package,
leaving every other logger untouched -> it will impact
only the other web applications that use geotools and
geoserver packages.

Unfortunately it does not work. The problem is that
LogManager.getLogManager().getLoggerNames() will return
only the loggers that have been registered so far.
So if, for example, at that moment no rendering occured,
the "org.geotools.renderer" logger is not there, and its
logging level won't be configured.
When that logger gets created (first time StreamingRenderer
is referenced), LogManager will configure it using the
directives contained in the java logging property file,
which in my case say INFO.

Double checking is easy. Set the logging level to
GEOTOOLS_DEVELOPER_LOGGING (everything in org.geotools
should log to FINE level, more on this later*),
start geoserver, have something rendered.... the log
shows nothing (because the rendering package is still
configured at INFO level, see above).
Now go to the geoserver configuration, press apply,
this will force GeoServer.configureGeoServerLogging
to run once more, and this time it'll find "org.geotools.renderer"
and configure it to FINE level.
Do a WMS request again, boom, lots of logs in the output.

In order to have java logging use the log level we want
we would have to make up a java logging configuration file
that corresponds to the log4j file, stating the appropriate
log levels. Now, this has its own fair share of issues:
* we would have to make up a full log file, so we would have
   to configure the handlers too. If an admin has configured
   java loggin in any way we would be overriding its work,
   unless we can access the java logging property file and
   parse it so that we keep whatever is external to
   geotools/geoserver in place. We may not be able to do so
   thought, because our code is not as trusted as the one
   in the jvm itself (think running in a security manager)
* calling LogManager.readConfiguration(inputStream) will
   reconfigure loggin for _every_ other web application
   running in the same container. That's a big no no for
   every web app (you should not mess with other application
   state)

To avoid this mess we should roll into GeoServer the kind
of java.util.LogManager replacement the Tomcat developers
came up with, JULI (see: http://tomcat.apache.org/tomcat-5.5-doc/logging.html).
The problem is, I think they manage to do so because
they are in control of the classloading, and we are not.
I also don't see any way to force a replacement of LogManager
using the java logging API, LogManager is a class, not an
interface...

I also would like to point out that in GeoServer 1.5.x
we were no better. We were usign only java logging, but
in order to configure the logging level from the UI
we were taking all java logging handlers and configure them
with a single flat level. It worked because it played with
handlers instead of loggers (they are all instantiated at
startup), but nevertheless it was not ok because again
it was a VM wide modification of the logging levels.

I really don't know how to address this. The short term
solution I see:
* go back to a flat setting of the logging level, instead
   of per package (too bad, trying to diagnose an error
   setting the log level of a specific package is very useful)
* build up a java logging property file on the fly and have
   java logging ingest it

Both solutions are very suboptimal in my opinion.
We would not be in such a mess if GeoTools used commons logging.
Martin, are you sure there is no way to have commons logging
report the thread information that you need?

Cheers
Andrea

Hello all

Every issues raided by Andrea are of course legitimates. I will post some
possible paths below.

As Justin point out, commons-logging is widely used in server applications. But
this is one community; other communities (e.g. scientists) use different
frameworks. For example UCAR's NetCDF library uses SLF4J. UCAR provides a JAR
that redirect NetCDF loggings to java.util.logging, but they do not provide a
JAR that redirect them to common-login; we have to get it elswhere (SLF4J web
site provides a migration path *away* from commons-login).

Commons-login is reputed to have class-loader problems that other framework
don't have: http://www.qos.ch/logging/classloader.jsp

There is also the logging events sent by the JVM itself, which use Java logging.
Of course we can decide that JVM logs are not of interest.

Also, are you sure that common-loging would work with Glassfish? This guys seems
to had a lot of trouble getting Log4J to work with Glassfish (please take also a
look at the comments):

http://weblogs.java.net/blog/schaefa/archive/2007/08/to_the_hell_wit_1.html

I would like to quote "whartung"'s comment from the above page. He explain well
my view (again I'm not critizing common-login goals; I question the solution
that Apache choose for reaching those goals):

    "Commons-logging took the wrong approach layering on top of the
     various logging systems, when it should have simply changed the
     plumbing that JDK Logging provides, and then provide a sugary
     layer on top of JDK Logging."

Note that this is actually what JULI started to do (unfortunatly a little bit
late). I believe that JULI is a good approach that commons-loggin should have
taken right from the begining...

--------------------------------------------------------------------

Now some possible approachs:

If we really want to use something like commons-login, maybe we should
investigate first if SLF4J could be a better replacement. Note that SLF4J may
suffer the same problems than common-login regarding JVM and Glassfish loggings.

My personal preference is of course to fix the issues we have with java login.
The easiest fix is to create a java.util.logging.Logger subclass which redirect
everything to commons-login - it is easier and much more straightforward to
implement (so much easier to understand and debug) than the existing
org.geotools.util.CommonHandler and I believe that it would fix most of Andrea's
issue. The only problem (maybe a big one - we need to investigate) is to get our
own LogManager to be registered. We can do that with the following command-line
option:

   -Djava.util.logging.config.class=org.geotools.util.CommonLogManager

For packages other than Geotools and Geoserver, this manager would redirect to
Apache's JULI if present, or the default Java log manager otherwise.

Would it be acceptable to suggest users to put this property configuration? I
would put CommonLogManager and CommonHandler in a very small standalone JAR so
users don't get a big dependency toward Geotools. The CommonHandler would stay
around for providing less-than-ideal (but better than nothing) fallback for
users would don't set this property.

  Martin

Martin Desruisseaux ha scritto:

Hello all

Every issues raided by Andrea are of course legitimates. I will post some
possible paths below.

As Justin point out, commons-logging is widely used in server applications. But
this is one community; other communities (e.g. scientists) use different
frameworks. For example UCAR's NetCDF library uses SLF4J. UCAR provides a JAR
that redirect NetCDF loggings to java.util.logging, but they do not provide a
JAR that redirect them to common-login; we have to get it elswhere (SLF4J web
site provides a migration path *away* from commons-login).

Commons-login is reputed to have class-loader problems that other framework
don't have: http://www.qos.ch/logging/classloader.jsp

I'm aware of that. But it's way less problematic than the VM wide configuration issues that java logging carries, at least in the enterprise java space.

There is also the logging events sent by the JVM itself, which use Java logging.
Of course we can decide that JVM logs are not of interest.

Also, are you sure that common-loging would work with Glassfish? This guys seems
to had a lot of trouble getting Log4J to work with Glassfish (please take also a
look at the comments):

http://weblogs.java.net/blog/schaefa/archive/2007/08/to_the_hell_wit_1.html

I would like to quote "whartung"'s comment from the above page. He explain well
my view (again I'm not critizing common-login goals; I question the solution
that Apache choose for reaching those goals):

    "Commons-logging took the wrong approach layering on top of the
     various logging systems, when it should have simply changed the
     plumbing that JDK Logging provides, and then provide a sugary
     layer on top of JDK Logging."

Note that this is actually what JULI started to do (unfortunatly a little bit
late). I believe that JULI is a good approach that commons-loggin should have
taken right from the begining...

JULI is a tomcat specific thing that requires you to control class
loading. It's not a solution for a library, nor is for GeoServer (which
strives to be container independent and ships, in its binary incarnation, with Jetty).

--------------------------------------------------------------------

Now some possible approachs:

If we really want to use something like commons-login, maybe we should
investigate first if SLF4J could be a better replacement. Note that SLF4J may
suffer the same problems than common-login regarding JVM and Glassfish loggings.

Seeing that Wicket uses it gives me some hope. Yet seeing their site say
that Hibernate uses it whilst it's not true takes away some of the confidence (I've just downloaded the latest hibernate release, still uses commons logging).

My personal preference is of course to fix the issues we have with java login.
The easiest fix is to create a java.util.logging.Logger subclass which redirect
everything to commons-login - it is easier and much more straightforward to
implement (so much easier to understand and debug) than the existing
org.geotools.util.CommonHandler and I believe that it would fix most of Andrea's
issue.

I'm not sure how it would solve the logging level configuration problem and the VM wide parameter settings. For example, how can I have two GeoServer instances running in the same web container but using different logging levels (not a made up case, think having two geoserver inside the same web container, one configured for production and the other for testing).
Can you explain?

The only problem (maybe a big one - we need to investigate) is to get our
own LogManager to be registered. We can do that with the following command-line
option:

   -Djava.util.logging.config.class=org.geotools.util.CommonLogManager

For packages other than Geotools and Geoserver, this manager would redirect to
Apache's JULI if present, or the default Java log manager otherwise.

Would it be acceptable to suggest users to put this property configuration? I
would put CommonLogManager and CommonHandler in a very small standalone JAR so
users don't get a big dependency toward Geotools. The CommonHandler would stay
around for providing less-than-ideal (but better than nothing) fallback for
users would don't set this property.

Having to use the above stuff is not a viable option for the same reasons I've been repeating over and over. GeoServer ships both
as a WAR and as a binary. In the binary version we are stand alone and we could do the above. In the war version GeoServer is supposed to be
included in the web container of users choice (may be tomcat, may be Oracle Application Server) and the above would be a VM wide setting,
and one you may not be able to force in containers such as WebLogic for
example (they are quite fragile vm configuraiton wise usually, provided
you find any way to pass the VM a config parameter to start with).

One problem to consider in whatever solution we choose is how to handle
the switch of the logging API (given that java logging is structurally
broken for web serving, unless you can prove otherwise).
Say we switch to SL4J. We would have to hunt down every single logging
statement and fix it. I was thinking of an easier path.
Say we roll out our replacement classes that are bit by bit compatibile
with java logging method wise:
org.geotools.logging.Logger
org.geotools.logging.Level
Switching to those classes would be a matter of replacing import statements, a task that "sed" could do efficiently and without
mistakes.

Inside those classes we could then relay to SL4J or to log4j or
to java logging. In a sense it would mean making our commons logging
(I don't like this idea very much) but at the same time I have the
impression it would make the switch to whatever we want possible
in a very short time. I don't know, it's just an idea.

What do you think?
Cheers
Andrea

Andrea Aime ha scritto:

Seeing that Wicket uses it gives me some hope. Yet seeing their site say
that Hibernate uses it whilst it's not true takes away some of the confidence (I've just downloaded the latest hibernate release, still uses commons logging).

I have to amend this one. Hibernate has really switched to sl4j, but
this happened only in the source version that will be released as
3.3 whilst the last release is 3.2.5.
Some more info here:
http://opensource.atlassian.com/projects/hibernate/browse/HHH-2696

Cheers
Andrea

Andrea Aime a écrit :

JULI is a tomcat specific thing that requires you to control class
loading. It's not a solution for a library

Sure. I cited JULI only as an illustration of an Apache project following a path
that they could have taken in common-logging.

[...about improving our redirection to java logging...]

I'm not sure how it would solve the logging level configuration problem

We could override java.util.logging.Logger that way:

class CommonLogger extends Logger {
    private final Log commonLogger = ...

    @Override
    public void info(String message) {
        commonLogger.info(message);
    }
    // etc...
}

so we bypass completly all Java logging checks. This is equivalent to the
solution you proposed in your previous email (set java logging to FINEST or ALL
and lets common logger do the filtering itself), but more efficiently since we
pass the call immediately to common-logger.

This redirection would be performed only for Geotools and Geoserver logger (or
any other logger listed by the users). Other loggers would stay the default ones.

The only condition for being able to to that is to get our own LogManager class
instantiated at JVM startup time, i.e. being able to ask to users to provide the
-D option mentioned in my previous email.

For example, how can I have two
GeoServer instances running in the same web container but using
different logging levels

We have a choice:
- This is not possible with the default java logging as it currently
  stand in Java 6.

- It is possible with JULI extension to java logging, which was
  created especially for that purpose in my understanding.

- I think (but I'm not sure) that it would be possible with my above-cited
  proposal without JULI since the log would be forwarded directly to common-
  logging with all java logging code bypassed. If it doesn't work for some
  reason, I think it would work with my proposal + JULI.

Say we roll out our replacement classes that are bit by bit compatibile
with java logging method wise:
org.geotools.logging.Logger
org.geotools.logging.Level
Switching to those classes would be a matter of replacing import
statements, a task that "sed" could do efficiently and without
mistakes.

Inside those classes we could then relay to SL4J or to log4j or
to java logging.

It may be the best compromise. Actually, there is no need to create
org.geotools.logging.Logger/Level class. We could do the following:

- Create the above cited CommonLogger class as in my above proposal
  (package-privated class).

- Adds the following method in the current org.geotools.util.Logging
  class: getLogger(String).

- Replaces every occurence of
      java.util.logging.Logger.getLogger(name);
  by
      org.geotools.util.Logging.getLogger(name);
  in the GeoTools code base. Remainding API (Logger, LogRecord, Level...)
  stay unchanged.

- Maybe delete CommonHandler if we don't need it anymore.

It would produce the same effect than my above proposal, except that there is no
need to register a LogManager if all GeoTools code get the logger through a call
to Logging.getLogger instead of Logger.getLogger.

What do you think?

  Martin

Martin Desruisseaux ha scritto:

Andrea Aime a écrit :

I'm not sure how it would solve the logging level configuration problem

We could override java.util.logging.Logger that way:

class CommonLogger extends Logger {
    private final Log commonLogger = ...

    @Override
    public void info(String message) {
        commonLogger.info(message);
    }
    // etc...
}

so we bypass completly all Java logging checks. This is equivalent to the
solution you proposed in your previous email (set java logging to FINEST or ALL
and lets common logger do the filtering itself), but more efficiently since we
pass the call immediately to common-logger.

That would not solve the problem. The problem is that building logging
messages is expensive, so lots of code has stuff like:

if(LOGGER.isLoggable(Level.FINE)) {
   // do expensive log message building
}

That common logger class would have to work against a user provided set of levels (the ones we grab by parsing the log4j file). Would it be possible?

This redirection would be performed only for Geotools and Geoserver logger (or
any other logger listed by the users). Other loggers would stay the default ones.

The only condition for being able to to that is to get our own LogManager class
instantiated at JVM startup time, i.e. being able to ask to users to provide the
-D option mentioned in my previous email.

That's not possible, this is J2EE, we cannot do anything that is VM wide, only stuff that can be done:
a) while the vm is already running
b) that affects only the current web application and leaves the other
    web applications running in the same VM alone.

For example, how can I have two
GeoServer instances running in the same web container but using
different logging levels

We have a choice:
- This is not possible with the default java logging as it currently
  stand in Java 6.

- It is possible with JULI extension to java logging, which was
  created especially for that purpose in my understanding.

- I think (but I'm not sure) that it would be possible with my above-cited
  proposal without JULI since the log would be forwarded directly to common-
  logging with all java logging code bypassed. If it doesn't work for some
  reason, I think it would work with my proposal + JULI.

Martin, I have the impression I'm not getting my point thru. Whatever
VM wide solution you come up with is not acceptable. Whatever solution
that requires to play with class loaders is not acceptable either.

Say we roll out our replacement classes that are bit by bit compatibile
with java logging method wise:
org.geotools.logging.Logger
org.geotools.logging.Level
Switching to those classes would be a matter of replacing import
statements, a task that "sed" could do efficiently and without
mistakes.

Inside those classes we could then relay to SL4J or to log4j or
to java logging.

It may be the best compromise. Actually, there is no need to create
org.geotools.logging.Logger/Level class. We could do the following:

- Create the above cited CommonLogger class as in my above proposal
  (package-privated class).

- Adds the following method in the current org.geotools.util.Logging
  class: getLogger(String).

- Replaces every occurence of
      java.util.logging.Logger.getLogger(name);
  by
      org.geotools.util.Logging.getLogger(name);
  in the GeoTools code base. Remainding API (Logger, LogRecord, Level...)
  stay unchanged.

- Maybe delete CommonHandler if we don't need it anymore.

It would produce the same effect than my above proposal, except that there is no
need to register a LogManager if all GeoTools code get the logger through a call
to Logging.getLogger instead of Logger.getLogger.

What do you think?

Seems like a good solution... is there any way I can control the creation of those loggers so to assign the right logging level?
They need to answer to LOGGER.isLoggable(Level.XXX) in a way that
reflects the log4j configuraton file.

Cheers
Andrea

Andrea Aime a écrit :

That would not solve the problem. The problem is that building logging
messages is expensive, so lots of code has stuff like:

if(LOGGER.isLoggable(Level.FINE)) {
  // do expensive log message building
}

Thats the same. When I wrote "override java.util.logging.Logger" methods, I mean
override all of them, including isLoggable (pseudo-code, I don't remember the
exact API):

    boolean isLoggable(Level level) {
        switch (level) {
            case INFO: return commonLog.canLogInfo();
            // etc.
        }
    }

Martin, I have the impression I'm not getting my point thru. Whatever
VM wide solution you come up with is not acceptable. Whatever solution
that requires to play with class loaders is not acceptable either.

The VM-wide stuff is at LogManager level only - everything else is not VM wide.
If specifying a -Djava.util.logging.config.class is not acceptable, we can
ignore LogManager and use a custom 'getLogger' methods in Geotools. Everything
else can stay unchanged.

Seems like a good solution... is there any way I can control the
creation of those loggers so to assign the right logging level?
They need to answer to LOGGER.isLoggable(Level.XXX) in a way that
reflects the log4j configuraton file.

There is no need to assing their logging level. See above code snippet.
Everything is redirected to common-logging, including 'isLoggable'.

  Martin

So should I open a JIRA task for "java.util.logger.Logger subclass that redirect
everything to common-logging, including isLoggeable", or do we want to
investigate other proposals?

  Martin

Martin Desruisseaux ha scritto:

So should I open a JIRA task for "java.util.logger.Logger subclass that redirect
everything to common-logging, including isLoggeable", or do we want to
investigate other proposals?

It seems a satisfactory solution. No VM wide configuration, uses logging
levels coming from commons-logging. Seems a win to me as long as there is no need to do anything VM wide.

I'm ok as long as we use a custom getLogger method in Geotools so that
we don't need to pass anything to the VM.

So to sum up and check if I understand properly:
* we create a java.util.logger.Logger subclass, CommonLogger, that
   delegates everything to commons logging
* we switch all Logger.getLogger(xxx) methods to a special gt2
   utility class that will return either the plain java logging
   logger, or the CommonLogger, depending on a configuration
* we'll have some way to switch gt2 logging one way or the other
   (and the same will have to work for GeoServer as well).

Given that commons logging and log4j are well isolated
classloader wise it may even be that two GeoServer instances running
in the same container will be able to work with separate logging levels... oh well, at least this solution has the advantage of
not touching the rest of java logging.

One question still... this approach will preserve java logging handlers right? This may speak trouble. By default java logging is configured
to log onto the console, and log4j too.. won't this mean we'll get
all messages that are loggable according to both subsystems twice
on the console?

Cheers
Andrea

Andrea Aime ha scritto:

One question still... this approach will preserve java logging handlers right? This may speak trouble. By default java logging is configured
to log onto the console, and log4j too.. won't this mean we'll get
all messages that are loggable according to both subsystems twice
on the console?

Answering myself, no, it won't happen because CommonLogger won't call
java logging at all.
Cheers
Andrea

Andrea Aime a écrit :

So to sum up and check if I understand properly:
* we create a java.util.logger.Logger subclass, CommonLogger, that
  delegates everything to commons logging
* we switch all Logger.getLogger(xxx) methods to a special gt2
  utility class that will return either the plain java logging
  logger, or the CommonLogger, depending on a configuration
* we'll have some way to switch gt2 logging one way or the other
  (and the same will have to work for GeoServer as well).

Yes exactly. And as you said in your next email, CommonLogger would not invoke
java logging at all.

So I will open a JIRA task in a few hours, unless there is objection.

  Martin