[GeoNetwork-devel] database connections GN 2.4.1

Hi

We have a separate DB machine running MySQL, and GeoNetwork connects to it. We are running GeoNetwork 2.4.1.

The DB had a hard outage lasting about 15 minutes. From the logs it looks like GeoNetwork tried to reconnect 3 times, then gave up, and it didn't give any intelligent output on the GUI.

The logs have been 'over summarised': no stack traces in the db-attempt-connect snippet.

How can we make GeoNetwork keep trying to reconnect?

How can we make GeoNetwork tell the user that something has gone wrong in a more intelligent form? (Our monitoring software, Nagios, should also be able to detect if it's broken.)

Regards,

Terry Rankine

<db-attempt-connect>
2010-02-16 06:21:55,748 ERROR [jeeves.engine] - Raised exception while initializing resource. Skipped.
2010-02-16 06:21:55,750 ERROR [jeeves.engine] - Resource : main-db
2010-02-16 06:21:55,750 ERROR [jeeves.engine] - Provider : jeeves.resources.dbms.DbmsPool
2010-02-16 06:21:55,750 ERROR [jeeves.engine] - Exception : com.mysql.jdbc.CommunicationsException: Communications link failure due to underlying exception:
** BEGIN NESTED EXCEPTION **
java.net.ConnectException
MESSAGE: Connection refused
STACKTRACE:
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
Last packet sent to the server was 1 ms ago.
2010-02-16 06:21:55,751 ERROR [jeeves.engine] - Message : Communications link failure due to underlying exception:
** BEGIN NESTED EXCEPTION **
java.net.ConnectException
MESSAGE: Connection refused
STACKTRACE:
java.net.ConnectException: Connection refused
Last packet sent to the server was 1 ms ago.
2010-02-16 06:21:55,751 ERROR [jeeves.engine] - Stack : com.mysql.jdbc.CommunicationsException: Communications link failure due to underlying exception:
** BEGIN NESTED EXCEPTION **
java.net.ConnectException
MESSAGE: Connection refused
STACKTRACE:
java.net.ConnectException: Connection refused
Last packet sent to the server was 1 ms ago.
    at com.mysql.jdbc.Connection.createNewIO(Connection.java:2820)
    at com.mysql.jdbc.Connection.<init>(Connection.java:1553)
2010-02-16 06:21:56,417 ERROR [jeeves.engine] - Raised exception while starting appl handler. Skipped.
2010-02-16 06:21:56,417 ERROR [jeeves.engine] - Handler : org.fao.geonet.Geonetwork
2010-02-16 06:21:56,417 ERROR [jeeves.engine] - Exception : java.lang.NullPointerException
2010-02-16 06:21:56,417 ERROR [jeeves.engine] - Message : null
2010-02-16 06:21:56,417 ERROR [jeeves.engine] - Stack : java.lang.NullPointerException
    at jeeves.server.resources.ResourceManager.open(ResourceManager.java:68)
    at org.fao.geonet.Geonetwork.start(Geonetwork.java:88)
</db-attempt-connect>

<unintelligent-log>
2010-02-16 08:30:29,141 ERROR [jeeves.service] - Exception when executing service
2010-02-16 08:30:29,155 ERROR [jeeves.service] - (C) Exc : java.lang.NullPointerException
2010-02-16 08:30:29,165 ERROR [jeeves.service] - Exception executing gui service : java.lang.NullPointerException
2010-02-16 08:30:29,166 ERROR [jeeves.service] - (C) Stack trace is :
java.lang.NullPointerException
    at org.fao.geonet.guiservices.util.Env.exec(Env.java:53)
    at jeeves.server.dispatchers.guiservices.Call.exec(Call.java:75)
    at jeeves.server.dispatchers.AbstractPage.invokeGuiService(AbstractPage.java:119)
    at jeeves.server.dispatchers.AbstractPage.invokeGuiServices(AbstractPage.java:103)
    at jeeves.server.dispatchers.ServiceManager.dispatchError(ServiceManager.java:724)
    at jeeves.server.dispatchers.ServiceManager.handleError(ServiceManager.java:465)
    at jeeves.server.dispatchers.ServiceManager.dispatch(ServiceManager.java:410)
    at jeeves.server.JeevesEngine.dispatch(JeevesEngine.java:621)
    at jeeves.server.sources.http.JeevesServlet.execute(JeevesServlet.java:174)
    at jeeves.server.sources.http.JeevesServlet.doGet(JeevesServlet.java:89)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190)
    at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)
    at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:769)
    at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:698)
    at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:891)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:690)
    at java.lang.Thread.run(Thread.java:619)
</unintelligent-log>

Hi Terry,

I'm not sure you should expect intelligent user diagnostics from the guiservices (the ones in your unintelligent-log). These are just helpers that typically provide additional info (e.g. a list of Z39.50 repositories) to the XSLTs that present HTML output from a service to the user, via an XPath such as /root/gui/z3950repositories.

Do you think that the failure to initialize the DbmsPool via the ResourceManager (the first exception in your db-attempt-connect log) and/or the failure to open a Dbms using the resource manager (the NullPointerException, the second one in your db-attempt-connect log) during GeoNetwork startup should cause the server to halt? My own feeling is yes, as there are other initialization tasks that rely on the pool being initialized and at least one connection to the database being open.
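As a rough illustration of the startup failure discussed here (this is not the actual Jeeves ResourceManager), the sketch below shows a registry that remembers why a resource failed to initialize and raises a descriptive error on open, rather than a bare NullPointerException. The class SketchResourceRegistry and its methods are hypothetical names introduced only for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical resource registry; it illustrates the failure mode only, not Jeeves itself. */
final class SketchResourceRegistry {
    private final Map<String, Object> providers = new HashMap<>();
    private final Map<String, Exception> initFailures = new HashMap<>();

    /** Called when a resource initialized successfully. */
    void register(String name, Object provider) {
        providers.put(name, provider);
    }

    /** Called when initialization of a resource was skipped because of an exception. */
    void recordFailure(String name, Exception cause) {
        initFailures.put(name, cause);
    }

    /** Returns the provider, or fails with a message that names the resource and the cause. */
    Object open(String name) {
        Object provider = providers.get(name);
        if (provider == null) {
            Exception cause = initFailures.get(name);
            String detail = (cause != null) ? ": " + cause.getMessage() : "";
            throw new IllegalStateException(
                    "Resource '" + name + "' is not available" + detail, cause);
        }
        return provider;
    }

    public static void main(String[] args) {
        SketchResourceRegistry registry = new SketchResourceRegistry();
        registry.recordFailure("main-db", new java.net.ConnectException("Connection refused"));
        try {
            registry.open("main-db");
        } catch (IllegalStateException e) {
            // The message names the resource and the original cause.
            System.out.println(e.getMessage());
        }
    }
}
```

Whether the server should then halt or keep running with an error page is a separate policy choice; the sketch only makes the cause visible at the point of use.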

As to what should/will happen if the database disappears sometime after startup (possibly during processing), it's a matter of examining DbmsPool.java and Dbms.java in jeeves/src/jeeves/resources/dbms. At the moment it looks to me as though some reconnect attempts may be made in the open method of DbmsPool.java (e.g. if the connection to the Dbms has closed, or if the reconnect time in the pool parameters section of web/geonetwork/WEB-INF/config.xml is non-zero and has been exceeded), but it's not completely clear what happens if the connection to the database cannot be re-opened. More analysis required.

Hope this helps at least a bit!

Cheers,
Simon

________________________________________
From: Terry.Rankine@anonymised.com [Terry.Rankine@anonymised.com]
Sent: Tuesday, 16 February 2010 1:33 PM
To: geonetwork-devel@lists.sourceforge.net
Subject: [ExternalEmail] [GeoNetwork-devel] database connections GN 2.4.1


Hi Simon

There are too many web applications which put up errors like "Oooops - it looks like I have lost connection to my database". I guess that not every exception is good to show an end user, but I am convinced that things that make the application stop should make it back to the GUI, telling the user that it's broken.

As for what to do: if you can't reach the DB, then load up the error page (not an HTTP 200 response) with the error. Blocking the app from starting is, I am pretty sure, a preference thing. Our preference would be for the webapp to load and then tell us that it can't connect to the DB, rather than just blocking the webapp from starting. But the other argument is valid too.
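A minimal Java sketch of the "not an HTTP 200 response" idea, under the assumption that some check of database reachability is available: the probe maps a failed or throwing check to status 503, which a monitoring tool such as Nagios can typically alert on. DbHealthProbe and statusFor are hypothetical names, not part of GeoNetwork or Jeeves.

```java
import java.util.concurrent.Callable;

/** Hypothetical health probe: maps database reachability to an HTTP status code. */
final class DbHealthProbe {
    static final int OK = 200;
    static final int SERVICE_UNAVAILABLE = 503;

    /**
     * Runs the supplied check and returns 200 only if it reports success.
     * A false result or any exception is reported as 503.
     */
    static int statusFor(Callable<Boolean> dbCheck) {
        try {
            return Boolean.TRUE.equals(dbCheck.call()) ? OK : SERVICE_UNAVAILABLE;
        } catch (Exception e) {
            return SERVICE_UNAVAILABLE;
        }
    }

    public static void main(String[] args) {
        // Simulated outage: the check throws, as a refused connection would.
        int status = statusFor(() -> {
            throw new java.net.ConnectException("Connection refused");
        });
        System.out.println("HTTP status to report: " + status);
    }
}
```

In a real deployment the check would be something like a trivial query against the configured database, and the status would be set on the servlet response for the error page.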

For our immediate needs: in the XML config, can we specify connection retries or other 'try again' options? Or does this have to go into the Java source (not a preferred option)?

Terry Rankine

-----Original Message-----
From: Pigot, Simon (CMAR, Hobart)
Sent: Tuesday, 16 February 2010 8:41 PM
To: Rankine, Terry (CESRE, Kensington); geonetwork-devel@lists.sourceforge.net
Subject: RE: database connections GN 2.4.1


Hi Terry,

Terry.Rankine@anonymised.com wrote:

> Hi Simon
>
> There are too many web applications which put up errors like "Oooops - it looks like I have lost connection to my database". I guess that not every exception is good to show an end user, but I am convinced that things that make the application stop should make it back to the GUI, telling the user that it's broken.

Yep - I agree.

> As for what to do: if you can't reach the DB, then load up the error page (not an HTTP 200 response) with the error. Blocking the app from starting is, I am pretty sure, a preference thing. Our preference would be for the webapp to load and then tell us that it can't connect to the DB, rather than just blocking the webapp from starting. But the other argument is valid too.

Yep. Re: the other argument: expecting GeoNetwork admins to be able to examine log files for errors is OK if the GeoNetwork admin has access to the machines running the web app, but this cannot be assumed. If these things are reported in the log files, the ability to examine some events from the log files through the web interface (crossover with a current proposal by Archie Warnock/Doug Nebert perhaps?) would be helpful.

> For our immediate needs: in the XML config, can we specify connection retries or other 'try again' options? Or does this have to go into the Java source (not a preferred option)?

You can set the following variables in the database config part of web/geonetwork/WEB-INF/config.xml (there are examples of the others in the config file already):

maxTries: how many times the pool of database connections is iterated to find a connection that isn't in use (locked). Default: 20.
maxWait: time between each iteration of the entire database connection pool. Default: 200 msecs.
reconnectTime: when a free (unlocked) database connection is found during an iteration of the pool, it is closed and re-connected if it has been connected longer than reconnectTime. Default: 0 (never reconnect). Note though that if the database connection is found to be closed, reconnectTime is set to 1 and the connection is re-connected.

But more analysis is needed, e.g. will the database connection be found in a closed state if the database disappears during processing, thus automatically triggering a reconnect? And what happens if an attempt to reconnect fails?
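The three parameters described above can be sketched as a simple acquisition loop. This is a simplified, single-threaded illustration, not the code in DbmsPool.java: SketchConn and SketchPool are hypothetical classes, times are in milliseconds by assumption (the unit of reconnectTime is not stated here), a closed connection is simply reconnected rather than going through the "set reconnectTime to 1" step, and throwing when the tries are exhausted is a choice made for the sketch.

```java
import java.util.List;

/** Hypothetical stand-in for a pooled database connection; not the Jeeves Dbms class. */
final class SketchConn {
    boolean locked;          // true while a caller is using the connection
    boolean closed;          // true if the underlying link has dropped
    long connectedAtMillis;  // when the link was last (re)opened
    int reconnects;          // number of reconnects, kept for illustration

    void reconnect(long nowMillis) {
        closed = false;
        connectedAtMillis = nowMillis;
        reconnects++;
    }
}

/** Sketch of a pool lookup driven by maxTries, maxWait and reconnectTime. */
final class SketchPool {
    private final List<SketchConn> conns;
    private final int maxTries;             // passes over the whole pool
    private final long maxWaitMillis;       // pause between passes
    private final long reconnectTimeMillis; // 0 means never reconnect because of age

    SketchPool(List<SketchConn> conns, int maxTries, long maxWaitMillis, long reconnectTimeMillis) {
        this.conns = conns;
        this.maxTries = maxTries;
        this.maxWaitMillis = maxWaitMillis;
        this.reconnectTimeMillis = reconnectTimeMillis;
    }

    /** Returns a locked connection, reconnecting it first if it is closed or too old. */
    SketchConn open() {
        for (int attempt = 0; attempt < maxTries; attempt++) {
            long now = System.currentTimeMillis();
            for (SketchConn c : conns) {
                if (c.locked) {
                    continue; // in use, look at the next one
                }
                boolean tooOld = reconnectTimeMillis > 0
                        && now - c.connectedAtMillis > reconnectTimeMillis;
                if (c.closed || tooOld) {
                    c.reconnect(now);
                }
                c.locked = true;
                return c;
            }
            try {
                Thread.sleep(maxWaitMillis); // every connection was locked, wait and retry
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        throw new IllegalStateException(
                "No free database connection after " + maxTries + " tries");
    }
}
```

With the defaults quoted above (20 tries, 200 msecs between passes) a caller would wait roughly four seconds before giving up. Note that these settings govern finding a free connection in the pool; they do not by themselves answer what happens when a reconnect fails.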

Cheers,
Simon


_______________________________________________
GeoNetwork-devel mailing list
GeoNetwork-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-devel
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork