(this cross-posted to pgcluster-general and geoserver-users lists)
For the last few days I am trying my best to implement a Geoserver that uses PGCluster as the backend. With just postgis everything is fine but with PGCluster, many database connections gets hung. Sometime I see lot of “idle in transaction” threads. I have two physical servers, each running a replicator, load balancer and cluster.
removing the Load balancer and use the cluster directly
starting lb and replication servers in Debug mode.
In Debug mode both LB and Replication servers are spitting out a lot of messages but I saw nothing that is alarming…
I am very new to Pgcluster and hence would like to know if any of you have implemented a Geoserver/Pgcluster environment. Any advise would be greatly appreciated.
– Hari Gangadharan
Get the power of Windows + Web with the new Windows Live. Get it now!
(this cross-posted to pgcluster-general and geoserver-users lists)
For the last few days I am trying my best to implement a Geoserver that uses PGCluster as the backend. With just postgis everything is fine but with PGCluster, many database connections gets hung. Sometime I see lot of "idle in transaction" threads. I have two physical servers, each running a replicator, load balancer and cluster.
Hari,
GeoServer 1.5.x (and earlier) used an home grown connection pooling
library with a few serious issues:
* connection leak under high concurrency
* no ability to limit the number of connections around (if at some
point your application used 50 connections the app would keep
them forever)
* if one connection in the pool was dead (because of a timeout or
a network issue) the pool would try to use it once more before
removing it
Starting with GeoServer 1.6.x we do use DBCP as the connection
pooling library and all of the above issues should be gone.
So I'd urge you to test out 1.6.x before looking further into
the problems you're experiencing.
We have done more research on this and found that the JDBC connections get stuck if a connection is opened against Pgcluster with autoCommit = false. To make Geoserver work with PGCluster, we have to make sure that Geoserver opens all connections with autoCommit = true (probably dangerous, but that seems to be the only work around).
We currently made the following change to accomplish that:
geotools / plugin / postgis
org.geotools.data.postgis.PostgisDataStore
Added a new overridden method
protected boolean requireAutoCommit() {
return true;
}
With this change Geoserver seems to be working OK with Pgcluster.
– Hari Gangadharan
Date: Fri, 28 Dec 2007 19:26:27 +0100
From: aaime@anonymised.com
To: hari_g@anonymised.com
Subject: Re: [Geoserver-users] Geoserver 1.5.3 / PGCluster 1.7 seems to be stuck in deadlocks
Hari G ha scritto:
Andrea -
Thanks for the reply. The issue seems to be with replication side… but
something in Geoserver is making it worse. Another java application we
have is not getting into deadlocks even if we increase the threads. The
PGCluster is replicating the selects from geoserver but is not
replicating the selects from other Java application. Is there any
special way the database queries are done in Geoserver? I would like to
take a look. Can you point me to the classes/modules that construct /
dispatch the queries to database server?
The queries are plain ones, we’re not even using prepared statements.
The difference is that we’re using spatial indexes I guess?
If you raise the logging level you’ll be able to see the queries
going to the db.
As for the code that’s building the query, I would not venture
there… it’s old and crusty, but in the end it just builds
the sql you can see in the logs…
Cheers
Andrea
Put your friends on the big screen with Windows Vista® + Windows Live™. Start now!
We have done more research on this and found that the JDBC connections get stuck if a connection is opened against Pgcluster with autoCommit = false. To make Geoserver work with PGCluster, we have to make sure that Geoserver opens all connections with autoCommit = true (probably dangerous, but that seems to be the only work around).
We currently made the following change to accomplish that:
geotools / plugin / postgis
org.geotools.data.postgis.PostgisDataStore
Added a new overridden method
protected boolean requireAutoCommit() {
return true;
}
With this change Geoserver seems to be working OK with Pgcluster.
Intersting to know. You also told me before that another way to make
GeoServer work with PgCluster is to use prepared statements, right?
The approach you're proposing now could be turned into a datastore
parameter, but I'm hesitant because enabling autocommit would break WFS-T transactions big time.
Prepared statements is a solution we'll eventually have to take
for a number of reasons (security, improved performance if we cache them), thought it requires quite a change in jdbc based datastores so I'm not sure when that will be available in a GeoServer release...
Justin, you're the postgis datastore maintainer. Opinions?
My initial theory of “using prepared selects may solve” was not correct. Making selects prepared would not help. The problem all along was with the autoCommit = false. IMHO, the WFS-T may not be affected since those connections are already opened as autoCommit = true. This again was what I saw when I ran geoserver in Tomcat debug mode. I apologize if I am wrong.
We have done more research on this and found that the JDBC connections
get stuck if a connection is opened against Pgcluster with autoCommit =
false. To make Geoserver work with PGCluster, we have to make sure that
Geoserver opens all connections with autoCommit = true (probably
dangerous, but that seems to be the only work around).
We currently made the following change to accomplish that:
geotools / plugin / postgis
org.geotools.data.postgis.PostgisDataStore
Added a new overridden method
protected boolean requireAutoCommit() {
return true;
}
With this change Geoserver seems to be working OK with Pgcluster.
Intersting to know. You also told me before that another way to make
GeoServer work with PgCluster is to use prepared statements, right?
The approach you’re proposing now could be turned into a datastore
parameter, but I’m hesitant because enabling autocommit would break
WFS-T transactions big time.
Prepared statements is a solution we’ll eventually have to take
for a number of reasons (security, improved performance if we cache
them), thought it requires quite a change in jdbc based datastores so
I’m not sure when that will be available in a GeoServer release…
Justin, you’re the postgis datastore maintainer. Opinions?
Cheers
Andrea
Make distant family not so distant with Windows Vista® + Windows Live™. Start now!
My initial theory of "using prepared selects may solve" was not correct. Making selects prepared would not help. The problem all along was with the autoCommit = false. IMHO, the WFS-T may not be affected since those connections are already opened as autoCommit = true. This again was what I saw when I ran geoserver in Tomcat debug mode. I apologize if I am wrong.
A transaction would not be able to work in autocommit = true mode... if
anything fails we would loose the atomicity of the wfs transaction.
If what you've seen is true, we have a very serious bug around. Can you
provide more information about that?
The approach you're proposing now could be turned into a datastore
parameter, but I'm hesitant because enabling autocommit would break
WFS-T transactions big time.
Prepared statements is a solution we'll eventually have to take
for a number of reasons (security, improved performance if we cache
them), thought it requires quite a change in jdbc based datastores so
I'm not sure when that will be available in a GeoServer release...
Justin, you're the postgis datastore maintainer. Opinions?
Should we not be creating connections with autoCommit = true when the
user specifies Transaction.AUTO_COMMIT? Only set it to false when they
pass in an a real transaction?
The approach you're proposing now could be turned into a datastore
parameter, but I'm hesitant because enabling autocommit would break WFS-T transactions big time.
Prepared statements is a solution we'll eventually have to take
for a number of reasons (security, improved performance if we cache them), thought it requires quite a change in jdbc based datastores so I'm not sure when that will be available in a GeoServer release...
Justin, you're the postgis datastore maintainer. Opinions?
Should we not be creating connections with autoCommit = true when the
user specifies Transaction.AUTO_COMMIT? Only set it to false when they
pass in an a real transaction?
Well, outside of a real transaction it should not make a difference.
There is a catch thought. The nice FeatureCollection wrapping performed
around GeoTools 2.2 something had the side effect that no matter what you do when you grab a feature iterator, a feature writer is created
because you may alter the features during iteration (even if you just
want to read).
I would need to check the code, but it's possible that a real
transaction is created (not sure, I'm just making guesses based on
the fact that our reads are not treated as such by the lower datastore
levels).
My initial theory of "using prepared selects may solve" was not correct. Making selects prepared would not help. The problem all along was with the autoCommit = false. IMHO, the WFS-T may not be affected since those connections are already opened as autoCommit = true. This again was what I saw when I ran geoserver in Tomcat debug mode. I apologize if I am wrong.
A transaction would not be able to work in autocommit = true mode... if
anything fails we would loose the atomicity of the wfs transaction.
If what you've seen is true, we have a very serious bug around. Can you
provide more information about that?
I think (hope?) that he's just saying on their modified version autocommit = true was there and 'working' with WFS-T. I don't think he understood that when we say 'break wfs-t transactions big time' that we didn't mean that it won't work, just that it won't retain atomicity. It should be pointed out that it will work fine in all cases except when someone is doing multiple operations in one transaction and one of them fails.
Is that what you're saying Hari? That you modified the code so autocommit = true and you did wfs-t requests that worked fine?
All we're saying is that we're hesitant to commit that change to trunk, since people won't be aware of the risks. For your code you just need to be sure that it only does one operation per transaction.
Sorry for the late response… I did not notice this message since it went to my junk mail folder.
When I read my my mail again, I could understand the ambiguity. When I wrote the earlier mail, I meant that even if I do not force the autocommit as true, it was already set to true by the existing code when it is a forWrite (WFS-T). But, I apologize, my statement was not correct… When I closely examined the code, it seems like setting the autocommit = true will break the atomicity of the transaction. The rollback is also called only if the (transaction != Transaction.AUTO_COMMIT) - org.geotools.data.jdbc.JDBCUtils.java. Hence with my change the WMS, WFS, WFS-T works but WFS-T multi-feature commit will not maintain the atomicity.
Probably the other work around to have transaction atomicity is to fix JDBCUtils so that the rollback is performed whenever we encounter Exceptions. It may not be a great idea to try these changes if no one else is interested in Geoserver / Pgcluster configuration. I have only two days of experience in Geotools but I will put some time to research further on how we can maintain ACID transactions even using autocommit = true. I have posted this Pgcluster issue in the Pgcluster user group also - let me see if anybody picks up that thread.
Any comments/direction would be greatly appreciated.
My initial theory of “using prepared selects may solve” was not correct.
Making selects prepared would not help. The problem all along was with
the autoCommit = false. IMHO, the WFS-T may not be affected since those
connections are already opened as autoCommit = true. This again was what
I saw when I ran geoserver in Tomcat debug mode. I apologize if I am wrong.
A transaction would not be able to work in autocommit = true mode… if
anything fails we would loose the atomicity of the wfs transaction.
If what you’ve seen is true, we have a very serious bug around. Can you
provide more information about that?
I think (hope?) that he’s just saying on their modified version
autocommit = true was there and ‘working’ with WFS-T. I don’t think he
understood that when we say ‘break wfs-t transactions big time’ that we
didn’t mean that it won’t work, just that it won’t retain atomicity. It
should be pointed out that it will work fine in all cases except when
someone is doing multiple operations in one transaction and one of them
fails.
Is that what you’re saying Hari? That you modified the code so
autocommit = true and you did wfs-t requests that worked fine?
All we’re saying is that we’re hesitant to commit that change to trunk,
since people won’t be aware of the risks. For your code you just need
to be sure that it only does one operation per transaction.
Hari G ha scritto:
...
> But, I apologize, my statement was not
correct... When I closely examined the code, it seems like setting the autocommit = true will break the atomicity of the transaction. The rollback is also called only if the (transaction != Transaction.AUTO_COMMIT) - org.geotools.data.jdbc.JDBCUtils.java. Hence with my change the WMS, WFS, WFS-T works but WFS-T multi-feature commit will *not* maintain the atomicity.
Hum, as far as I know it's just not that. When dealing with jdbc databases we entrust the database for managing the transaction atomicity, since that's one of their primary functions.
Even if you change JDBCUtils, you'll loose atomicity anyways. All update/delete/insert commands going to the database in autocommit
= true mode are executed in their own transaction, calling rollback
after them has no effect, the changes are already persisted and
visible to all other users.
Anyways, your patch is ok for anything that is read only, or that
does not need transaction atomicity on writes.