[Geoserver-devel] hibernate configuration module status

Hi all,

Over the past week I have been working on the hibernate configuration community module. I wanted to send a quick status update to let people know where things are at.

So first off, my goal here is to get the db-backed configuration working with GeoServer as seamlessly as possible. If there is anything I learned from the configuration changeover that happened in 2.0, it is that backwards compatibility needs to be paramount. So I set out with the goal that no test cases and no client code should have to change in order to work with the new catalog and config. Nice in theory :) In practice some code does have to change, but only in places where tests or client code make bad assumptions, and all in all these cases are very few.

In order to verify this I hacked up a local testing configuration that allowed me to run all the GeoServer integration tests (tests that extend from GeoServerTestSupport) against the hibernate-backed config rather than the classic in-memory one. This was a lot of work to set up, but it really paid off since, as you can imagine, it flushed out a lot of bugs with the db config. And I am happy to report that all the test cases pass!! I also ran all the CITE tests successfully.

To get to this point I picked up the hibernate module where Emanuele left off and completed the refactor we discussed. The refactor in question was to come up with a dao interface for the catalog and config and have a single catalog/config implementation. I took Emanuele’s initial work and moved as much logic out of the daos and into the catalog/config as I could. This was done for (a) backwards compatibility reasons, to keep the same logic as before regardless of the dao, and (b) maintainability reasons, to keep that logic in a single central place.
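
To make the split concrete, here is a minimal sketch of the idea, not the code in the branch. All names (LayerRecord, LayerStore, MemoryLayerStore, SingleCatalog) are hypothetical and chosen only for illustration. The store does nothing but persistence, while validation lives once in the catalog, so it behaves the same whichever store is plugged in.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; these are not the actual GeoServer types.
final class LayerRecord {
    final String name;
    LayerRecord(String name) { this.name = name; }
}

/** Storage-only contract: no validation or business rules live here. */
interface LayerStore {
    void save(LayerRecord layer);
    LayerRecord findByName(String name);
    List<LayerRecord> findAll();
}

/** In-memory backend; a db-backed store would implement the same interface. */
final class MemoryLayerStore implements LayerStore {
    private final List<LayerRecord> layers = new ArrayList<>();

    public void save(LayerRecord layer) { layers.add(layer); }

    public LayerRecord findByName(String name) {
        for (LayerRecord l : layers) {
            if (l.name.equals(name)) return l;
        }
        return null;
    }

    public List<LayerRecord> findAll() { return new ArrayList<>(layers); }
}

/** The single catalog implementation: shared logic sits here, above any store. */
final class SingleCatalog {
    private final LayerStore store;

    SingleCatalog(LayerStore store) { this.store = store; }

    void add(LayerRecord layer) {
        // These checks run identically for every backend.
        if (layer.name == null || layer.name.isEmpty()) {
            throw new IllegalArgumentException("layer name is required");
        }
        if (store.findByName(layer.name) != null) {
            throw new IllegalArgumentException("duplicate layer: " + layer.name);
        }
        store.save(layer);
    }

    List<LayerRecord> getLayers() { return store.findAll(); }
}
```

Swapping the backend then means passing a different LayerStore to the SingleCatalog constructor; callers of the catalog do not change.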

I also removed any customized beans for hibernate and got around those issues in other ways. Having custom beans for hibernate was problematic in that it leaked bean implementation details and creation logic all over the codebase, the worst cases being the xstream persistence and restconfig. So I opted to make changes to the core beans themselves when needed and to work around the issues with other methods. Now I am not saying that at some point we won’t need custom hibernate beans, but I think it should be a last-ditch effort.

Finally the changes are available in my geoserver github repo:

http://github.com/jdeolive/geoserver/tree/catalog_dao

Now all the above is great but it is only level 0. I am taking a “first make it work and then make it fast and scalable” approach. With a working implementation the plan is now to stress test it since the whole point of using a db backend is to be able to scale up to millions of layers. This part however will also involve changes to client code since there are many places in the code that assume the entire catalog lives in memory.

That said I would like to get the current changes committed and possibly start getting some testing from other devs and those of our users who are eager and brave.

Thoughts? Comments? Objections?

-Justin


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Ciao Justin,
I have had an email in draft for a week now about the Hib topic, so I
had better start by answering this one.

A few notes:

-1- emanuele is not going to be able to spend time on this very soon,
but we were planning to get back to this topic towards the end of the year
-2- I am not sure I will have enough time to check the code myself and
anyway I am not sure I can be of any help anymore :).

However, you need to be aware that the real problem with having a
db-based config for GeoServer
is the assumption, made all over the codebase, that things sit
in memory. I am sure we can find a
workaround that would not require changing all the code that interacts
with the catalog, but that might sound like a hack.
In our view the catalog implementation should be more pluggable and
should address searching and paging right at the core.

As I said, I have not looked at the code you have put together, but
what I wanted to ask is as follows:

- Can we see some design/idea/anything about what you want to do? We
have spent quite some time/money on the Hib prototype and I would like to make
sure it is being evolved in the right direction before we pick it up
again for our objectives
- What effort does your mandate allow you to spend on this Hib
thing? I doubt we can solve this problem once and for all in a couple of
weeks, given also that emanuele's feedback will be almost absent.
For instance, we had to almost avoid lazy loading with the Hib
catalog in order to make it usable with the UI without changing the
UI itself, therefore the model and config might need to be tweaked a
bit, etc.

Ciao,
Simone.
-------------------------------------------------------

Notice that our office phone number has recently changed!
Please, update your records!

Ing. Simone Giannecchini
GeoSolutions S.A.S.
Founder
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy

phone: +39 0584962313
fax: +39 0584962313
mob: +39 333 8128928

http://www.geo-solutions.it
http://geo-solutions.blogspot.com/
http://www.linkedin.com/in/simonegiannecchini
http://twitter.com/simogeo

-------------------------------------------------------

_______________________________________________
Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel

Are there instructions anywhere for running against the Hibernate backend?

-d


Ciao Simone, some answers/comments inline.

On Thu, Sep 30, 2010 at 4:35 AM, Simone Giannecchini <simone.giannecchini@anonymised.com> wrote:

Ciao Justin,
I have had an email in draft for a week now about the Hib topic, so I
had better start by answering this one.

A few notes:

-1- emanuele is not going to be able to spend time on this very soon,
but we were planning to get back to this topic towards the end of the year
-2- I am not sure I will have enough time to check the code myself and
anyway I am not sure I can be of any help anymore :).

Well, Emanuele did great work with what was there. I have really just been tweaking things here and there. I thank him for the hard work and effort :)

However, you need to be aware that the real problem with having a
db-based config for GeoServer
is the assumption, made all over the codebase, that things sit
in memory. I am sure we can find a
workaround that would not require changing all the code that interacts
with the catalog, but that might sound like a hack.
In our view the catalog implementation should be more pluggable and
should address searching and paging right at the core.

I could not agree more. The current catalog api was not designed with this in mind. So in my mind the next step (now that we have something working) is to evolve the catalog api in a direction that accommodates these things: doing a standard deprecation of methods that assume a memory-based catalog and replacing them with methods that support paging, filtering, etc.

As I said, I have not looked at the code you have put together, but
what I wanted to ask is as follows:

  • Can we see some design/idea/anything about what you want to do? We
    have spent quite some time/money on the Hib prototype and I would like to make
    sure it is being evolved in the right direction before we pick it up
    again for our objectives

Well the basic idea would be to come up with some Query object that supported everything we want. Paging, filtering, etc… And then change methods that look like this:

List getLayers()
List getLayersByXYZ(…)

With methods like:

List getLayers(Query)

Or perhaps

void visitLayers(Query, CatalogVisitor)

Essentially the same pattern as geotools datastores. The way this would go in my mind is:

  1. update dao api to support the query paradigm
  2. update catalog implementation to use the new methods
  3. update the catalog api to expose the query methods
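
A rough sketch of what such a Query object and the two method shapes above could look like. Everything here is an assumption for illustration: LayerQuery, LayerVisitor and QueryableLayers are made-up names, layers are reduced to plain strings, and the filter is a Java predicate evaluated in memory, whereas a db-backed dao would translate the query into SQL instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: none of these names exist in GeoServer.
final class LayerQuery {
    final Predicate<String> filter; // which layer names to keep
    final int offset;               // number of matches to skip
    final int limit;                // maximum number of matches to return

    LayerQuery(Predicate<String> filter, int offset, int limit) {
        this.filter = filter;
        this.offset = offset;
        this.limit = limit;
    }
}

interface LayerVisitor {
    void visit(String layerName);
}

final class QueryableLayers {
    private final List<String> names;

    QueryableLayers(List<String> names) { this.names = new ArrayList<>(names); }

    /** Streams matches to the visitor, so no full result list has to be built. */
    void visitLayers(LayerQuery q, LayerVisitor visitor) {
        int matched = 0;
        int emitted = 0;
        for (String name : names) {
            if (!q.filter.test(name)) continue;  // filtering
            if (matched++ < q.offset) continue;  // paging: skip the first matches
            if (emitted++ >= q.limit) break;     // paging: stop after one page
            visitor.visit(name);
        }
    }

    /** List-returning variant, built on top of the visitor variant. */
    List<String> getLayers(LayerQuery q) {
        List<String> page = new ArrayList<>();
        visitLayers(q, page::add);
        return page;
    }
}
```

The visitor form is the one that scales, since the caller never holds more than one layer at a time; the list form is a convenience for callers that only want a single page.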

  • What effort does your mandate allow you to spend on this Hib
    thing? I doubt we can solve this problem once and for all in a couple of
    weeks, given also that emanuele’s feedback will be almost absent.
    For instance, we had to almost avoid lazy loading with the Hib
    catalog in order to make it usable with the UI without changing the
    UI itself, therefore the model and config might need to be tweaked a
    bit, etc.

While I can’t comment for sure we do have a client interested in using the db catalog to set up a geoserver with thousands of layers. And I think the contract allocates a lot of time to this. But that said I agree with you that this won’t be done in one shot. It will have to be iterative. Unfortunately what has been done to date is only level 0. Still a number of levels to go. A natural path in my mind looks like:

  1. Get db backend with current catalog api
  2. Update the DAO interfaces to support querying
  3. Reimplement current catalog methods in terms of new dao query api
  4. Expose new catalog api based on Query and deprecate old methods
  5. Update client code (services, ui, restconfig, etc…) to use the new api

(4) will require an amount of effort comparable to what it took to move to the new catalog and config apis for 2.0. Something for a code sprint :)
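
To illustrate steps 3 and 4, a minimal sketch of how the old methods could stay in place, marked deprecated, while being reimplemented on top of a single query method. The names (PagedLayerDao, ListLayerDao, TransitionalCatalog) are hypothetical and layers are reduced to plain strings; this is only meant to show the shape of the transition, not the real api.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of steps 3 and 4; none of these names come from GeoServer.
interface PagedLayerDao {
    /** The only read method a backend has to provide. */
    List<String> query(Predicate<String> filter, int offset, int limit);
}

/** Stand-in backend over a plain list, used here only to make the sketch runnable. */
final class ListLayerDao implements PagedLayerDao {
    private final List<String> names;

    ListLayerDao(List<String> names) { this.names = names; }

    public List<String> query(Predicate<String> filter, int offset, int limit) {
        List<String> out = new ArrayList<>();
        int skipped = 0;
        for (String n : names) {
            if (!filter.test(n)) continue;
            if (skipped < offset) { skipped++; continue; }
            if (out.size() >= limit) break;
            out.add(n);
        }
        return out;
    }
}

final class TransitionalCatalog {
    private final PagedLayerDao dao;

    TransitionalCatalog(PagedLayerDao dao) { this.dao = dao; }

    /** New style: callers state the filter and the page they need. */
    List<String> getLayers(Predicate<String> filter, int offset, int limit) {
        return dao.query(filter, offset, limit);
    }

    /** Old style, kept for compatibility; it still loads every layer. */
    @Deprecated
    List<String> getLayers() {
        return dao.query(n -> true, 0, Integer.MAX_VALUE);
    }

    /** Old style finder, expressed as a one-row query. */
    @Deprecated
    String getLayerByName(String name) {
        List<String> hit = dao.query(name::equals, 0, 1);
        return hit.isEmpty() ? null : hit.get(0);
    }
}
```

Existing client code keeps compiling against the deprecated methods, and each caller can be moved to the query form one at a time, which is what step 5 amounts to.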

About the lazy loading issue in particular: yes, that is an issue. In the current configuration nothing is loaded lazily. This is the next thing I plan to look into. I only looked briefly and found some configuration settings in the entity manager to control this, but did not get very far.

-Justin


Well, the current hibernate module can be engaged via a profile, -P hibernate. However, the stuff I have been doing is not in the svn repo yet. The same applies there, though: it is engaged via a profile. In the version in my git repo I renamed the module to “dbconfig”, as “hibernate” seemed a bit too generic. Also, I think we will soon need a generic hibernate module, since dbconfig and monitoring both use hibernate and it would be nice to factor out the common code.


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.