[Geoserver-devel] Review of catalog dao refactoring patch

Hi Emanuele,

I took an initial look over the catalog refactoring patch and here is what I came up with.

First off great work. I know this is a big effort and I think you for taking the lead on this. It is great to see things moving in the right direction.

* CatalogDAO

The first thing I noticed and something that actually made the patch kind of hard to review was that method names have changed:

   Catalog.add --> CatalogDAO.save
   Catalog.save --> CatalogDAO.update
   Catalog.remove --> CatalogDAO.delete

Why the changes? Any reason not to keep the names the same. Developers who are already used to the existing method naming scheme will probably get confused as I did.

Also, save and update are sort of ambiguous. If we do stick with a new naming scheme I would like to see save be renamed something more unambigious like add or create, something that explicitly indicates that this is a new resource being added.

There are also no javadocs. This is going to be a very central piece of api so I would like to see something. Again if the same naming scheme was kept the javadocs from Catalog should be able to be carried across verbatim.

* InMemoryCatalogDAO

Rename to just MemoryCatalogDAO

I notice that this class also has no notion of proxying that is occurring. This seems odd that this is still the responsibility of CatalogImpl. The whole point of this excercise is to have a single CatalogImpl and multiple daos for different "storage formats".

Since using proxy is an implementation detail of a memory based catalog it should be totally encapsulated by that dao implementation. Ie we will not want to proxy objects coming from the hibernate dao.

* LayerGroupInfoImpl

In getStyles() i see some code for lazily instantiating the styles list. This pattern is not really used throughout the catalog and configuration beans. As an alternative I think i would rather see either:

a) XStreamPersister ensure that the styles list is property initializes
b) implement a readResolve() method that will do the instantiation

* CatalogWriter, LegacyCatalogReader

I notice that the patch modifies the storage format to put a "data" element directly under the "catalog" root element. What is the purpose of this? The whole point of these classes is to maintain backwards compatability with old 1.x style data directories. I am not sure why it would change.

And that is it for now :slight_smile:

-Justin

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Ciao Justin,

I took an initial look over the catalog refactoring patch and here is
what I came up with.

Thanks a lot for the review.

* CatalogDAO

The first thing I noticed and something that actually made the patch
kind of hard to review was that method names have changed:

   Catalog.add --> CatalogDAO.save
   Catalog.save --> CatalogDAO.update
   Catalog.remove --> CatalogDAO.delete

Why the changes?

Oops, my error. I just took as a skeleton the DAO I made for the hibernate
catalog, and the names were more or less following SQL conventions. Names can
easily be reverted into the standard used in the catalog.

* InMemoryCatalogDAO

Rename to just MemoryCatalogDAO

I notice that this class also has no notion of proxying that is
occurring. This seems odd that this is still the responsibility of
CatalogImpl. The whole point of this excercise is to have a single
CatalogImpl and multiple daos for different "storage formats".

Since using proxy is an implementation detail of a memory based catalog
it should be totally encapsulated by that dao implementation. Ie we will
not want to proxy objects coming from the hibernate dao.

This is a good point of discussion.
I kept the ModProxing into the CatalogImpl in order to get easily back the old
values when notifying the updates to listeners.
Another way to solve this issue would be to get the old values from the DAO
(may be an expensive operation), and then making a comparison of the
contents.
Or, yet another option, would be not to notify listeners with new and old
values, and leave to the interested listeners the job of retrieving the old
values from the Catalog (I guess not all the listeners are interested in
them). A caching mechanism could be implemented so that such a loading is
only performed once by the first listener in the chain requesting for it.
Suggestions are welcome :slight_smile:

* LayerGroupInfoImpl

In getStyles() i see some code for lazily instantiating the styles list.
This pattern is not really used throughout the catalog and configuration
beans. As an alternative I think i would rather see either:

a) XStreamPersister ensure that the styles list is property initializes
b) implement a readResolve() method that will do the instantiation

OK, I wrote in the comments (jira + comments in the code) this was a tmp
workaround to be discussed.
All the resolve() stuff has to be refactored a bit. If it only concerns the
XStreamPersister logic, we could move the resolving logic there.

* CatalogWriter, LegacyCatalogReader

I notice that the patch modifies the storage format to put a "data"
element directly under the "catalog" root element. What is the purpose
of this? The whole point of these classes is to maintain backwards
compatability with old 1.x style data directories. I am not sure why it
would change.

Well, as noted in the comment, I had to modify the XStreamPersister to make
the new split work. Now, some tests uses the CatalogWriter to write files
that will be read by the XStreamPersister, so I had to change it as well.
The result is that XStreamPersister and CatalogWriter can write compatible
formats.
In the same way, LegacyCatalogReader reads file created by the them, so I had
to make it a bit lenient and also accept the new format.
Of course these classes will be modified again to make them work with
pluggable DAOs.

I you have time, let's talk a bit about it all on IRC.

   Ciao,
   Emanuele

Ciao Emanuele,

On 2/12/10 10:02 AM, Emanuele Tajariol wrote:

Ciao Justin,

I took an initial look over the catalog refactoring patch and here is
what I came up with.

Thanks a lot for the review.

* CatalogDAO

The first thing I noticed and something that actually made the patch
kind of hard to review was that method names have changed:

    Catalog.add --> CatalogDAO.save
    Catalog.save --> CatalogDAO.update
    Catalog.remove --> CatalogDAO.delete

Why the changes?

Oops, my error. I just took as a skeleton the DAO I made for the hibernate
catalog, and the names were more or less following SQL conventions. Names can
easily be reverted into the standard used in the catalog.

* InMemoryCatalogDAO

Rename to just MemoryCatalogDAO

I notice that this class also has no notion of proxying that is
occurring. This seems odd that this is still the responsibility of
CatalogImpl. The whole point of this excercise is to have a single
CatalogImpl and multiple daos for different "storage formats".

Since using proxy is an implementation detail of a memory based catalog
it should be totally encapsulated by that dao implementation. Ie we will
not want to proxy objects coming from the hibernate dao.

This is a good point of discussion.
I kept the ModProxing into the CatalogImpl in order to get easily back the old
values when notifying the updates to listeners.
Another way to solve this issue would be to get the old values from the DAO
(may be an expensive operation), and then making a comparison of the
contents.
Or, yet another option, would be not to notify listeners with new and old
values, and leave to the interested listeners the job of retrieving the old
values from the Catalog (I guess not all the listeners are interested in
them). A caching mechanism could be implemented so that such a loading is
only performed once by the first listener in the chain requesting for it.
Suggestions are welcome :slight_smile:

Right good point, i never thought about that. I guess unless there is a serious downside to having the catalog (and not the dao) do the proxying we can just leave it. So yeah as long as the proxys don't interfere at all with the hibernate implementation i am fine with leaving them as is.

* LayerGroupInfoImpl

In getStyles() i see some code for lazily instantiating the styles list.
This pattern is not really used throughout the catalog and configuration
beans. As an alternative I think i would rather see either:

a) XStreamPersister ensure that the styles list is property initializes
b) implement a readResolve() method that will do the instantiation

OK, I wrote in the comments (jira + comments in the code) this was a tmp
workaround to be discussed.
All the resolve() stuff has to be refactored a bit. If it only concerns the
XStreamPersister logic, we could move the resolving logic there.

* CatalogWriter, LegacyCatalogReader

I notice that the patch modifies the storage format to put a "data"
element directly under the "catalog" root element. What is the purpose
of this? The whole point of these classes is to maintain backwards
compatability with old 1.x style data directories. I am not sure why it
would change.

Well, as noted in the comment, I had to modify the XStreamPersister to make
the new split work. Now, some tests uses the CatalogWriter to write files
that will be read by the XStreamPersister, so I had to change it as well.
The result is that XStreamPersister and CatalogWriter can write compatible
formats.
In the same way, LegacyCatalogReader reads file created by the them, so I had
to make it a bit lenient and also accept the new format.
Of course these classes will be modified again to make them work with
pluggable DAOs.

Ok, but as I said I don't think we have the freedom to change the underlying format, that format is still used a great deal today and we have to maintain it to be backward compatible.

As for the test cases that use CatalogWriter to write and XStreamPErsister to read I actually think they are pointless since that never occurs any more, ie catalog.xml is never read by xstream persister.

If we simply remove that test case I think we can allow the format to remain unchanged?

I you have time, let's talk a bit about it all on IRC.

    Ciao,
    Emanuele

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.