[Geoserver-devel] review of csw pull request

Hi all,

I spent a few hours today reviewing the pending pull request from Niels that adds support for the ISO application profile, and a catalog store based directly on the GeoServer catalog, to the existing CSW community module.

https://github.com/geoserver/geoserver/pull/84/files

All in all it looks great. Big thanks to Niels and Andrea for all this work. I did however run into a few hiccups when trying to build the modules with tests.

  1. UTF-8 vs ISO-8859-1 encoding issues

A few tests fail with error messages complaining about invalid UTF-8 byte sequences. Some of the test files contain extended Latin characters. For instance:

https://github.com/geoserver/geoserver/blob/master/src/community/csw/csw-core/src/test/resources/org/geoserver/csw/records/Record_e9330592-0932-474b-be34-c3a3bb67c7db.xml

The error occurs when one of the tests tries to parse a GetRecords response that contains the record. The resulting doc reports a UTF-8 encoding and the getAsDOM method fails to parse it. I'm not sure of the best way to solve this one, but I was able to fix the test with either of two approaches.

a. Change the record transformer to report a document encoding of ISO-8859-1
b. Manually specify the ISO-8859-1 encoding when parsing the response, essentially overriding the UTF-8 encoding reported in the doc

This is an example of one of the tests that fail.

https://github.com/geoserver/geoserver/blob/master/src/community/csw/csw-core/src/test/java/org/geoserver/csw/GetRecordsTest.java#L243
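For what it's worth, option (b) can be sketched roughly as below. This is just an illustration of forcing a charset when parsing, not the actual getAsDOM code; the class and helper names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ForcedEncodingParse {

    // Hypothetical helper: parse raw response bytes with an explicit
    // charset. Feeding the parser a character stream means the encoding
    // declared in the XML prolog is ignored entirely.
    static Document parse(byte[] body, String charset) throws Exception {
        InputSource source = new InputSource(
                new InputStreamReader(new ByteArrayInputStream(body), charset));
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(source);
    }

    public static void main(String[] args) throws Exception {
        // A document that declares UTF-8 but is actually ISO-8859-1:
        // 0xE9 ('é' in ISO-8859-1) is an invalid lone byte in UTF-8.
        byte[] body = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><r>café</r>"
                .getBytes(StandardCharsets.ISO_8859_1);
        Document doc = parse(body, "ISO-8859-1");
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Option (a) would instead change the record transformer so the declared encoding matches the actual bytes, which avoids the need for any override on the client side.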

  2. Location of schema files

The pull request moves the csw schema files from the classpath root to a package prefix of net/opengis so that AppSchemaResolver can find them. But this causes the classpath publisher to fail to find them, so the schemas are no longer web accessible, which causes a test to fail.

An obvious solution would be to just copy the files back to the root, keeping them in both places. But I wonder if AppSchemaResolver can be configured to load them without the package prefix. Or we could potentially modify ClasspathPublisher and add the ability to add a package prefix.

  3. org.geoserver.csw.records.iso.GetRecordsTest failures

Some of the ISO GetRecords tests fail because they make assertions assuming that the “cite:Forests” record will be in the first page of results. The easy fix seemed to be to increase the page size via maxRecords to ensure all results are returned.

And that is it. Aside from those issues the modules build just fine. I have pushed some changes up to my GitHub that contain fixes for these issues:

https://github.com/jdeolive/geoserver/tree/csw

But I am not confident in the fixes for issues 1 and 2 above and would like to hear from Andrea and Niels.

I also took the liberty of fixing some minor issues as well, including:

  • cleaned up formatting issues with changes to main module catalog classes (tabs vs. spaces)
  • removed the csw- prefix from the csw modules, as per convention with other multi-module roots like web, etc. The directory tree now looks like:

csw/
    api/
    core/
    simple-store/

The artifact ids retain the csw prefix though.

-Justin


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

On Thu, Jan 17, 2013 at 2:10 AM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

Hi all,

I spent a few hours today reviewing the pending pull request from Niels that adds support for iso application profile and a catalog store based directly from the geoserver catalog to the existing csw community module.

https://github.com/geoserver/geoserver/pull/84/files

Hi,
just having a look at it right now.

All in all it looks great. Big thanks to Niels and Andrea for all this work. I did however run into a few hiccups when trying to build the modules with tests.

  1. UTF-8 vs ISO-8859-1 encoding issues

A few tests fail with error message complaining about invalid UTF-8 byte sequences. Some of the test files contain some extended latin characters. For instance:

https://github.com/geoserver/geoserver/blob/master/src/community/csw/csw-core/src/test/resources/org/geoserver/csw/records/Record_e9330592-0932-474b-be34-c3a3bb67c7db.xml

The error occurs when one of the tests tries to parse a GetRecord response that contains the record. The resulting doc reports a UTF-8 encoding and the getAsDOM method fails to parse it. Not sure the best way to solve this one but i was able to fix the test with two approaches.

a. Change the record transformer to report a document encoding of ISO-8859-1
b. Specify manually the ISO-8859-1 encoding when parsing the response, essentially ovrerriding the UTF-8 encoding reported on the doc

This is an example of one of the tests that fail.

https://github.com/geoserver/geoserver/blob/master/src/community/csw/csw-core/src/test/java/org/geoserver/csw/GetRecordsTest.java#L243

Hmm… that file comes straight from the CITE test data for CSW 2.0.2, did you try running
the CITE tests after the change?
I have no failures due to encoding, but I’m seeing those:

testAllRecords(org.geoserver.csw.store.internal.GetRecordsTest): expected:<[abstract about Forests]> but was:<>
testAllRecords(org.geoserver.csw.records.iso.GetRecordsTest): expected:<[abstract about Forests]> but was:<>
testAllRecordsBrief(org.geoserver.csw.records.iso.GetRecordsTest): expected:<[urn:x-ogc:def:crs:EPSG:6.11:4326]> but was:<>

The first two you talk about later (simple fix), but the last one seems new. Did you experience it as well?

  2. Location of schema files

The pull request moves the csw schema files from the classpath root to a package prefix of net/opengis so that AppSchemaResolver can find them. But this causes the classpath publisher to fail to find them, so the schemas are no longer web accessible, which causes a test to fail.

An obvious solution would be to just copy the files back to the root, keeping them in both places. But I wonder if AppSchemaResolver can be configured to load them without the package prefix. Or we could potentially modify ClasspathPublisher and add the ability to add a package prefix.

Imho it would be best to fix AppSchemaResolver, since every other schema in GeoServer is already
published without issues by the ClasspathPublisher.

  3. org.geoserver.csw.records.iso.GetRecordsTest failures

Some of the ISO GetRecords tests fail as they make assertions assuming that the “cite:Forests” record will be in the first page of results. Easy fix seemed to just increase the page size by using maxRecords to ensure all results returned.

And that is it. Aside from those issues the modules build just fine. I have pushed some changes up to my github that contain fixes for these issues:

https://github.com/jdeolive/geoserver/tree/csw

But i am not confident on the fixes for issues 1 and 2 above and would like to hear from Andrea and Niels.

I also took the liberty of fixing some minor issues as well including:

  • cleaned up formatting issues with changes to main module catalog classes (tabs vs spaces)
  • removed csw- prefix from csw modules, as per convention with other multi module roots like web, etc… the directory tree now looks like:

csw/
    api/
    core/
    simple-store/

The artifact ids retain the csw prefix though.

This works for me.

Here is some more I’ve found during the review.

There’s a change in DefaultCatalogFacade.java that I’m not sure about:

if (null != sortByList) {
    for (int i = sortByList.length - 1; i >= 0; i--) {
        SortBy sortBy = sortByList[i];
        Ordering<Object> ordering = Ordering.from(comparator(sortBy));
        if (SortOrder.DESCENDING.equals(sortBy.getSortOrder())) {
            ordering = ordering.reverse();
        }
        all = ordering.sortedCopy(all);
    }
}

What guarantees are there that by sorting the same list multiple times we'll get
a result that is ordered on sortBy1, then sortBy2, and so on?
What I'm trying to say is, the second time we sort, the algorithm might well
ruin the partial ordering established by the first sort… the Java 6 algorithm might
work, but Java 7 has a different sorting algorithm (TimSort), and I'm not sure
the multiple sorts are going to be preserved there.

Anyways, no big deal, I guess it can be handled as a bug later, if it turns
out to misbehave.
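For reference, an alternative that sidesteps the stability question entirely is to compose the keys into a single comparator and sort once. A minimal sketch with plain JDK comparators (the Rec type and its fields are made up for the example; Guava's Ordering.compound achieves the same when sticking with Ordering):

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class CompoundSortSketch {

    // Made-up record type standing in for catalog objects.
    record Rec(String title, int year) {}

    // One compound comparator, most-significant key first: a single sort
    // yields the sortBy1-then-sortBy2 order regardless of whether the
    // underlying sort algorithm is stable.
    static List<Rec> sort(List<Rec> recs) {
        Comparator<Rec> order = Comparator.comparing(Rec::title)
                .thenComparing(Comparator.comparingInt(Rec::year).reversed());
        return recs.stream().sorted(order).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Titles ascending, then years descending within equal titles.
        List<Rec> out = sort(List.of(
                new Rec("Forests", 2001),
                new Rec("Lakes", 1999),
                new Rec("Forests", 2005)));
        System.out.println(out);
    }
}
```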

What’s a bit more troublesome is changes in the CatalogStore interface.
Mind, I’m not against them per se, but in order to change an interface that was
discussed openly on the ml I would have expected the same openness, or at
least some explanation of why they happened; instead I’m looking at the
diff trying to understand them.
The changes propagate in other portions as well, it seems bits of the CSW
service code have been refactored, again, without providing any clue as to why.

I’m not saying the changes are right or wrong (on the contrary, I’m confident
they have been done for good reasons), but they should have been
discussed: since the patch is out already, can we have some rationale about them?

It would make reviewing the patch a lot easier.

Cheers
Andrea

==
Our support, Your Success! Visit http://opensdi.geo-solutions.it for more information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


On Sat, Jan 19, 2013 at 10:54 AM, Andrea Aime
<andrea.aime@anonymised.com>wrote:

On Thu, Jan 17, 2013 at 2:10 AM, Justin Deoliveira <jdeolive@anonymised.com>wrote:

Hi all,

I spent a few hours today reviewing the pending pull request from Niels
that adds support for iso application profile and a catalog store based
directly from the geoserver catalog to the existing csw community module.

  https://github.com/geoserver/geoserver/pull/84/files

Hi,
just having a look at it right now.

All in all it looks great. Big thanks to Niels and Andrea for all this
work. I did however run into a few hiccups when trying to build the modules
with tests.

1. UTF-8 vs ISO-8859-1 encoding issues

A few tests fail with error message complaining about invalid UTF-8 byte
sequences. Some of the test files contain some extended latin characters.
For instance:

https://github.com/geoserver/geoserver/blob/master/src/community/csw/csw-core/src/test/resources/org/geoserver/csw/records/Record_e9330592-0932-474b-be34-c3a3bb67c7db.xml

The error occurs when one of the tests tries to parse a
GetRecord response that contains the record. The resulting doc reports a
UTF-8 encoding and the getAsDOM method fails to parse it. Not sure the best
way to solve this one but i was able to fix the test with two approaches.

  a. Change the record transformer to report a document encoding
of ISO-8859-1
  b. Specify manually the ISO-8859-1 encoding when parsing the response,
essentially ovrerriding the UTF-8 encoding reported on the doc

This is an example of one of the tests that fail.

https://github.com/geoserver/geoserver/blob/master/src/community/csw/csw-core/src/test/java/org/geoserver/csw/GetRecordsTest.java#L243

Hmm... that file comes straigth from CITE tests data for CSW 2.0.2, did
you try running
the CITE tests after the change?
I have no failures due to encoding, but I'm seeing those:

testAllRecords(org.geoserver.csw.store.internal.GetRecordsTest):
expected:<[abstract about Forests]> but was:<>
  testAllRecords(org.geoserver.csw.records.iso.GetRecordsTest):
expected:<[abstract about Forests]> but was:<>
  testAllRecordsBrief(org.geoserver.csw.records.iso.GetRecordsTest):
expected:<[urn:x-ogc:def:crs:EPSG:6.11:4326]> but was:<>

Yeah, I saw these, I think all three are the paging/ordering issue I
described.

I didn't actually run the cite tests. Will do so though.

The first two you talk about later, simple fix, the last one seems new
though, did you experience it as well?

2. Location of schema files

The pull request moves the csw schema files from the classpath root to a
package prefix of net/opengis so that AppSchemaResolver can find them. But
this then causes the classpath publisher to fail to find the and makes it
so that the schemas are no longer web accessible which causes a test to
fail.

An obvious solution would be to just copy the files back to the root,
keeping them in both places. But I wonder if AppSchemaResolver can be
configured to load them without the package prefix. Or we could potentially
modify ClasspathPublisher and add the ability to add a package prefix.

Imho it would be best to fix the AppSchemaResolver, since every other
schema in GeoServer is already
published without issues with the ClasspathPublisher

Agreed. Will need feedback from Ben or Niels on this one though. A simple
hack would be to try and fall back on new package prefix "net.opengis" or
to add a flag to forgo the use of the package prefix.

3. org.geoserver.csw.records.iso.GetRecordsTest failures

Some of the ISO GetRecords tests fail as they make
assertions assuming that the "cite:Forests" record will be in the first
page of results. Easy fix seemed to just increase the page size by using
maxRecords to ensure all results returned.

And that is it. Aside from those issues the modules build just fine. I
have pushed some changes up to my github that contain fixes for these
issues:

  https://github.com/jdeolive/geoserver/tree/csw

But i am not confident on the fixes for issues 1 and 2 above and would
like to hear from Andrea and Niels.

I also took the liberty of fixing some minor issues as well including:

* cleaned up formatting issues with changes to main module catalog
classes (tabs vs spaces)

* removed csw- prefix from csw modules, as per convention with other multi

module roots like web, etc... the directory tree now looks like:

        csw/
            api/
            core/
            simple-store/

The artifact ids retain the csw prefix though.

This works for me.

Here is some more I've found during the review.

There's a change in DefaultCatalogFacade.java that I'm not sure about:

if (null != sortByList) {
            for (int i = sortByList.length - 1; i >=0 ; i--) {
             SortBy sortBy = sortByList[i];
             Ordering<Object> ordering =
Ordering.from(comparator(sortBy));
            if (SortOrder.DESCENDING.equals(sortBy.getSortOrder())) {
                 ordering = ordering.reverse();
            }
            all = ordering.sortedCopy(all);
            }
        }

What guarantees are there that sorting the same list multiple times we'll
get
to a result that is ordered on sortBy1, then sortBy2, and so on?
What I'm trying to say is, the second time we sort the algorithm might as
well
ruin the partial ordering established by the first sort... the Java 6
algorithm might
work, but Java 7 has a different sorting algorithm (TimSort), not sure if
the
multiple sorts are going to be preserved there.

Anyways, no big deal, I guess it can be handled as a bug later, if it turns
out to misbehave.

Right, never noticed that. Agreed, it is a pretty isolated bit of code and
can tweak as required

What's a bit more troublesome is changes in the CatalogStore interface.
Mind, I'm not against them per se, but in order to change a interface that
was
discussed openly on the ml I would have expected the same opennes, or at
least some explanation on why they happened, instead I'm looking at the
diff trying to understand them.
The changes propagate in other portions as well, it seems bits of the CSW
service code have been refactored, again, without providing any clue as to
why.

I'm not saying the changes are right or wrong (on the contraty, I'm
confident
they have been done for good reasons), but they should have been
discussed: since the patch is out already, can we have some rationale
about them?

It would make reviewing the patch a lot easier.

Can't comment much on these, will have to delegate to niels here.

Cheers
Andrea


--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

I see your point. I remember I needed to include support for sorting with more than one sortby, which wasn't yet supported previously. But perhaps we have to revise that algorithm and see if there is a more appropriate solution.

First, my apologies for not explaining the changes in advance. I usually try to make sure I am certain that particular changes are necessary or useful before I propose them; perhaps that is why I didn't do it earlier.

Then I also have to apologise about something else. I did my best cleaning up all the code and filling in the javadoc before I did my pull request, but I seem to have missed this file. The javadoc for translateProperty wasn't filled in, and on top of that one of the changes had to be removed before committing. The extra parameter elementSet was supposed to be removed again; this was a change I made at some point that became unnecessary later. So please remove this change altogether; this can be done in the interface and all its implementations without a problem. Again, sorry about that, that was not my intention.

The other two changes, however, were made for good reasons, and both have to do with making CSW more generic. A lot of the generic stuff relies on using the RecordDescriptor interface. At points where the old code simply calls CSWRecordDescriptor.SOME_CONSTANT, the new code often uses an instance of the RecordDescriptor interface and a method to get this information, for obvious reasons. Another thing is that, as a consequence of parsing .xsd files, the name of the record type is no longer in the FeatureType, but in an AttributeDescriptor. For example, the name of the CSW DC type is not "Record" but "RecordType"; there is a descriptor with the name "Record" that points to "RecordType". This is good because it corresponds with how this is coded in the .xsd file. This is just one of the reasons why I don't pass on FeatureType objects any more, but rather RecordDescriptor objects. From these, all sorts of information can be retrieved, like a namespace set, which is also frequently used.

As for the translateProperty method: it converts a Name to a PropertyName, and is used for property selection. This happens differently with the two catalog stores. The AbstractCatalogStore selects properties from a finished feature, so the PropertyName applies to a finished feature. The InternalCatalogStore selects properties earlier in the process, in the mapping, so it needs property names that the mapping understands (so, for example, with '/value' included).

I hope this helps; if there are any more questions, please ask.

Kind Regards
Niels Charlier

···

Hello Andrea,

Sorry for my late responses.

On 01/19/2013 06:54 PM, Andrea Aime wrote:

Here is some more I’ve found during the review.

There’s a change in DefaultCatalogFacade.java that I’m not sure about:

if (null != sortByList) {
    for (int i = sortByList.length - 1; i >= 0; i--) {
        SortBy sortBy = sortByList[i];
        Ordering<Object> ordering = Ordering.from(comparator(sortBy));
        if (SortOrder.DESCENDING.equals(sortBy.getSortOrder())) {
            ordering = ordering.reverse();
        }
        all = ordering.sortedCopy(all);
    }
}

What guarantees are there that sorting the same list multiple times we’ll get
to a result that is ordered on sortBy1, then sortBy2, and so on?
What I’m trying to say is, the second time we sort the algorithm might as well
ruin the partial ordering established by the first sort… the Java 6 algorithm might
work, but Java 7 has a different sorting algorithm (TimSort), not sure if the
multiple sorts are going to be preserved there.

Anyways, no big deal, I guess it can be handled as a bug later, if it turns
out to misbehave.

What’s a bit more troublesome is changes in the CatalogStore interface.
Mind, I’m not against them per se, but in order to change a interface that was
discussed openly on the ml I would have expected the same opennes, or at
least some explanation on why they happened, instead I’m looking at the
diff trying to understand them.
The changes propagate in other portions as well, it seems bits of the CSW
service code have been refactored, again, without providing any clue as to why.

I’m not saying the changes are right or wrong (on the contraty, I’m confident
they have been done for good reasons), but they should have been
discussed: since the patch is out already, can we have some rationale about them?

It would make reviewing the patch a lot easier.

On Wed, Jan 23, 2013 at 11:10 AM, Niels Charlier <niels@anonymised.com> wrote:

Hello Andrea,

Sorry for my late reponses.

On 01/19/2013 06:54 PM, Andrea Aime wrote:

Here is some more I've found during the review.

There's a change in DefaultCatalogFacade.java that I'm not sure about:

  if (null != sortByList) {
            for (int i = sortByList.length - 1; i >=0 ; i--) {
             SortBy sortBy = sortByList[i];
             Ordering<Object> ordering =
Ordering.from(comparator(sortBy));
             if (SortOrder.DESCENDING.equals(sortBy.getSortOrder())) {
                 ordering = ordering.reverse();
             }
             all = ordering.sortedCopy(all);
            }
        }

What guarantees are there that sorting the same list multiple times
we'll get
to a result that is ordered on sortBy1, then sortBy2, and so on?
What I'm trying to say is, the second time we sort the algorithm might as
well
ruin the partial ordering established by the first sort... the Java 6
algorithm might
work, but Java 7 has a different sorting algorithm (TimSort), not sure if
the
multiple sorts are going to be preserved there.

I see your point. I remember I needed to include support for sorting with
more than one sortby, which wasn't yet supported previously.
But perhaps we have to revise that algorithm and see if there is a more
appropriate solution.

Anyways, no big deal, I guess it can be handled as a bug later, if it
turns
out to misbehave.

What's a bit more troublesome is changes in the CatalogStore interface.
Mind, I'm not against them per se, but in order to change a interface that
was
discussed openly on the ml I would have expected the same opennes, or at
least some explanation on why they happened, instead I'm looking at the
diff trying to understand them.
The changes propagate in other portions as well, it seems bits of the CSW
service code have been refactored, again, without providing any clue as to
why.

I'm not saying the changes are right or wrong (on the contraty, I'm
confident
they have been done for good reasons), but they should have been
discussed: since the patch is out already, can we have some rationale
about them?

It would make reviewing the patch a lot easier.

First my apologies for not explaining the changes in advance. I usually
try to make sure I am certain that particular changes are necessary or
useful before I propose them, perhaps that is why I didn't do it earlier.

Then I also have to apologise about something else, I did my best cleaning
up all the code and filling in the javadoc before I did my pull request but
I seemed to have missed this file. The javadoc for translateProperty wasn't
filled in, and on top of that one of the changes had to be removed before
committing. The extra parameter elementSet was supposed to be removed
again, this was a change I made at some point but became unnecessary later.
So please remove this change all together, this can be done in the
interface and all its implementations without a problem. Again, sorry about
that, that was not my intention.

So if I understand this correctly this means two additional changes.

1. Remove the ElementSet argument from CatalogStore.getRecords() method.
2. Remove CatalogStore.translateProperty()

(1) is trivial, more or less all calls to it passed in null. Made that
change and tests are happy.

(2) is not so clear to me. At the moment there is code calling this method,
and multiple implementations of it, one from AbstractCatalogStore and
one from InternalCatalogStore, which appear to be doing different things. So
it's not quite obvious to me what should be done here.

The other two changes however were for a good reasons and have both to do
with making CSW more generic.

A lot of the generic stuff relies on using the RecordDescriptor interface.
At points where the old code simply calls CSWRecordDescriptor.SOME_CONSTANT
the new code often uses an instance of the RecordDescriptor interface and a
method to get this information, for obvious reasons. Another thing is that
as a consequence of parsing .xsd files the name of the record type is not
any more in the FeatureType, but in an AttributeDescriptor. For example,
the name of the CSW DC type is not "Record" but "RecordType", there is a
descriptor with the name "Record" that points to "RecordType". This is good
because it corresponds with how this is coded in the .xsd file. This is
just one of the reasons why I don't pass on FeatureType objects any more,
but rather RecordDescriptor objects. From here all sorts of information can
be retrieved, like a namespace set which is also frequently used.

Considering the translateProperty method, this method converts a Name to
PropertyName. This is used for property selection. This happens differently
with the two Catalog Stores. The Abstract Catalog Store selects properties
from a finished feature, so the PropertyName applies to a finished feature.
The Internal Catalog Store selects properties earlier in the process, in
the mapping, so it needs property names that the mapping understands (so
for example, with '/value' included).

I hope this helps, if there are any more questions, please ask.

Kind Regards
Niels Charlier

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

So if I understand this correctly this means two additional changes.

1. Remove the ElementSet argument from CatalogStore.getRecords() method.
2. Remove CatalogStore.translateProperty()

(1) is trivial, more or less all calls to it passed in null. Made that change and tests are happy.

(2) is not so clear to me. At the moment there is code calling this method, and multiple implementations of it, one from AbstractCatalogStore and one from InternalCatalogStore, which appear to be doing different things. So it's not quite obvious to me what should be done here.

Oh no, only (1) needs to be changed; the second is indeed necessary, as I explained further in the email. I was just apologising for the missing javadoc for the method, which made it harder to understand.

Don't worry - the changes are in fact trivial.

Kind Regards
Niels

On Thu, Jan 24, 2013 at 6:56 AM, Niels Charlier <niels@anonymised.com> wrote:

So if I understand this correctly this means two additional changes.

1. Remove the ElementSet argument from CatalogStore.getRecords() method.
2. Remove CatalogStore.translateProperty()

(1) is trivial, more or less all calls to it passed in null. Made that
change and tests are happy.

(2) is not so clear to me. At the moment there is code calling this
method, and multiple implementations of of it, one from
AbstractCatalogStore and one from InternalCatalogStore which appear to be
doing different things. So it's not quite obvious to me what should be done
here.

Oh no, only 1 needs to be changed, the second is indeed necessary as I
explained further in the email. I was just apologising for the missing
javadoc for the method, making it harder to understand.

Ok cool. Thanks Niels. So (1) is indeed trivial, made that change.

For (2) all we need is a javadoc. I am still having a bit of trouble
understanding, mostly due to my inexperience with the catalog. But it sounds
something like this:

/**
 * Maps a qualified name to its equivalent property name used to query
 * the backend store.
 */

A couple of additional questions though.

1. This method would never be called by "client" code; it is more an
internal method used for the mapping?
2. How does this method relate to RecordDescriptor.translateProperty()?

Don't worry - the changes are in fact trivial.

Kind Regards
Niels

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Hi Justin,

In response to your questions.

On 01/25/2013 05:24 PM, Justin Deoliveira wrote:

1. This method would never be called by "client" code; it is more an internal method used for the mapping?

I think it is right that it's probably not really useful to "client" code; however, it is used by the GetDomain and GetRecords services, which don't know what kind of mapping they are working with.

2. How does this method relate to RecordDescriptor.translateProperty() ?

In the case of the InternalCatalogStore, the record store calls RecordDescriptor.translateProperty, which does not only convert from Name to PropertyName: it also changes the syntax of the property depending on the kind of RecordDescriptor you are dealing with, so that it ends up in the right form. For example, for DC /dc:value is added to the path, and for ISO the apiso 'short' property name is converted to a full path. In the case of the SimpleCatalogStore, however, this method isn't called, and a simple conversion from Name to PropertyName is all that is done here.
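To illustrate the difference described here, a toy sketch follows. The class, method names, and the lookup table are invented for illustration; the real translation lives in the RecordDescriptor implementations.

```java
import java.util.Map;

public class TranslatePropertySketch {

    // Invented example table: apiso 'short' queryable name -> full
    // property path (the real ISO paths differ).
    static final Map<String, String> ISO_FULL_PATHS = Map.of(
            "Title", "identificationInfo/citation/CI_Citation/title");

    // DC records: append /dc:value so the path addresses the value node
    // the mapping understands.
    static String translateDc(String name) {
        return name + "/dc:value";
    }

    // ISO records: expand the short queryable name to its full path,
    // falling back to the name itself when unknown.
    static String translateIso(String name) {
        return ISO_FULL_PATHS.getOrDefault(name, name);
    }

    public static void main(String[] args) {
        System.out.println(translateDc("title"));
        System.out.println(translateIso("Title"));
    }
}
```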

I hope this helps.

Kind Regards
Niels

Ok, I have updated the javadoc and pushed the discussed changes to the csw branch in my repo.

@Andrea: I'll wait to hear from you about when to merge, or whether you think we need a bit deeper review.

···

On Mon, Jan 28, 2013 at 3:44 AM, Niels Charlier <niels@anonymised.com> wrote:

Hi Justin,

In response to your questions.

On 01/25/2013 05:24 PM, Justin Deoliveira wrote:

  1. This method wold never be called by “client” code, it is more an internal method used to the mapping?

I think it is right that it’s probably not really useful to the “client” code, however it is used by the services GetDomain and GetRecords which don’t know what kind of mapping they are working with.

  2. How does this method relate to RecordDescriptor.translateProperty() ?

In the case of the InternalCatalogStore, the record store calls the method RecordDescriptor.translateProperty, which does not only convert from Name to PropertyName, it also makes changes to the syntax of the property depending on the kind of RecordDescriptor you are dealing with, so that it gets in the right form, for example for DC /dc:value is added to the path and for ISO the apiso ‘short’ property name is converted to a full path. In the case of the SimpleCatalogStore however, this method isn’t called, and a simple convert from Name to PropertyName is all that is done here.

I hope this helps.

Kind Regards
Niels


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Just a ping on this one, wondering how to move forward.

···

On Mon, Jan 28, 2013 at 5:28 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

Ok, i have updated the javadoc and pushed the discussed changes to the csw branch in my repo.

@Andrea: I’ll wait to hear from you about when to merge in or if you think we need a bit more deeper review.


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

On Mon, Jan 28, 2013 at 3:44 AM, Niels Charlier <niels@anonymised.com> wrote:

Hi Justin,

In response to your questions.

On 01/25/2013 05:24 PM, Justin Deoliveira wrote:

  1. This method wold never be called by “client” code, it is more an internal method used to the mapping?

I think it is right that it’s probably not really useful to the “client” code, however it is used by the services GetDomain and GetRecords which don’t know what kind of mapping they are working with.

  2. How does this method relate to RecordDescriptor.translateProperty()?

In the case of the InternalCatalogStore, the record store calls the method RecordDescriptor.translateProperty, which does not only convert from Name to PropertyName; it also changes the syntax of the property depending on the kind of RecordDescriptor you are dealing with, so that it ends up in the right form. For example, for DC, /dc:value is added to the path, and for ISO, the ‘short’ property name is converted to a full path. In the case of the SimpleCatalogStore, however, this method isn’t called, and a simple conversion from Name to PropertyName is all that is done.
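As a rough illustration of the difference (a sketch only: `Name`, `PropertyName`, and the translate methods below are simplified stand-ins, and the ISO path is invented; this is not the actual GeoServer or GeoTools code):

```java
// Illustrative sketch only: Name, PropertyName, and the translate methods
// are simplified stand-ins, not the real GeoTools/GeoServer types.
import java.util.Map;

public class TranslatePropertySketch {

    // A bare internal name, e.g. "title".
    record Name(String localPart) {}

    // An XPath-like property path as it appears in a query.
    record PropertyName(String path) {}

    // Dublin Core style: qualify the name and append /dc:value to the path.
    static PropertyName translateDublinCore(Name name) {
        return new PropertyName("dc:" + name.localPart() + "/dc:value");
    }

    // ISO style: expand a "short" property name to its full nested path.
    // The path below is invented for illustration.
    static final Map<String, String> ISO_SHORTCUTS = Map.of(
            "title", "gmd:identificationInfo/gmd:citation/gmd:title");

    static PropertyName translateIso(Name name) {
        return new PropertyName(
                ISO_SHORTCUTS.getOrDefault(name.localPart(), name.localPart()));
    }

    // SimpleCatalogStore style: no path rewriting, just a plain conversion.
    static PropertyName translateSimple(Name name) {
        return new PropertyName(name.localPart());
    }
}
```

The point is that only the record-descriptor translation rewrites the path syntax per profile; the simple conversion leaves the name untouched.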

I hope this helps.

Kind Regards
Niels


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

On Wed, Feb 6, 2013 at 5:15 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

Just a ping on this one, wondering how to move forward.

Justin,
unfortunately I did not find time to look into this further.
The idea that there are two separate ways to translate between the internal data model and
the exposed data model bugs me; it seems redundant and needlessly complicated.

However, as long as these are community modules, I’d say go ahead and we’ll
try to fix this design problem later…

Cheers
Andrea

==
Our support, Your Success! Visit http://opensdi.geo-solutions.it for more information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


OK, cool. I will commit the changes as is and wait for guidance on how to proceed further.




Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

On 02/06/2013 06:22 PM, Andrea Aime wrote:

The idea that there are two separate ways to translate between the internal data model and
the exposed data model bugs me, seems redundant and needlessly complicated.

What do you mean by two separate ways to translate between the internal and exposed data model?
I don't believe there are; I'm a bit puzzled by what you mean exactly.

Niels

On Thu, Feb 7, 2013 at 1:57 PM, Niels Charlier <niels@anonymised.com> wrote:

On 02/06/2013 06:22 PM, Andrea Aime wrote:

The idea that there are two separate ways to translate between the internal data model and
the exposed data model bugs me, seems redundant and needlessly complicated.

What do you mean by two separate ways to translate between the internal and exposed data model?
I don’t believe there are; I’m a bit puzzled by what you mean exactly.

The record descriptor provides one, and then you added another to the store.

Cheers
Andrea



Rather than being needlessly complicated, I believe I have caused confusion by giving these two methods the same name; I should not have done that. What they have in common is that they both convert from Name to PropertyName; other than that, they serve completely different purposes.

It is the method in the record descriptor that translates between the internal and exposed data model. The method in the store is merely there to tell what sort of conversion we need for a particular purpose; this is different in the two stores for very good reasons, which I could elaborate on. By creating the method, it is possible to keep a bunch of other code the same for both stores. It has all been part of the process of making the code more generic, preventing copy-pasting, and simplifying, not complicating, things. I should be more careful choosing names to avoid confusion, though.

Of course I’d be happy with any ideas or suggestions to improve the code.

Niels


Thanks for the explanation, Niels. Some comments/questions inline.

···

On Thu, Feb 7, 2013 at 6:18 AM, Niels Charlier <niels@anonymised.com> wrote:

Rather than being needlessly complicated, I believe I have caused confusion by giving these two methods the same name; I should not have done that. What they have in common is that they both convert from Name to PropertyName; other than that, they serve completely different purposes.

Right, so let’s rename the method. What would be a good name?

It is the method in the record descriptor that translates between the internal and exposed data model.

The method in the store is merely there to tell what sort of conversion we need for a particular purpose; this is different in the two stores for very good reasons, which I could elaborate on. By creating the method, it is possible to keep a bunch of other code the same for both stores. It has all been part of the process of making the code more generic, preventing copy-pasting, and simplifying, not complicating, things. I should be more careful choosing names to avoid confusion, though.

Makes sense. Maybe you could elaborate on how the method operates differently in the two stores, if only so we can come up with a suitable name?

Question: for RecordDescriptor.translateProperty, is the “translation” one way, from the internal model to the exposed model? And similarly for the CatalogStore.translateProperty method, is the mapping from the exposed model to the internal model?

Of course I’d be happy with any ideas or suggestions to improve the code.

Niels


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Hi Justin,

Sorry for the late response; I lost track of this a bit.

So, it is being used for GetRecords with property selection. It is also used for GetDomain, but that is just a special case of the same thing.
The simple catalog store will select the properties after building the feature, using RetypingFeatureCollection, as was already done. This is the easiest way to do it in this case. The user’s specified property names are the ones we need.

In the case of the internal catalog store, the situation is different. You cannot use the RetypingFeatureCollection because we are working with a nested complex feature structure here, and the properties can be paths. However, the properties can also be names that are a sort of shortcut for the paths and don’t reflect the structure of the feature at all (see the ISO metadata standard). And why build more of the feature than necessary? It can make things considerably slower in this case. So that is why I select the properties at the level of the mapping: I cut down the mapping. It makes the most sense here. So now the property names need to be understood by the mapping, i.e. we need a full translation of the property.
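The contrast between the two selection strategies can be sketched roughly like this (a toy model only: records are plain maps here, and the method names are invented, not the actual store code):

```java
// Illustrative sketch only: a toy "record" is a Map of property path -> value;
// the real code works on (possibly complex) features, not maps.
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import java.util.stream.Collectors;

public class PropertySelectionSketch {

    // Simple store strategy: build the full record first, then retype it down
    // to the requested properties (analogous to RetypingFeatureCollection).
    static Map<String, Object> selectAfterBuild(Map<String, Object> fullRecord,
                                                List<String> requested) {
        return fullRecord.entrySet().stream()
                .filter(e -> requested.contains(e.getKey()))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    // Internal store strategy: cut the mapping down first, so values for
    // unrequested properties are never computed at all.
    static Map<String, Object> selectByTrimmingMapping(
            Map<String, Supplier<Object>> mapping, List<String> requested) {
        return mapping.entrySet().stream()
                .filter(e -> requested.contains(e.getKey()))
                .collect(Collectors.toMap(Map.Entry::getKey,
                                          e -> e.getValue().get()));
    }
}
```

The point of the second variant is that values for properties outside the selection are never computed, which is why trimming the mapping avoids building too much of the feature.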

I guess you could call the method preparePropertyForSelection or something. The reason I made an extra method is that the query wants PropertyNames, not Names, so they need to be passed as such in the query by whoever is calling the RecordStore. An alternative approach that would work in this case is to only do the basic conversion from PropertyName to Name and then let the internal catalog store do the extra conversion at the beginning of the getRecordsInternal method. If you really want to get rid of it, that seems like it should work as well, now that I look at it again. I seem to remember there was a reason I didn’t do that, though I can’t quite recall it. The way I did it seemed the most logical at the time, working from the existing code to make it more generic and to support future alternatives.
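The hook-based design described above can be sketched as follows (all names, including preparePropertyForSelection and buildQueryProperties, are invented for illustration and do not match the actual CatalogStore API):

```java
// Illustrative sketch of the hook-based design: the shared query-building code
// is identical for both stores, and each store only overrides the hook.
import java.util.List;
import java.util.stream.Collectors;

public class StoreHookSketch {

    interface CatalogStore {
        // Hook: turn an internal name into whatever form the query needs.
        // Default (simple store): a plain conversion, no path rewriting.
        default String preparePropertyForSelection(String name) {
            return name;
        }
    }

    // Internal store: needs a full translation so the mapping understands it.
    static class InternalStore implements CatalogStore {
        @Override
        public String preparePropertyForSelection(String name) {
            return name + "/dc:value"; // stand-in for the real path rewriting
        }
    }

    // Shared code, kept the same for both stores thanks to the hook.
    static List<String> buildQueryProperties(CatalogStore store,
                                             List<String> names) {
        return names.stream()
                .map(store::preparePropertyForSelection)
                .collect(Collectors.toList());
    }
}
```

The shared code path stays identical for both stores; only the hook differs, which is the rationale for having the extra method.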

For RecordDescriptor: no, it is the other way around, from exposed to internal.
For CatalogStore: in the case of the simple store (or the default case), it does not do this; it merely converts from Name to PropertyName.

Cheers
Niels
