On Thu, Jan 17, 2013 at 2:10 AM, Justin Deoliveira <jdeolive@anonymised.com> wrote:
Hi all,
I spent a few hours today reviewing the pending pull request from Niels that adds support for iso application profile and a catalog store based directly from the geoserver catalog to the existing csw community module.
https://github.com/geoserver/geoserver/pull/84/files
Hi,
just having a look at it right now.
All in all it looks great. Big thanks to Niels and Andrea for all this work. I did however run into a few hiccups when trying to build the modules with tests.
- UTF-8 vs ISO-8859-1 encoding issues
A few tests fail with error message complaining about invalid UTF-8 byte sequences. Some of the test files contain some extended latin characters. For instance:
https://github.com/geoserver/geoserver/blob/master/src/community/csw/csw-core/src/test/resources/org/geoserver/csw/records/Record_e9330592-0932-474b-be34-c3a3bb67c7db.xml
The error occurs when one of the tests tries to parse a GetRecord response that contains the record. The resulting doc reports a UTF-8 encoding and the getAsDOM method fails to parse it. Not sure the best way to solve this one but i was able to fix the test with two approaches.
a. Change the record transformer to report a document encoding of ISO-8859-1
b. Specify manually the ISO-8859-1 encoding when parsing the response, essentially ovrerriding the UTF-8 encoding reported on the doc
This is an example of one of the tests that fail.
https://github.com/geoserver/geoserver/blob/master/src/community/csw/csw-core/src/test/java/org/geoserver/csw/GetRecordsTest.java#L243
Hmm… that file comes straigth from CITE tests data for CSW 2.0.2, did you try running
the CITE tests after the change?
I have no failures due to encoding, but I’m seeing those:
testAllRecords(org.geoserver.csw.store.internal.GetRecordsTest): expected:<[abstract about Forests]> but was:<>
testAllRecords(org.geoserver.csw.records.iso.GetRecordsTest): expected:<[abstract about Forests]> but was:<>
testAllRecordsBrief(org.geoserver.csw.records.iso.GetRecordsTest): expected:<[urn:x-ogc:def:crs:EPSG:6.11:4326]> but was:<>
The first two you talk about later, simple fix, the last one seems new though, did you experience it as well?
- Location of schema files
The pull request moves the csw schema files from the classpath root to a package prefix of net/opengis so that AppSchemaResolver can find them. But this then causes the classpath publisher to fail to find the and makes it so that the schemas are no longer web accessible which causes a test to fail.
An obvious solution would be to just copy the files back to the root, keeping them in both places. But I wonder if AppSchemaResolver can be configured to load them without the package prefix. Or we could potentially modify ClasspathPublisher and add the ability to add a package prefix.
Imho it would be best to fix the AppSchemaResolver, since every other schema in GeoServer is already
published without issues with the ClasspathPublisher
- org.geoserver.csw.records.iso.GetRecordsTest failures
Some of the ISO GetRecords tests fail as they make assertions assuming that the “cite:Forests” record will be in the first page of results. Easy fix seemed to just increase the page size by using maxRecords to ensure all results returned.
And that is it. Aside from those issues the modules build just fine. I have pushed some changes up to my github that contain fixes for these issues:
https://github.com/jdeolive/geoserver/tree/csw
But i am not confident on the fixes for issues 1 and 2 above and would like to hear from Andrea and Niels.
I also took the liberty of fixing some minor issues as well including:
- cleaned up formatting issues with changes to main module catalog classes (tabs vs spaces)
- removed csw- prefix from csw modules, as per convention with other multi module roots like web, etc… the directory tree now looks like:
csw/
api/
core/
simple-store/
The artifact ids retain the csw prefix though.
This works for me.
Here is some more I’ve found during the review.
There’s a change in DefaultCatalogFacade.java that I’m not sure about:
if (null != sortByList) {
for (int i = sortByList.length - 1; i >=0 ; i–) {
SortBy sortBy = sortByList[i];
Ordering ordering = Ordering.from(comparator(sortBy));
if (SortOrder.DESCENDING.equals(sortBy.getSortOrder())) {
ordering = ordering.reverse();
}
all = ordering.sortedCopy(all);
}
}
What guarantees are there that sorting the same list multiple times we’ll get
to a result that is ordered on sortBy1, then sortBy2, and so on?
What I’m trying to say is, the second time we sort the algorithm might as well
ruin the partial ordering established by the first sort… the Java 6 algorithm might
work, but Java 7 has a different sorting algorithm (TimSort), not sure if the
multiple sorts are going to be preserved there.
Anyways, no big deal, I guess it can be handled as a bug later, if it turns
out to misbehave.
What’s a bit more troublesome is changes in the CatalogStore interface.
Mind, I’m not against them per se, but in order to change a interface that was
discussed openly on the ml I would have expected the same opennes, or at
least some explanation on why they happened, instead I’m looking at the
diff trying to understand them.
The changes propagate in other portions as well, it seems bits of the CSW
service code have been refactored, again, without providing any clue as to why.
I’m not saying the changes are right or wrong (on the contraty, I’m confident
they have been done for good reasons), but they should have been
discussed: since the patch is out already, can we have some rationale about them?
It would make reviewing the patch a lot easier.
Cheers
Andrea
–
Ing. Andrea Aime
@geowolf
Technical Lead
GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549
http://www.geo-solutions.it
http://twitter.com/geosolutions_it