Hi,
the GetRecords operation is proving to be a bit of a hard nut to crack due to some
of its “features”.
Basically, a GetRecords operation is quite similar to a GetFeature one, in that
you can ask for several record types at once, e.g., both Dublin Core and
ISO, which are structurally different.
What we are going to do is turn each requested type into a Query object and then
ask the CatalogStore for the records; the store can respond the way it wants:
it may decide that it does not have any ISO records, for example, but only some
csw:Record ones, or it may have an internal model mapping to both representations,
in which case it would be able to respond to both queries (actually, if a store
targets ISO it has to respond to Dublin Core too, since that is mandatory).
In the latter case we could be returning the same information twice, in two
different formats.
That per se would not be the end of the world, if it wasn’t for the fact that there
is a third parameter, outputSchema, that controls how the records get encoded.
The outputSchema defaults to the csw:Record representation, but one could
ask for ISO or ebRIM.
So, see, one can query csw:Record but then have it returned as an ISO representation,
or ask for ISO and ebRIM and have them returned as csw:Record… holy mess!
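To make the mismatch concrete, here is a minimal sketch of the two independent knobs. GetRecordsRequestSketch is a made-up stand-in, not a GeoServer class, and gmd:MD_Metadata is used as the ISO type name for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for a parsed GetRecords request.
final class GetRecordsRequestSketch {
    static final String CSW_RECORD_SCHEMA = "http://www.opengis.net/cat/csw/2.0.2";
    static final String ISO_SCHEMA = "http://www.isotc211.org/2005/gmd";

    final List<String> typeNames; // what is being queried
    final String outputSchema;    // how the results get encoded

    GetRecordsRequestSketch(List<String> typeNames, String outputSchema) {
        this.typeNames = typeNames;
        // outputSchema falls back to the csw:Record representation when omitted
        this.outputSchema = outputSchema == null ? CSW_RECORD_SCHEMA : outputSchema;
    }

    // One line per queried type: the type name and the schema its results are encoded in.
    List<String> describe() {
        List<String> lines = new ArrayList<String>();
        for (String typeName : typeNames) {
            lines.add(typeName + " -> " + outputSchema);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Query both record types, but ask for everything back as ISO.
        GetRecordsRequestSketch request = new GetRecordsRequestSketch(
                Arrays.asList("csw:Record", "gmd:MD_Metadata"), ISO_SCHEMA);
        for (String line : request.describe()) {
            System.out.println(line);
        }
    }
}
```

Every queried type name ends up paired with the same outputSchema, whatever its native representation is.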
Now, I’ve tried to see what other CSW implementations do in this case.
GeoNetwork has one and only one internal representation (which might or might not
be equal to one of the output formats), and then uses XSLT to produce the canonical outputs.
The XSLT is configurable since the internal representation can vary.
PyCSW has internal representations that are equal to some of the canonical outputs,
and seems to have some sort of universal translator that converts between types.
I was leaning a bit towards the second approach, which we could programmatically
execute against the features containing the records, but when I discussed it with
Emanuele he pointed out some severe limitations of that approach:
- ebRIM/EO and csw:Record are almost impossible to translate to each other in
  a generic manner, as they contain completely different information
- csw:Record can hardly be translated to ISO in a compliant way, as we would
  not have enough info to build all of the compulsory ISO fields out of the
  Dublin Core representation
In the end it's the CatalogStore itself that is best placed to do such a transformation,
since it has the internal model handy, so it can first translate the Query against
its internal model, execute it, and then convert the internal model to the desired
representation. For example, the store working against GeoServer's own catalog
could be queried with csw:Record but with outputSchema http://www.isotc211.org/2005/gmd:
it would then translate the csw:Record query against its internal model, run it,
and then encode the results as ISO records using the full set of information
available in the internal model.
This would result in a modification of CatalogStore from:
FeatureCollection getRecords(Query q, Transaction t) throws IOException;
to
FeatureCollection getRecords(Query q, FeatureType targetSchema, Transaction t) throws IOException;
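As a rough illustration of the translate, execute, encode sequence, here is a toy in-memory store. Everything in it is hypothetical: plain strings and maps stand in for Query, FeatureType and FeatureCollection, the Transaction is left out, and the property names are flattened for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical internal model, richer than either output representation.
final class InternalRecord {
    final String id;
    final String title;
    final String contact;

    InternalRecord(String id, String title, String contact) {
        this.id = id;
        this.title = title;
        this.contact = contact;
    }
}

// Toy store: the query arrives in one schema, the results leave in another.
final class InMemoryStoreSketch {
    static final String CSW = "http://www.opengis.net/cat/csw/2.0.2";
    static final String ISO = "http://www.isotc211.org/2005/gmd";

    private final List<InternalRecord> records = new ArrayList<InternalRecord>();

    void add(InternalRecord record) {
        records.add(record);
    }

    // Simplified stand-in for getRecords(Query q, FeatureType targetSchema, Transaction t).
    List<Map<String, String>> getRecords(String queryProperty, String contains, String targetSchema) {
        // Step 1: translate the queried property to the internal model.
        boolean onTitle = "dc:title".equals(queryProperty) || "gmd:title".equals(queryProperty);
        boolean onId = "dc:identifier".equals(queryProperty) || "gmd:fileIdentifier".equals(queryProperty);
        if (!onTitle && !onId) {
            throw new IllegalArgumentException("Unsupported property: " + queryProperty);
        }
        List<Map<String, String>> result = new ArrayList<Map<String, String>>();
        for (InternalRecord record : records) {
            // Step 2: execute the filter against the internal model.
            String value = onTitle ? record.title : record.id;
            if (value.contains(contains)) {
                // Step 3: encode from the full internal model into the target schema.
                result.add(encode(record, targetSchema));
            }
        }
        return result;
    }

    private static Map<String, String> encode(InternalRecord record, String targetSchema) {
        Map<String, String> out = new LinkedHashMap<String, String>();
        if (ISO.equals(targetSchema)) {
            out.put("gmd:fileIdentifier", record.id);
            out.put("gmd:title", record.title);
            out.put("gmd:contact", record.contact); // only available from the internal model
        } else {
            out.put("dc:identifier", record.id);
            out.put("dc:title", record.title);
        }
        return out;
    }

    public static void main(String[] args) {
        InMemoryStoreSketch store = new InMemoryStoreSketch();
        store.add(new InternalRecord("r1", "Roads of Tuscany", "someone@example.org"));
        // Query in csw:Record terms, encode as ISO.
        System.out.println(store.getRecords("dc:title", "Roads", ISO));
    }
}
```

The point of the sketch is that the ISO output can carry the contact even though the query was phrased in csw:Record terms, because the encoding starts from the internal model and not from the Dublin Core view.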
Now, doing this solves one problem but leaves another potential one open.
If I ask for both csw:Record and ISO with ISO as the output, it is most likely that the
result will contain the same records duplicated…
I guess it is such a corner case that we probably should not bother; imho the
GetRecords request is ill-posed to start with, but I wanted to gather some opinions about it
nevertheless.
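Should we decide to bother after all, one option would be to merge the per-query results on the record identifier. The sketch below assumes the store exposes the same identifier for a record regardless of which type was queried; the class and method names are made up, and maps stand in for the encoded records.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of any existing API.
final class RecordDedupSketch {

    // Concatenates the per-query results, keeping only the first record seen for each identifier.
    static List<Map<String, String>> mergeById(List<List<Map<String, String>>> perQueryResults,
                                               String idKey) {
        Set<String> seen = new HashSet<String>();
        List<Map<String, String>> merged = new ArrayList<Map<String, String>>();
        for (List<Map<String, String>> results : perQueryResults) {
            for (Map<String, String> record : results) {
                String id = record.get(idKey);
                // Records without an identifier cannot be compared, so they are always kept.
                if (id == null || seen.add(id)) {
                    merged.add(record);
                }
            }
        }
        return merged;
    }

    private static Map<String, String> record(String id) {
        Map<String, String> r = new LinkedHashMap<String, String>();
        r.put("gmd:fileIdentifier", id);
        return r;
    }

    public static void main(String[] args) {
        // Results of the csw:Record query and of the ISO query, both encoded as ISO.
        List<Map<String, String>> fromCswQuery = new ArrayList<Map<String, String>>();
        fromCswQuery.add(record("r1"));
        fromCswQuery.add(record("r2"));
        List<Map<String, String>> fromIsoQuery = new ArrayList<Map<String, String>>();
        fromIsoQuery.add(record("r2"));
        fromIsoQuery.add(record("r3"));

        List<List<Map<String, String>>> all = new ArrayList<List<Map<String, String>>>();
        all.add(fromCswQuery);
        all.add(fromIsoQuery);
        System.out.println(mergeById(all, "gmd:fileIdentifier").size());
    }
}
```

This only removes duplicates after the fact; it does nothing about record counts or paging computed per query, which would need separate handling.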
Soo… what do you think?
Cheers
Andrea
Ing. Andrea Aime
@geowolf
Technical Lead
GeoSolutions S.A.S.