[GeoNetwork-devel] issues with OAI implementation

Hi,

While tracking down some OAI interoperability issues between jOAI and GeoNetwork, I noticed that the OAI implementation in GeoNetwork might have some problems.

I was wondering if somebody familiar with the implementation is around to discuss.

I have noticed that the implementation of the "resumptionToken" has problems. GeoNetwork refuses to be harvested by non-GeoNetwork OAI harvesters, complaining about "No session for token". This is because GeoNetwork keeps the state of an OAI conversation in a userSession, which is fine for browsers but not for other types of user agents.

The only way to maintain state in OAI is by using the resumptionToken. However, GeoNetwork requires the client to maintain the HTTP session in order to be able to retrieve the result set corresponding to a resumptionToken.

The resumptionToken allows a (large) OAI reply to be split up into several chunks (a bit like paging). By supplying the resumptionToken from the previous chunk, the client scrolls forward through the cursor.
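
To make the intended flow concrete, here is a minimal sketch of a harvester loop that carries no state other than the resumptionToken (no cookies, no jsessionid). The `fetch` callable and the `fake_fetch` stand-in with its two canned pages are hypothetical and only serve to make the example self-contained; a real harvester would issue HTTP GET requests instead.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest(fetch, verb="ListIdentifiers", metadata_prefix="oai_dc"):
    """Collect identifiers across all pages, using only the resumptionToken as state."""
    params = {"verb": verb, "metadataPrefix": metadata_prefix}
    identifiers = []
    while True:
        root = ET.fromstring(fetch(params))
        identifiers += [e.text for e in root.iter(OAI_NS + "identifier")]
        token = root.find(f".//{OAI_NS}resumptionToken")
        # A missing or empty resumptionToken marks the end of the list.
        if token is None or not (token.text or "").strip():
            break
        # Follow-up requests carry only the verb and the token.
        params = {"verb": verb, "resumptionToken": token.text.strip()}
    return identifiers


def fake_fetch(params):
    """Hypothetical stand-in for an HTTP request: serves two canned pages."""
    pages = {None: (["oai:a:1", "oai:a:2"], "page2"), "page2": (["oai:a:3"], "")}
    ids, next_token = pages[params.get("resumptionToken")]
    headers = "".join(f"<header><identifier>{i}</identifier></header>" for i in ids)
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>'
        f"{headers}<resumptionToken>{next_token}</resumptionToken>"
        "</ListIdentifiers></OAI-PMH>"
    )


all_ids = harvest(fake_fetch)
```

A harvester written this way never sends a session cookie, so it works only if the provider can resolve the token on its own.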

In GeoNetwork the result set corresponding to an OAI request is stored in the userSession. This means that a session has to be created and maintained in order for the resumptionToken to work. AFAIK GeoNetwork uses jsessionid for this. The implication is that, for the resumptionToken to work, the client also has to maintain state via jsessionid, which contradicts the idea of the resumptionToken being the only way to maintain state.

What is more, only one OAI result set can be stored in the userSession, with a newer one overwriting an older one. This means that one cannot have multiple OAI searches running in the same browser session.
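
One possible way around both limitations is to make the token self-describing, so that the server can rebuild the query from the token alone and any number of harvests can run in parallel. The following is only a sketch under that assumption, not a description of how GeoNetwork works or should work; the function names and the JSON layout are made up for illustration, and a real implementation would also want to validate or sign the token.

```python
import base64
import json


def encode_token(metadata_prefix, set_spec, from_date, until_date, cursor):
    """Pack the original request arguments and the next offset into an opaque token."""
    state = {
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
        "from": from_date,
        "until": until_date,
        "cursor": cursor,
    }
    raw = json.dumps(state, separators=(",", ":"), sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token):
    """Recover the request state from a token; no server-side session is involved."""
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))


# Two independent harvests can hold tokens at the same time without clashing.
token_a = encode_token("oai_dc", "maps", "2010-01-01", None, 100)
token_b = encode_token("iso19139", None, None, None, 50)
state_a = decode_token(token_a)
state_b = decode_token(token_b)
```

An alternative with the same effect would be a server-side cache keyed by the token itself rather than by the HTTP session.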

A second problem concerns GeoNetwork's validation of OAI replies from other OAI providers. For example, the attached file shows a reply to a ListSets request. This reply cannot be validated against http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd alone, since the <setDescription> element contains content from another namespace (oai_dc), which also has to be validated (the XML Schema declares it as strict). However, the corresponding XML Schema file is of course not known to GeoNetwork.
This could be fixed by including the oai_dc schema in OAI-PMH.xsd, but since <setDescription> can contain any namespace, this would not be a generic solution.
I don't know how XML validation of subtrees with arbitrary (and unknown) namespaces is supposed to work.
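
One mechanism XML Schema offers for this case is the xsi:schemaLocation attribute: the instance document lists pairs of namespace URI and schema location, which a validator may use as hints to load schemas it does not already know. The sketch below only collects those hints from a reply; the sample document is a hand-written illustration (not the attached jOAI response), and whether GeoNetwork's validator is configured to follow such hints is an open assumption.

```python
import xml.etree.ElementTree as ET

XSI_SCHEMA_LOCATION = "{http://www.w3.org/2001/XMLSchema-instance}schemaLocation"


def schema_hints(xml_text):
    """Return {namespace: schema URL} gathered from every xsi:schemaLocation attribute."""
    hints = {}
    for elem in ET.fromstring(xml_text).iter():
        parts = elem.get(XSI_SCHEMA_LOCATION, "").split()
        # The attribute value is a whitespace-separated list of (namespace, location) pairs.
        hints.update(zip(parts[0::2], parts[1::2]))
    return hints


# Hand-written sample reply, for illustration only.
sample = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <ListSets><set><setSpec>demo</setSpec><setName>Demo</setName>
    <setDescription>
      <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
          xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"/>
    </setDescription></set></ListSets>
</OAI-PMH>"""

hints = schema_hints(sample)
```

With the hints in hand, a validator could fetch (or look up in a local catalog) the schema for each extra namespace before validating the subtree.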

In GeoNetwork the practical consequence is that the categories cannot be retrieved from, e.g., jOAI via ListSets, because GeoNetwork cannot validate the reply.

Sorry for the long email.
I hope somebody can enlighten me.

Best regards,
Timo

(attachments)

joai_listsets_response.xml (882 Bytes)