We are exploring using CSW to coordinate the holdings of multiple very
large PostGIS databases (100K+ and now approaching 1M entries each), where
each entry is a reference to a separate image file with a different
bounding box. I think he is crazy, but one of our senior people is pushing
hard for us to use CSW to exchange the data.
The GetRecords search query would be fairly complex. The query would be
geospatial (the coverage bounding box) plus some attribute matching. The
response could be on the order of 10K to 100K items.
I see in the CSW 2.0.2 specification that it APPEARS that I could use the
'startPosition' parameter of the GetRecords request as a sort of
database cursor, and use the 'nextRecord' and 'numberOfRecordsReturned'
attributes in the response as the handshake to walk the results.
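
To make the handshake concrete, here is a minimal client-side sketch of
that walk. It is only an illustration under assumptions: 'fetch_page' is a
hypothetical callable (not part of any CSW library) that sends a GetRecords
request with the given 'startPosition' and 'maxRecords' and returns the
response XML as a string, and the loop treats nextRecord="0" as "no more
records", which is how I read the 2.0.2 spec. Nothing here depends on the
server keeping state; each page is an independent request.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator, Tuple

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"


def parse_search_results(xml_text: str) -> Tuple[int, int, int]:
    """Return (matched, returned, next_record) from a GetRecordsResponse."""
    root = ET.fromstring(xml_text)
    results = root.find(f"{{{CSW_NS}}}SearchResults")
    if results is None:
        raise ValueError("response has no csw:SearchResults element")
    matched = int(results.get("numberOfRecordsMatched", "0"))
    returned = int(results.get("numberOfRecordsReturned", "0"))
    next_record = int(results.get("nextRecord", "0"))
    return matched, returned, next_record


def walk_pages(
    fetch_page: Callable[[int, int], str],  # hypothetical: (startPosition, maxRecords) -> XML
    page_size: int = 500,
) -> Iterator[Tuple[int, int]]:
    """Yield (start_position, records_returned) for each page until exhausted."""
    start = 1  # CSW positions are 1-based
    while True:
        matched, returned, next_record = parse_search_results(
            fetch_page(start, page_size)
        )
        yield start, returned
        # Stop on an empty page, on nextRecord=0, or (defensively) if the
        # server reports a next position past the end or not moving forward.
        if returned == 0 or next_record == 0:
            break
        if next_record > matched or next_record <= start:
            break
        start = next_record
```

Whether pages stay consistent between requests is a separate question: if
the server does not apply a stable ordering (for example via a SortBy in the
query), records could in principle shift between pages while walking.
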
But I am concerned that the actual implementations in deegree and
GeoNetwork do not maintain state properly to support this sort of
cursor-based walking of a very large GetRecords response.
(Would they use the 'requestId' to keep a stateful partial response,
or would they run the query for the 1M records all over again?)
Can anyone tell me if this is indeed the case? Should I really tell him
that using CSW is crazy and we need to develop a different protocol?