[GeoNetwork-users] Paging of results from CSW GETRECORDS

Let me make this a simpler question: does GeoNetwork support paged
delivery of results from a GetRecords request that returns 100K+ hits
(ISO profile)? That is, so that the result does not arrive in a single
HTTP response body (XML).

I see in the CSW 2.0.2 specification that it APPEARS that I could use the
'startPosition' parameter of the GetRecords request as a sort of
database cursor, and use 'nextRecord' and 'numberOfRecordsReturned'
in the response as the handshake to walk the results.

I have already found out that deegree does not support this out of the
box.

I am concerned that the actual implementation in GeoNetwork does not
maintain state properly to be able to do this sort of cursor-based
walking of a very large GetRecords response. (Would it use the
'requestId' to keep a stateful partial response, or would it run the
query for the 1M records all over again?)
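
For reference, here is a minimal Python sketch of reading the handshake values mentioned above. In CSW 2.0.2 they are attributes of the csw:SearchResults element, and a nextRecord of 0 means no records remain. The sample response is hand-written for illustration, not captured from a real server, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"


def read_paging_info(response_xml):
    """Return (matched, returned, next_record) from a GetRecordsResponse."""
    root = ET.fromstring(response_xml)
    results = root.find(f"{{{CSW_NS}}}SearchResults")
    if results is None:
        raise ValueError("no csw:SearchResults element in response")
    return (
        int(results.get("numberOfRecordsMatched")),
        int(results.get("numberOfRecordsReturned")),
        int(results.get("nextRecord")),
    )


# Hand-written sample response (an assumption, not real server output).
SAMPLE = """<csw:GetRecordsResponse xmlns:csw="http://www.opengis.net/cat/csw/2.0.2">
  <csw:SearchStatus timestamp="2011-12-07T21:42:00"/>
  <csw:SearchResults numberOfRecordsMatched="100000"
      numberOfRecordsReturned="100" nextRecord="301" elementSet="full"/>
</csw:GetRecordsResponse>"""

matched, returned, next_record = read_paging_info(SAMPLE)
# A client would send the next request with startPosition=next_record,
# and stop once next_record is 0.
```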

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Y.Gutfreund, Ph.D.

Principal Member of Technical Staff

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Yes, it is possible in GeoNetwork. You can try building your request XML as
follows:

        <?xml version="1.0"?>
        <csw:GetRecords
            xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
            xmlns:gmd="http://www.isotc211.org/2005/gmd"
            service="CSW" version="2.0.2" resultType="results"
            outputSchema="csw:IsoRecord"
            maxRecords="100" startPosition="200">
          <!-- maxRecords is the page size; startPosition is the 1-based
               index of the first record to return -->
          <csw:Query typeNames="gmd:MD_Metadata">
            <csw:Constraint version="1.1.0">
              <Filter xmlns="http://www.opengis.net/ogc"
                  xmlns:gml="http://www.opengis.net/gml"/>
            </csw:Constraint>
          </csw:Query>
        </csw:GetRecords>

Then put this in a loop and increment startPosition by maxRecords on each
request.
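
A rough Python sketch of such a loop follows. It takes the next startPosition from the nextRecord attribute of each response instead of computing it by hand. The `fetch` callable is an assumption: it stands for whatever sends the request XML to your CSW endpoint and returns the response body. `fake_fetch` is a stand-in with made-up records so the sketch runs without a server.

```python
import xml.etree.ElementTree as ET

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"

REQUEST_TEMPLATE = """<?xml version="1.0"?>
<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    service="CSW" version="2.0.2" resultType="results"
    outputSchema="csw:IsoRecord"
    maxRecords="{max_records}" startPosition="{start}">
  <csw:Query typeNames="gmd:MD_Metadata">
    <csw:Constraint version="1.1.0">
      <Filter xmlns="http://www.opengis.net/ogc"
          xmlns:gml="http://www.opengis.net/gml"/>
    </csw:Constraint>
  </csw:Query>
</csw:GetRecords>"""


def iter_pages(fetch, max_records=100):
    """Yield one list of record elements per GetRecords response."""
    start = 1  # CSW startPosition is 1-based
    while True:
        request = REQUEST_TEMPLATE.format(max_records=max_records, start=start)
        root = ET.fromstring(fetch(request))
        results = root.find(f"{{{CSW_NS}}}SearchResults")
        if results is None:
            raise ValueError("no csw:SearchResults element in response")
        yield list(results)
        returned = int(results.get("numberOfRecordsReturned", "0"))
        next_record = int(results.get("nextRecord", "0"))
        # nextRecord == 0 means done; also stop if the server does not advance.
        if returned == 0 or next_record <= start:
            break
        start = next_record


def fake_fetch(request_xml, total=250):
    """Stand-in for a real server: pretends the catalogue holds `total` records."""
    req = ET.fromstring(request_xml)
    start = int(req.get("startPosition"))
    size = int(req.get("maxRecords"))
    count = max(0, min(size, total - start + 1))
    nxt = start + count if start + count <= total else 0
    records = "".join(f"<rec id='{i}'/>" for i in range(start, start + count))
    return (
        f'<csw:GetRecordsResponse xmlns:csw="{CSW_NS}">'
        f'<csw:SearchResults numberOfRecordsMatched="{total}" '
        f'numberOfRecordsReturned="{count}" nextRecord="{nxt}">'
        f"{records}</csw:SearchResults></csw:GetRecordsResponse>"
    )


page_sizes = [len(page) for page in iter_pages(fake_fetch)]
```

With a real server, `fetch` would typically be an HTTP POST of the request body with an XML content type. Note that this loop only covers the client side; whether the server re-runs the query for every page is a separate matter.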

Kumaran
-----Original Message-----
From: Gutfreund, Yechezkal [mailto:ygutfreund@anonymised.com]
Sent: Wednesday, December 07, 2011 9:42 PM
To: geonetwork-users@lists.sourceforge.net
Subject: [GeoNetwork-users] Paging of results from CSW GETRECORDS

_______________________________________________
GeoNetwork-users mailing list
GeoNetwork-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-users
GeoNetwork OpenSource is maintained at
http://sourceforge.net/projects/geonetwork

That is what I would expect from the syntax of the 2.0.2 spec. But I
was wondering whether the GeoNetwork server actually maintains state
information between HTTP calls (and what it does to maintain this
state and context), so that a complex filter (using spatial intersection,
attributes, etc.) is not run over and over again for the 10M records
that we have in our PostGIS database (which we copied into the
GeoNetwork database).
It would seem that the actual server would have to create a thread at
the first call (and return a quick short list right away to avoid the
timeout) and then continue to respond to subsequent requests each time
the "same query" (whatever that means) is invoked with a new start
position.

I know the deegree folks said that is what they would have to do, and
they asked for $$ to implement it.
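
To make the "same query" idea concrete, here is a purely hypothetical Python sketch of one way a server could avoid re-running the filter: run it once, keep the matching record ids under a key derived from the query text, and serve each page as a slice. It says nothing about what GeoNetwork or deegree actually do. The class name, the whitespace-only normalisation, and the `run_query` callable are all assumptions made for illustration.

```python
import hashlib
from collections import OrderedDict


class ResultCache:
    """Hypothetical cache: run each distinct query once, page over stored ids."""

    def __init__(self, run_query, max_entries=32):
        self._run_query = run_query    # callable: query text -> list of record ids
        self._entries = OrderedDict()  # query key -> list of ids, oldest first
        self._max_entries = max_entries

    @staticmethod
    def key_for(query_text):
        # Assumption: two queries are "the same" if they match after
        # collapsing whitespace. A real server would need a stricter rule.
        normalised = " ".join(query_text.split())
        return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

    def page(self, query_text, start_position, max_records):
        """Return (ids for this page, nextRecord); nextRecord is 0 when done."""
        key = self.key_for(query_text)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as recently used
            ids = self._entries[key]
        else:
            ids = self._run_query(query_text)  # the expensive step, done once
            self._entries[key] = ids
            if len(self._entries) > self._max_entries:
                self._entries.popitem(last=False)  # evict least recently used
        first = start_position - 1  # startPosition is 1-based
        chunk = ids[first:first + max_records]
        end = first + len(chunk)
        next_record = end + 1 if end < len(ids) else 0
        return chunk, next_record
```

The obvious costs of this approach are memory for the stored id lists and stale pages if the catalogue changes between calls, which may be part of why a server would choose to stay stateless.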

-----Original Message-----
From: Kumaran Narayanaswamy
[mailto:kumaran.narayanaswamy@anonymised.com]
Sent: Wednesday, December 07, 2011 1:02 PM
To: Gutfreund, Yechezkal
Cc: geonetwork-users@lists.sourceforge.net
Subject: RE: [GeoNetwork-users] Paging of results from CSW GETRECORDS
