[Geoserver-devel] Fixing how hits/numberMatched works in WFS 2.0 (and some considerations about WFS 1.0 and 1.1 too)

Hi,
we have had this ticket open for some time about how numberMatched is not exactly working as
it should in WFS 2.0:
http://jira.codehaus.org/browse/GEOS-6187

Looking at it I’ve reviewed the standards and the implementation to find a good overall solution.

The current code returns -1 for numberMatched under some conditions; see my other thread
about the need to actually return “unknown” if the count is not known.

I’ve prepared a pull request that should fix that issue for most cases in which we return -1 now,
but of course, more work will need to be done to fix the case where the store actually does not know:

https://github.com/geoserver/geoserver/pull/502

Basically the patch uses the approach Justin suggested in a comment, doing the total count later
and only if necessary: it builds a list of CountExecutor(s), collected as we fetch the feature
collections, and defers to the end the decision on whether we really need to do a total count.
If “count” (featuresReturned) is less than maxFeatures, then we don’t have to compute anything;
regardless, we try to keep around the result of the count for the first feature type,
to avoid that extra call.

If instead we realize we’re returning fewer features than matched, or we have a base offset,
then we will have to do the total count by running the count executors.
This should give us good performance in the common case, and correctness in the general one.
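
The deferred count described above can be sketched roughly as follows. This is only an
illustration, not the code in the pull request: the class DeferredCount, the method
numberMatched and the use of LongSupplier as a stand-in for a CountExecutor are all
hypothetical names, and returning -1 for a store that cannot count is an assumption
that mirrors the current behavior.

```java
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical sketch: LongSupplier stands in for a CountExecutor.
class DeferredCount {

    /**
     * @param returned    number of features actually written in the response
     * @param maxFeatures limit applied to the request
     * @param offset      base offset (startIndex) of the request
     * @param counters    one deferred count per fetched feature collection
     * @return total matched features, or -1 if any store cannot count (assumption)
     */
    static long numberMatched(long returned, long maxFeatures, long offset,
                              List<LongSupplier> counters) {
        // Common case: no offset and the limit was not reached,
        // so what we returned is everything that matched. No extra query.
        if (offset == 0 && returned < maxFeatures) {
            return returned;
        }
        // General case: run the deferred counts and sum them up.
        long total = 0;
        for (LongSupplier counter : counters) {
            long n = counter.getAsLong();
            if (n < 0) {
                return -1; // the store does not know the count
            }
            total += n;
        }
        return total;
    }
}
```

The point of collecting the counters up front is that they cost nothing until they are
actually invoked, so the cheap path never touches the store a second time.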

Implementing it I’ve also noticed that we try to avoid doing the count itself for WFS 1.0, assuming
there is a single feature type requested, and looking at the WFS 1.1 schemas, I’ve noticed that
we could extend the optimization to WFS 1.1 too, since numberOfFeatures is optional:

<xsd:attribute name="numberOfFeatures"
               type="xsd:nonNegativeInteger"
               use="optional"/>

Shall we do that? One less query :)
The same cannot apply to WFS 2.0, where both numberMatched and numberReturned
are mandatory, and numberReturned is a “nonNegativeInteger”.
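
To make the difference between the two versions concrete, here is a small sketch of how
the count attributes could be written out. The class CountAttributes and its methods are
made up for this example and are not GeoServer code; the only things taken from the
schemas are that numberOfFeatures is optional in WFS 1.1, and that in WFS 2.0 both
attributes are required, with “unknown” as the value for a match count that is not known.

```java
import java.util.OptionalLong;

// Hypothetical helper, for illustration only.
class CountAttributes {

    // WFS 1.1: numberOfFeatures is optional, so it is simply left out
    // when no count was computed.
    static String wfs11(OptionalLong numberOfFeatures) {
        return numberOfFeatures.isPresent()
                ? " numberOfFeatures=\"" + numberOfFeatures.getAsLong() + "\""
                : "";
    }

    // WFS 2.0: both attributes are mandatory; a match count that is
    // not known is written as "unknown" instead of a number.
    static String wfs20(OptionalLong numberMatched, long numberReturned) {
        String matched = numberMatched.isPresent()
                ? Long.toString(numberMatched.getAsLong())
                : "unknown";
        return " numberMatched=\"" + matched + "\""
                + " numberReturned=\"" + numberReturned + "\"";
    }
}
```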

The ticket also talks about the hits behavior for WFS 1.0 and 1.1, and suggests doing a total
count regardless (basically, behaving as if the “ignoreMaxFeaturesHits” flag were always on).
Reading the specs I don’t agree and believe we should maintain our current behavior,
which allows the admin to configure how things should behave.

Here are the salient bits out of the wfs 1.1 spec, when it comes to describing how
WFS 1.

== Our support, Your Success! Visit http://opensdi.geo-solutions.it for more information ==

Ing. Andrea Aime

@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


Hem...
damn shortcuts, sent the mail by accident :)

I was saying, here are the salient bits out of the WFS 1.1 spec, when it
comes to describing how hits mode should work:

1) The value hits indicates that a web feature service should process the
GetFeature request and rather than return the entire result set, it should
*simply indicate the number of feature instance of the requested feature
type(s) that satisfy the request.*

2) If, however, the value of the resultType attribute is specified as hits,
a web feature service must generate a <wfs:FeatureCollection> element with
no content (i.e. empty) but must populate the values of the timeStamp
attribute and the numberOfFeatures attribute. *In this way a client may
obtain a count of the number of features that a query would return without
having to incur the cost of transmitting the entire result set.*

The first could be understood both ways, but the second is clear: we should
count the features that _would be returned_ in results mode, so maxFeatures
should be taken into account (and it also means that enabling the "ignore
max features in hits mode" flag puts the WFS server against the spec, even
if it's handy to have it behave that way).
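
As a tiny sketch of the two behaviors (the class and method names here are invented for
the example, only the flag name comes from the configuration discussed above):

```java
// Hypothetical sketch of the hits-mode count under the two settings.
class HitsCount {

    /**
     * @param totalMatched          features matching the query, ignoring any limit
     * @param maxFeatures           limit that would apply in results mode
     * @param ignoreMaxFeaturesHits admin flag: report the full count in hits mode
     */
    static long hitsCount(long totalMatched, long maxFeatures,
                          boolean ignoreMaxFeaturesHits) {
        if (ignoreMaxFeaturesHits) {
            // Handy, but not what the second spec excerpt describes.
            return totalMatched;
        }
        // Count what results mode would actually return.
        return Math.min(totalMatched, maxFeatures);
    }
}
```

So with 1000 matching features and maxFeatures set to 50, hits mode reports 50 by
default and 1000 with the flag enabled.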

Anyways... no harm done imho, each administrator is free to decide how to
make it behave, so I'd just leave things as they are.

And this time, I'm really done :)

Feedback?

Cheers
Andrea


-------------------------------------------------------

On Thu, Feb 27, 2014 at 6:29 PM, Andrea Aime
<andrea.aime@anonymised.com> wrote:

> Feedback?

Went ahead and merged the pull request

Cheers
Andrea


-------------------------------------------------------