[Geoserver-users] geoserver hit counts > server max features [SEC=UNCLASSIFIED]

Hi List,

I'm running a geoserver instance with a 10K max feature limit. A colleague of mine is interested in obtaining feature counts for a datasource containing 30K+ features.

He can find this information by making multiple requests with resultType=hits and using the built-in geoserver paging, but it would be a lot nicer if he could make a single request and get an accurate count, with the server ignoring the configured maximum feature limit when counting.
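For reference, the multi-request workaround described above could be sketched roughly as follows. This is only an illustration, not a tested client: it assumes WFS 1.1.0 responses that carry a numberOfFeatures attribute, assumes the server honours startIndex on resultType=hits requests, and takes the HTTP call as an injected `fetch` function. The function names and the base URL in the usage note are made up for the example.

```python
import re
from typing import Callable
from urllib.parse import urlencode


def hits_url(base_url: str, type_name: str, start_index: int, page_size: int) -> str:
    """Build a GetFeature URL that asks only for a hit count of one page."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,
        "resultType": "hits",
        "startIndex": start_index,  # paging offset (assumed to apply to hits)
        "maxFeatures": page_size,
    }
    return base_url + "?" + urlencode(params)


def parse_hits(xml_text: str) -> int:
    """Read the numberOfFeatures attribute from a WFS 1.1.0 hits response."""
    match = re.search(r'numberOfFeatures="(\d+)"', xml_text)
    if match is None:
        raise ValueError("no numberOfFeatures attribute in response")
    return int(match.group(1))


def count_features(fetch: Callable[[str], str], base_url: str,
                   type_name: str, page_size: int) -> int:
    """Sum per-page hit counts until a page comes back short of page_size."""
    total = 0
    start = 0
    while True:
        page = parse_hits(fetch(hits_url(base_url, type_name, start, page_size)))
        total += page
        if page < page_size:
            return total
        start += page_size
```

With a 10K server limit, `count_features(fetch, "http://example.com/geoserver/wfs", "ns:layer", 10000)` would issue four requests for a 30K+ layer, which is the overhead a single uncapped count would avoid.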

Is this something that's easy to change? Would there be any interest in a patch to allow this?

Thanks,
Geoff

On Tue, Jan 22, 2013 at 6:53 AM, Geoff Williams <G.Williams2@anonymised.com…> wrote:

> Hi List,
>
> I’m running a geoserver instance with a 10K max feature limit. A colleague of mine is interested in obtaining feature counts for a datasource containing 30K+ features.
>
> He can find this information by making multiple requests with resultType=hits and using the built-in geoserver paging but it would be a lot nicer if he could just make one request and get an accurate count with the server ignoring the configured maximum feature limit for the purposes of an accurate feature count.

I see. The limit is working on count too for good reasons:

  • counting can be expensive (can take minutes) if you have datasets with tens or hundreds of millions of records
  • the filter expressed in the WFS request can make it much worse, for example, you might filter on an
    attribute that’s not indexed, or use WFS 2.0 to ask for an expensive join

> Is this something that’s easy to change? Would there be any interest in a patch to allow this?

I believe it could be of interest, but it would have to be configurable, either a separate max value
for the count operation or a flag disabling the existing max for the count operation.
The classes to be modified would be WFSInfo, WFSInfoImpl (the configuration bits), the WFSAdminPage,
and the GetFeature class to take into account the above configurations.
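The two configuration options mentioned above can be illustrated with a small sketch of the decision logic. It is written in Python purely for illustration (the real classes are Java), and every name in it is hypothetical: neither `hits_ignore_max_features` nor `hits_max_features` is an existing GeoServer setting. The proposal is for one option or the other; both are shown together only to compare them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WfsLimits:
    max_features: int                         # the existing global limit
    hits_ignore_max_features: bool = False    # hypothetical flag option
    hits_max_features: Optional[int] = None   # hypothetical separate cap


def effective_limit(limits: WfsLimits, result_type: str) -> Optional[int]:
    """Return the cap to apply to a request, or None for no cap."""
    if result_type != "hits":
        # Ordinary GetFeature requests keep the existing limit.
        return limits.max_features
    if limits.hits_ignore_max_features:
        # Flag option: counts are never capped.
        return None
    if limits.hits_max_features is not None:
        # Separate-value option: counts get their own (larger) cap.
        return limits.hits_max_features
    # Default: unchanged behaviour, counts are capped like results.
    return limits.max_features
```

Keeping the default branch identical to the current behaviour means an administrator would have to opt in before potentially expensive counts become possible.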

Since this is an API change, it has to take place on the unstable series first (which, starting
from today, means the 2.4.x series; the 2.3.x series is entering hardening today in preparation
for the 2.3.0 release in a couple of months).

If you want to pursue this, please discuss the changes you want to make on the geoserver-devel
list.

Cheers
Andrea

==
Our support, Your Success! Visit http://opensdi.geo-solutions.it for more information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


Thanks Andrea,

I thought there might be a good reason for the limit. I ran the implications of this patch past my employer and, in this case, they've decided to sidestep the problem by just displaying

"more than $SERVER_LIMIT records" instead of an exact record count.

I can see a use for having an exact record count though - I might have a crack at doing this outside of working hours. I'll be in touch on the developers list.
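The client-side sidestep amounts to something like the sketch below. The function name is made up for illustration, and it assumes that a hit count equal to the server limit means the count was capped; a datasource holding exactly that many records would be mislabelled.

```python
def describe_count(hits: int, server_limit: int) -> str:
    """Format a hit count, falling back to a lower bound when it was capped."""
    if hits >= server_limit:
        # The server stops counting at its limit, so the true total is unknown.
        return f"more than {server_limit} records"
    return f"{hits} records"
```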

Cheers,
Geoff

________________________________
From: andrea.aime@anonymised.com [mailto:andrea.aime@anonymised.com] On Behalf Of Andrea Aime
Sent: Tuesday, 22 January 2013 19:22
To: Geoff Williams
Cc: geoserver-users@lists.sourceforge.net
Subject: Re: [Geoserver-users] geoserver hit counts > server max features [SEC=UNCLASSIFIED]

-------------------------------------------------------