[Geoserver-devel] Reducing the number of datastore interactions for small GetFeature requests

Hi,
in the current WFS architecture GeoServer is set up to “favour” large requests,
with a fully streaming protocol that can serve back a basically limitless
amount of GML without being memory bound… at most, we are
disk bound in certain output formats (shape-zip, where we have to write the
shapefile fully before compressing it).

To make that possible we perform a feature count before actually fetching the
data, since many output formats report the count at the beginning of the response.
Only then do we fetch the data itself, so each GetFeature request hits the store twice.
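
In code terms the flow looks roughly like this (a simplified sketch, not the
actual encoder code; writeHeader/writeFeature stand in for the output format
callbacks):

    import org.geotools.data.Query;
    import org.geotools.data.simple.SimpleFeatureCollection;
    import org.geotools.data.simple.SimpleFeatureIterator;
    import org.geotools.data.simple.SimpleFeatureSource;

    // Simplified sketch of today's two-pass flow: one store hit for the
    // count, a second one for the actual data.
    void encode(SimpleFeatureSource source, Query query) throws Exception {
        SimpleFeatureCollection fc = source.getFeatures(query);

        int count = fc.size();  // first hit (a SELECT COUNT(*) on JDBC stores)
        writeHeader(count);     // the count goes at the top of the response

        SimpleFeatureIterator it = fc.features();  // second hit
        try {
            while (it.hasNext()) {
                writeFeature(it.next());
            }
        } finally {
            it.close();
        }
    }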

Now, for small result sets this is overkill: we could theoretically just
fetch the data once, store it in memory, count it, and then use it for encoding.
Just one request, and in some cases that is a significant speedup:

  • Complex database views/SQL views doing heavy computation, where the
    output fetching time is of little consequence
  • Custom data stores that are likewise doing heavy computation to figure
    out which features to provide as part of the results
  • Also see the ArcSDE slowness issue recently reported by Martin Davis
    on the users list, where running a count is systematically slower than
    actually fetching the data itself

With this in mind, it would be nice to have some smarts in WFS so that
we hit the data store only once for small results.
The thing is, we don’t know whether the result is small until we fetch it :-)

I’ve hashed out a few ideas, all based on the notion that when asked
for the size, we should really try to fetch the data instead,
and do something “special” only if we find we are loading too much of it.

1) Fetch and fall back on count

When the feature collection is asked for its size, start fetching data
instead, counting as we go, up to a certain limit (e.g., 1000 features);
if the data ends before we hit the limit, return the count and keep the
result aside in memory, waiting for the features() call.

If we hit the limit, keep that iterator open, run a normal size(), and
when features() is called, first replay the features we already have in
memory, then continue scrolling the cached iterator.
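
In pseudo-Java, the size() side of option 1 would look more or less like
this (just a sketch, field names made up):

    // Option 1's size(): fetch and count up to LIMIT; small results stay
    // in memory, while past the limit we keep the iterator open and run a
    // real count, which is exactly what doubles the connection usage.
    static final int LIMIT = 1000;

    List<SimpleFeature> cached;          // small results, kept for features()
    List<SimpleFeature> partial;         // head of a large result
    SimpleFeatureIterator openIterator;  // still-open cursor for a large result

    public int size() {
        List<SimpleFeature> head = new ArrayList<SimpleFeature>();
        SimpleFeatureIterator it = delegate.features();  // first connection
        while (it.hasNext() && head.size() < LIMIT) {
            head.add(it.next());
        }
        if (!it.hasNext()) {
            it.close();
            cached = head;           // small result, replayed by features()
            return head.size();
        }
        partial = head;              // over the limit: remember what we have
        openIterator = it;           // and keep scrolling it in features()...
        return delegate.size();      // ...while counting on a second connection
    }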

This one seems nice, but it’s a no-go: it doubles the number of database
connections needed to answer a single GetFeature request, which makes it
deadlock prone once we reach the connection pool size limit.

2) Fetch and fall back on count v2

Same as above, but if we hit the limit, close the iterator,
do a normal count, and when features() is called,
start the fetch again.

No deadlock issues, but for requests over the limit we are now
looking at 3 requests to the store instead of 2, making those requests
slower than they are today.
Arguably this will make all large requests slower, but the ones really
affected will be those around the threshold.

3) Fetch and store

Same as above, but if we go past the limit, we continue fetching
and store all the features on disk using the fast serializer we have
in GeoTools for the merge/sort algorithm, then read them back
from disk when features() is called.
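
Continuing from the counting loop sketched under option 1, the overflow
path would be something like this (again only a sketch; FeatureDumper
stands in for the GeoTools merge/sort feature serializer, whose actual
class name and API may differ):

    // Option 3's overflow path: past the limit, keep scrolling the same
    // iterator (still a single store request) and spill everything,
    // including the already buffered head, to a temporary file.
    File spill = File.createTempFile("getfeature", ".features");
    int count = 0;
    FeatureDumper dumper = new FeatureDumper(spill, getSchema());
    try {
        for (SimpleFeature f : head) {  // what we buffered while counting
            dumper.write(f);
            count++;
        }
        while (it.hasNext()) {          // the rest of the result set
            dumper.write(it.next());
            count++;
        }
    } finally {
        dumper.close();
        it.close();
    }
    // size() returns count; features() later streams the file back from disk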

Of the three approaches I like 3) best, as its dynamics are
similar to the shapefile output format’s (so, nothing new really).
The one downside of 3) is that it limits this optimization to
simple features.

I’m also wondering whether this should be the default behavior, something
configurable at the WFS config level, or set on a layer-by-layer basis.

Now… I’m looking at this from the point of view of a customer project, while
making an effort to create something that would be useful to the community
at large, but I see there are trade-offs that might prevent it from being a
good general solution.

So, I’m soliciting your feedback. I have a (short) window to try and work on
this, so quick feedback is very much appreciated (and slow feedback too, with
the notion that it will be useful anyway to readers trying to approach this
issue in the future, in case I don’t make it now).

Cheers
Andrea



Sorry, my reply went just to Andrea. The idea is based on a partial buffer strategy:

If this is only for GML…
Start writing the GML out into memory (or disk), marking down where the size needs to be written in the file. When you have finished writing everything out, double back (in memory, or in a random access file) and update the size in place.
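
As a toy example with plain JDK I/O (encodeGML() is made up, and a real
encoder would need to be careful with character encodings and buffering):

    import java.io.RandomAccessFile;

    // Reserve a fixed-width placeholder for the count, stream the features,
    // then seek back and patch the placeholder in place.
    RandomAccessFile raf = new RandomAccessFile(file, "rw");
    try {
        raf.writeBytes("<wfs:FeatureCollection numberOfFeatures=\"");
        long countPos = raf.getFilePointer();
        raf.writeBytes("          \">\n");  // ten blanks, overwritten below

        long count = 0;
        SimpleFeatureIterator it = fc.features();
        try {
            while (it.hasNext()) {
                raf.writeBytes(encodeGML(it.next()));
                count++;
            }
        } finally {
            it.close();
        }
        raf.writeBytes("</wfs:FeatureCollection>\n");

        raf.seek(countPos);
        raf.writeBytes(String.format("%-10d", count));  // patch in place; any
                                                        // leftover blanks are
                                                        // harmless whitespace
    } finally {
        raf.close();
    }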

Turns out he is looking for more than just GML, as we have getCount() scattered across a number of workflows.

I am still not quite sure how to handle this one; it is very similar to the caching-a-feature-source case described in our quick start. I am a bit hesitant to make more work for datastore authors and am trying to come up with an external solution. I know Andrea enjoys writing things out to disk, as per the shapefile sorting solution (as used for “Fetch and store”).

Q: I am going to guess that any of the three suggestions will be implemented as a FeatureCollection wrapper? Or, now that all the stores are ContentDataStore based, did you want to handle this in the base class?



Jody Garnett

On Thu, Apr 9, 2015 at 7:52 PM, Jody Garnett <jody.garnett@anonymised.com> wrote:


Q: I am going to guess that any of the three suggestions will be implemented as a FeatureCollection wrapper? Or, now that all the stores are ContentDataStore based, did you want to handle this in the base class?

I was thinking of going with a wrapper, yep
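
Roughly along these lines (only a sketch; I’m assuming GeoTools’
DecoratingSimpleFeatureCollection, and for brevity the over-the-limit path
falls back to a plain count and re-fetch, as in option 2):

    import java.util.ArrayList;
    import java.util.List;

    import org.geotools.data.collection.ListFeatureCollection;
    import org.geotools.data.simple.SimpleFeatureCollection;
    import org.geotools.data.simple.SimpleFeatureIterator;
    import org.geotools.feature.collection.DecoratingSimpleFeatureCollection;
    import org.opengis.feature.simple.SimpleFeature;

    // size() fetches and counts up to "limit", caching small results so
    // that features() does not have to hit the store a second time.
    class CountingFeatureCollection extends DecoratingSimpleFeatureCollection {

        final int limit;
        List<SimpleFeature> head;  // non-null only for small results

        protected CountingFeatureCollection(SimpleFeatureCollection delegate, int limit) {
            super(delegate);
            this.limit = limit;
        }

        public int size() {
            List<SimpleFeature> buffer = new ArrayList<SimpleFeature>();
            SimpleFeatureIterator it = delegate.features();
            try {
                while (it.hasNext() && buffer.size() <= limit) {
                    buffer.add(it.next());
                }
            } finally {
                it.close();
            }
            if (buffer.size() > limit) {
                return delegate.size();  // too big: fall back to a plain count
            }
            head = buffer;
            return head.size();
        }

        public SimpleFeatureIterator features() {
            if (head != null) {  // small result, replay from memory
                return new ListFeatureCollection(getSchema(), head).features();
            }
            return delegate.features();  // large result, fetch again
        }
    }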

Cheers
Andrea


-------------------------------------------------------

For sorting, the wrapper is applied by the base class. The advantage of doing it there is that the datastore implementation gets to decide whether wrapping is beneficial; but as you outline, it is hard to tell when that is the case.

As with any cache, there is a small danger of lag between the initial data access and subsequent feature access. In this case I kind of like the disk-based cache solution, as the response is built up around a result that is now stored offline.



Hi all,

Sorry for a slight diversion, but this is related to counting features for WFS responses: is there a way, in GeoServer configuration, to set an upper limit on the size of the feature collection returned by a GetFeature request? I know that users can use the maxFeatures parameter, but in many cases it would make sense to return an exception if a query would return too many features.

This is also related to the broader security issue of limiting WFS queries that are estimated to be too costly for the server or the database to execute. It’s all too easy to run a DoS attack against a WFS server if there is no limit on how expensive GetFeature requests are allowed to be. Making this work well would of course require something similar to the query cost estimation implemented in database management systems, but it would be nice if the solution were generic enough to work with non-DB data sources too.

Regards,

Ilkka Rinne


On Fri, Apr 10, 2015 at 10:01 AM, Ilkka Rinne <ilkka.rinne@anonymised.com> wrote:

Hi all,

Sorry for a slight diversion, but this is related to counting features for WFS responses: is there a way, in GeoServer configuration, to set an upper limit on the size of the feature collection returned by a GetFeature request? I know that users can use the maxFeatures parameter, but in many cases it would make sense to return an exception if a query would return too many features.

If I understand the specification right, that would be against the WFS requirements; I don’t see anywhere in the spec an allowance for the server to return an exception in that case. The spec just allows returning at most a maximum number of features, and from there on the client must page (in terms of WFS 2.0; before that, there was nothing the client could do besides maybe rolling some sort of spatial paging approach).
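
For reference, a WFS 2.0 paged GetFeature looks like this on the client side
(count and startIndex are the spec’s paging parameters; the type name is just
an example):

    http://example.com/geoserver/wfs?service=WFS&version=2.0.0
        &request=GetFeature&typeNames=topp:states
        &count=100&startIndex=200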

This is also related to the broader security issue of limiting WFS queries that are estimated to be too costly for the server or the database to execute. It’s all too easy to run a DoS attack against a WFS server if there is no limit on how expensive GetFeature requests are allowed to be. Making this work well would of course require something similar to the query cost estimation implemented in database management systems, but it would be nice if the solution were generic enough to work with non-DB data sources too.

Yes, I agree that is also very much a problem, but again, I believe such a server would not be spec compliant. What you can do is limited: you can list which filters you support, generically, whilst what you’d actually need to know is which fields are indexed, which joins (WFS 2.0) are expensive, and so on.

In general, I guess we’re missing a call to check whether a filter is going to be accepted, so that the client can figure out a different strategy… downside, that would make it very hard to write a generic client…

Cheers
Andrea

