Hi, I see a problem with my proposal. For long GetFeature requests, two “JDBCFeatureSource::getReaderInternal()” calls are executed, which is much worse than one “JDBCFeatureSource::getCountInternal()” call plus one “JDBCFeatureSource::getReaderInternal()” call.
The best option would be to keep the cache directly in the shapefile provider.
Do you agree?
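A minimal sketch of the trade-off described above, using a hypothetical in-memory stand-in rather than the real JDBCFeatureSource. The class `CountingSource`, its fields, and the two strategy methods are assumptions made for illustration; they only tally how many count queries and readers each strategy would need.

```java
import java.util.Iterator;
import java.util.stream.IntStream;

/** Hypothetical stand-in for a feature source that tallies how it is used. */
final class CountingSource {
    private final int size;
    int countCalls;     // count queries issued
    int readerCalls;    // readers opened
    long featuresRead;  // features actually pulled from a reader

    CountingSource(int size) {
        this.size = size;
    }

    /** Stands in for a cheap count query. */
    int getCount() {
        countCalls++;
        return size;
    }

    /** Stands in for opening a reader over the full result. */
    Iterator<Integer> getReader() {
        readerCalls++;
        Iterator<Integer> it = IntStream.range(0, size).iterator();
        return new Iterator<Integer>() {
            public boolean hasNext() { return it.hasNext(); }
            public Integer next() { featuresRead++; return it.next(); }
        };
    }

    /** Current behavior: one count query, then one full read. */
    static void countThenRead(CountingSource s) {
        s.getCount();
        Iterator<Integer> it = s.getReader();
        while (it.hasNext()) it.next();
    }

    /** Capped cache: buffer up to limit features; on overflow, reopen and read everything. */
    static void cacheThenRead(CountingSource s, int limit) {
        Iterator<Integer> it = s.getReader();
        int buffered = 0;
        while (it.hasNext() && buffered < limit) {
            it.next();
            buffered++;
        }
        if (it.hasNext()) { // result is larger than the cap: the buffer is useless
            Iterator<Integer> full = s.getReader();
            while (full.hasNext()) full.next();
        }
    }
}
```

With 10,000 features and a cap of 1,000, `cacheThenRead` opens two readers and pulls 11,000 features, while `countThenRead` issues one count and pulls 10,000. With 100 features, `cacheThenRead` needs a single reader and no count.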
From: A Huarte ahuarte47@anonymised.com
To: Andrea Aime andrea.aime@anonymised.com
CC: “geoserver-devel@lists.sourceforge.net” geoserver-devel@lists.sourceforge.net
Sent: Friday, November 6, 2015 9:09
Subject: Re: [Geoserver-devel] [NEW FEATURE] New configuration and memory caching of features in WFS GetFeature requests.
To clarify my comments, here is a bit of code:
https://github.com/geoserver/geoserver/compare/master…ahuarte47:GEOS-7296_fix_read_twice

Alvaro
From: A Huarte ahuarte47@anonymised.com
To: Andrea Aime <andrea.aime@anonymised.com…>
CC: “geoserver-devel@lists.sourceforge.net” geoserver-devel@lists.sourceforge.net
Sent: Friday, November 6, 2015 9:04
Subject: Re: [Geoserver-devel] [NEW FEATURE] New configuration and memory caching of features in WFS GetFeature requests.
Andrea Aime wrote:
Mind, your observation about reading the data twice is true only for shapefiles; a proper spatial database can count much faster without needing to actually load all the data.
Hi Andrea, you are right. I verified the same behavior with PostGIS layers: each WFS GetFeature request (v1.1/2.0) executes two SQL statements against the database, similar to the shapefile case:
We agree that PostGIS, or any modern database, is fast, but IMHO each GetFeature request wastes database resources, and performance, when it handles a small number of records (10, 100, 1000?). Currently, the initial count and the subsequent full query are always executed. We could reduce this to a single request, with no overhead, by caching the data when the result contains only a few records; otherwise we can keep the current behavior to avoid overloading the RAM.
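The idea of caching only small results can be sketched as follows. This is an illustration under stated assumptions, not the code of the pull request: `BoundedFeatureCache` and `tryCache` are hypothetical names, and a plain `Iterator` stands in for a feature reader.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper, not part of GeoServer or GeoTools. */
final class BoundedFeatureCache {

    private BoundedFeatureCache() {}

    /**
     * Buffers at most maxCached items from the source. If the source is
     * exhausted within that limit, the buffered list is returned, so both the
     * count and the later encoding pass can be served from memory. If the
     * source still has items once the buffer is full, an empty Optional is
     * returned and the caller should fall back to count-then-read.
     */
    static <T> Optional<List<T>> tryCache(Iterator<T> source, int maxCached) {
        if (maxCached < 0) {
            throw new IllegalArgumentException("maxCached must not be negative");
        }
        List<T> buffer = new ArrayList<>();
        while (source.hasNext()) {
            if (buffer.size() == maxCached) {
                return Optional.empty(); // result exceeds the limit: do not cache
            }
            buffer.add(source.next());
        }
        return Optional.of(buffer);
    }
}
```

When the returned `Optional` is present, the list size is the feature count and the list itself can be iterated for output, so the data source is visited once. When it is empty, nothing is kept in memory beyond the discarded buffer.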
Alvaro
From: Andrea Aime andrea.aime@anonymised.com
To: A Huarte ahuarte47@anonymised.com
CC: “geoserver-devel@lists.sourceforge.net” geoserver-devel@lists.sourceforge.net
Sent: Thursday, November 5, 2015 12:04
Subject: Re: [Geoserver-devel] [NEW FEATURE] New configuration and memory caching of features in WFS GetFeature requests.
On Thu, Nov 5, 2015 at 11:52 AM, A Huarte <ahuarte47@anonymised.com> wrote:
Hi, I would like to propose this new feature to solve the issue https://osgeo-org.atlassian.net/browse/GEOS-7296
Currently, WFS 1.1/2.0 requests force the features to be read from the data source (and the filter to be evaluated) twice, that is, once each time a new iterator is created and visited.
This issue is especially problematic with big data sources or complex/heavy queries.
Mind, your observation about reading the data twice is true only for shapefiles; a proper spatial database can count much faster without needing to actually load all the data.
I propose discussing a new configurable capability to cache results from the data provider in order to avoid this.
A good place to cache the results could be:
https://github.com/geoserver/geoserver/blob/master/src/wfs/src/main/java/org/geoserver/wfs/GetFeature.java#L530
This pull request https://github.com/geoserver/geoserver/pull/1321 implements the fix, but it still needs a new setting in the WFS admin panel.
The change as written is not acceptable, imho. There must be a size limit for the cached collection, like "cache at most 1000 features",
where 1000 is the configurable bit.
No one in their right mind would want to cache all the results all the time. It would fit only a very narrow use case in which all data sources
are small or there is much more RAM than data, and traffic is very low (each request keeps its own result in memory).
Typical installations have a mix of small and large data sources, with small and large data extractions.
The usefulness is also debatable in general terms… if I have a database, do I really want to read and cache up to 1000 features from the db
just to replace the first count operation? Thinking about it, it seems to me this would be useful only on a per-dataset basis,
configured manually only for those datasets that cannot do a fast count.
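A per-dataset setting of this kind could look roughly like the sketch below. `FeatureCacheSettings`, its methods, and the layer names in the usage note are hypothetical; GeoServer has no such class. Caching is off by default and is enabled layer by layer, each with its own limit.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-layer cache setting, for illustration only. */
final class FeatureCacheSettings {
    private final Map<String, Integer> limitByLayer = new HashMap<>();

    /** Enables caching for one layer, up to maxFeatures features per request. */
    void enable(String layerName, int maxFeatures) {
        if (maxFeatures <= 0) {
            throw new IllegalArgumentException("maxFeatures must be positive");
        }
        limitByLayer.put(layerName, maxFeatures);
    }

    /** Returns the cache limit for a layer; 0 means caching is off (the default). */
    int limitFor(String layerName) {
        return limitByLayer.getOrDefault(layerName, 0);
    }
}
```

For example, after `enable("parcels_shp", 1000)`, `limitFor("parcels_shp")` returns 1000, while a database-backed layer that was never enabled returns 0 and keeps the count-then-read behavior.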
Cheers
Andrea
Ing. Andrea Aime
@geowolf
Technical Lead
GeoSolutions S.A.S.