[Geoserver-users] Performance of ECQL-Filter with up to 5000 IDs

Hi,

So do you think the database connection is the bottleneck? I would need some pointers on where to optimize, or at least on how to find the bottleneck (how can I do some profiling, benchmarking, or debugging to locate it?).

Or is it just normal that an ECQL-Filter on many IDs is very slow?

Best,

Jens

From: Nachtigall, Jens (init) [mailto:Jens.Nachtigall@…5799…]
Sent: Friday, 31 July 2015 16:29
To: geoserver-users@lists.sourceforge.net
Subject: Re: [Geoserver-users] Performance of ECQL-Filter with up to 5000 IDs

It’s Oracle.

From: andrea.aime@…84… [mailto:andrea.aime@…84…] On behalf of Andrea Aime
Sent: Friday, 31 July 2015 16:11
To: Nachtigall, Jens (init)
Cc: geoserver-users@lists.sourceforge.net
Subject: Re: [Geoserver-users] Performance of ECQL-Filter with up to 5000 IDs

On Fri, Jul 31, 2015 at 4:02 PM, Nachtigall, Jens (init) <Jens.Nachtigall@…5799…> wrote:

Hi,

I have a WMS request including a rather long ECQL-Filter that matches on feature IDs (as described in http://docs.geoserver.org/latest/en/user/tutorials/cql/cql_tutorial.html#id-and-list-comparisons).

The filter is like this:

IN ('mylayer.101453', ...up to 5000 IDs here... ,'mylayer.102486')

If there are only a few hundred IDs, the response takes about 500 ms, but with up to 5000 IDs it takes 15-30 seconds. Any ideas on how to optimize this? Two or three seconds would be acceptable, but half a minute is a bit too much.
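
To see how the response time grows with the number of IDs, the filter above can be generated for different list sizes and timed. The following is only a minimal sketch: `build_id_filter`, `time_filter`, and the `send` callback are hypothetical names, and `send` is assumed to be whatever function issues the WMS request with the filter (for example as the `CQL_FILTER` parameter).

```python
import time
from typing import Callable, Dict, Iterable


def build_id_filter(layer: str, numbers: Iterable[int]) -> str:
    """Build an ECQL feature-ID filter such as IN ('mylayer.1','mylayer.2')."""
    quoted = ",".join(f"'{layer}.{n}'" for n in numbers)
    return f"IN ({quoted})"


def time_filter(
    send: Callable[[str], object],
    layer: str,
    counts: Iterable[int],
    first_id: int = 101453,  # assumption: IDs are consecutive from here
) -> Dict[int, float]:
    """Time one call of `send` per ID-list size and return seconds per size."""
    timings: Dict[int, float] = {}
    for count in counts:
        cql = build_id_filter(layer, range(first_id, first_id + count))
        start = time.perf_counter()
        send(cql)  # hypothetical: performs the actual request
        timings[count] = time.perf_counter() - start
    return timings
```

Calling `time_filter(send, "mylayer", [100, 500, 1000, 5000])` once against GeoServer and once with the equivalent SQL run directly against the database would indicate on which side most of the time is spent.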

What is the datastore backing the request?

Cheers

Andrea

==

GeoServer Professional Services from the experts! Visit

http://goo.gl/it488V for more information.

==

Ing. Andrea Aime

@geowolf

Technical Lead

GeoSolutions S.A.S.

Via Poggio alle Viti 1187

55054 Massarosa (LU)

Italy

phone: +39 0584 962313

fax: +39 0584 1660272

mob: +39 339 8844549

http://www.geo-solutions.it

http://twitter.com/geosolutions_it

AVVERTENZE AI SENSI DEL D.Lgs. 196/2003

Le informazioni contenute in questo messaggio di posta elettronica e/o nel/i file/s allegato/i sono da considerarsi strettamente riservate. Il loro utilizzo è consentito esclusivamente al destinatario del messaggio, per le finalità indicate nel messaggio stesso. Qualora riceviate questo messaggio senza esserne il destinatario, Vi preghiamo cortesemente di darcene notizia via e-mail e di procedere alla distruzione del messaggio stesso, cancellandolo dal Vostro sistema. Conservare il messaggio stesso, divulgarlo anche in parte, distribuirlo ad altri soggetti, copiarlo, od utilizzarlo per finalità diverse, costituisce comportamento contrario ai principi dettati dal D.Lgs. 196/2003.

The information in this message and/or attachments, is intended solely for the attention and use of the named addressee(s) and may be confidential or proprietary in nature or covered by the provisions of privacy act (Legislative Decree June, 30 2003, no.196 - Italy’s New Data Protection Code).Any use not in accord with its purpose, any disclosure, reproduction, copying, distribution, or either dissemination, either whole or partial, is strictly forbidden except previous formal approval of the named addressee(s). If you are not the intended recipient, please contact immediately the sender by telephone, fax or e-mail and delete the information in this message that has been received in error. The sender does not give any warranty or accept liability as the content, accuracy or completeness of sent messages and accepts no responsibility for changes made after they were sent or for other risks which arise as a result of e-mail transmission, viruses, etc.


Hi Jens,

Without knowing anything about your system, it's probably a question of optimizing the SQL that results from the CQL. Consider things like indexing: is the database using any index? Also, a query that fetches specific IDs is probably not the best way to fetch thousands of features. Consider whether the query can be expressed better, or possibly whether the data needs to be remodelled.
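
One possible way to express such a query more compactly, sketched here under assumptions: if the numeric part of the feature ID is also available as an ordinary attribute (the name `obj_id` is hypothetical) and many of the requested IDs are consecutive, the list can be collapsed into a few `BETWEEN` ranges. Whether this helps depends on how the IDs are distributed.

```python
from typing import Iterable, List, Tuple


def to_ranges(ids: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse integers into sorted, inclusive (start, end) runs."""
    ranges: List[Tuple[int, int]] = []
    for value in sorted(set(ids)):
        if ranges and value == ranges[-1][1] + 1:
            # Extend the current run by one.
            ranges[-1] = (ranges[-1][0], value)
        else:
            # Start a new run.
            ranges.append((value, value))
    return ranges


def ranges_to_ecql(attribute: str, ranges: List[Tuple[int, int]]) -> str:
    """Turn the runs into an ECQL expression on a numeric attribute."""
    clauses = []
    for start, end in ranges:
        if start == end:
            clauses.append(f"({attribute} = {start})")
        else:
            clauses.append(f"({attribute} BETWEEN {start} AND {end})")
    return " OR ".join(clauses)
```

For instance, `ranges_to_ecql("obj_id", to_ranges(ids))` yields one clause per run of consecutive values instead of one entry per ID.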

Just a few loose ideas :)

/julian

-------------------------------------------------------

On Mon, Aug 3, 2015 at 9:59 AM, Nachtigall, Jens (init) <Jens.Nachtigall@anonymised.com> wrote:

Hi,

so do you think the database connection is the bottleneck? I would need
some pointer on where to optimize or at least how to find the bottleneck
(or how can I do some profiling/benchmarking/debugging to find the
bottleneck?)

Or is it just normal that an ECQL-Filter on many IDs is very slow?

I think that a SQL query with 5000 IDs is probably slow. Check your SQL
query by setting the logging level to "geotools developer logging".
I know we translate the filter to a long series of
"pk = firstValue or pk = secondValue or ...". A "pk in (v1, v2, ...)"
would be more compact, but any database worth its salt should realize
the two are equivalent at the query planner level (but with Oracle one
never knows).
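
The two SQL forms mentioned above can be compared directly against the database, outside GeoServer. The following is a rough sketch for generating both: the column name `pk` and the function names are hypothetical, the values are assumed to be integers, and the IN form is split into chunks because Oracle rejects IN lists with more than 1000 expressions (ORA-01795).

```python
from typing import List


def or_chain(column: str, ids: List[int]) -> str:
    """Build the 'pk = v1 OR pk = v2 OR ...' form."""
    return " OR ".join(f"{column} = {int(i)}" for i in ids)


def in_list(column: str, ids: List[int], chunk: int = 1000) -> str:
    """Build the compact IN form, at most `chunk` values per IN list."""
    parts = []
    for start in range(0, len(ids), chunk):
        values = ", ".join(str(int(i)) for i in ids[start:start + chunk])
        parts.append(f"{column} IN ({values})")
    return " OR ".join(parts)


# Assumption: 5000 consecutive IDs, only to compare the size of the two forms.
ids = list(range(101453, 101453 + 5000))
print(len(or_chain("pk", ids)), len(in_list("pk", ids)))
```

Running each generated WHERE clause by hand and looking at its execution plan would show whether the planner really treats the two forms the same way.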

Cheers
Andrea


-------------------------------------------------------