[Geoserver-devel] jdbcconfig performance

Hello list,

The jdbcconfig module seriously slows down request handling. I discovered that it sends dozens of SQL queries to the database per request, even though jdbcconfig uses a cache.

It turns out that the cache is keyed only by id, but most catalog lookups are actually by name. For each OWS request, the layer, workspace, namespace, resource, etc. are all looked up in the catalog multiple times by name, resulting in many duplicated SQL queries.

So my suggestion would be to store catalog objects in the cache not only by ID, but also by CATALOGINFOTYPE::NAME or something like that. Then we can use the cache in all of the get[CATALOGINFOTYPE]ByName methods.
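A minimal sketch of that idea, under stated assumptions: the class, the Info stand-in, the loader callback and the query counter are all hypothetical and are not taken from jdbcconfig. It keeps the existing id-keyed map and adds a "TYPE::name" index that points at ids, so a repeated lookup by name does not reach the database again.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not jdbcconfig code: a cache indexed by id and by "TYPE::name".
// Not thread-safe; a real implementation would need concurrent maps or locking.
class NameIndexedCache {
    /** Minimal stand-in for a catalog object; the real CatalogInfo types are richer. */
    static final class Info {
        final String id;
        final String type;
        final String name;

        Info(String id, String type, String name) {
            this.id = id;
            this.type = type;
            this.name = name;
        }
    }

    private final Map<String, Info> byId = new HashMap<>();
    // Secondary index: "TYPE::name" -> id, so each object is stored only once.
    private final Map<String, String> idByName = new HashMap<>();
    // Counts simulated database round trips, for illustration only.
    int dbQueries = 0;

    static String nameKey(String type, String name) {
        return type + "::" + name;
    }

    /** Returns the cached object for (type, name), calling the loader only on a miss. */
    Info getByName(String type, String name, Function<String, Info> loader) {
        String id = idByName.get(nameKey(type, name));
        if (id != null) {
            Info cached = byId.get(id);
            if (cached != null) {
                return cached;
            }
        }
        dbQueries++; // cache miss: this stands in for the SQL query
        Info loaded = loader.apply(name);
        if (loaded != null) {
            put(loaded);
        }
        return loaded;
    }

    /** Stores the object under its id and registers its name in the secondary index. */
    void put(Info info) {
        byId.put(info.id, info);
        idByName.put(nameKey(info.type, info.name), info.id);
    }
}
```

Mapping names to ids rather than to the objects themselves keeps a single cached copy per object, which leaves the id-keyed cache as the only place that holds state.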

Are there any objections and/or suggestions to this proposal?

Regards

Niels

On Thu, Jun 15, 2017 at 3:49 PM, Niels Charlier <niels@anonymised.com> wrote:

Hello list,

The jdbcconfig module seriously slows down request handling. I discovered
that it sends dozens of SQL queries to the database per request, even
though jdbcconfig uses a cache.

It turns out that the cache is keyed only by id, but most catalog lookups
are actually by name. For each OWS request, the layer, workspace, namespace,
resource, etc. are all looked up in the catalog multiple times by name,
resulting in many duplicated SQL queries.

Yes, all of the above is well known; it was already discussed in January
as part of GSIP 155 (partly by mail, partly during the PSC meeting, partly
on gitter).

So my suggestion would be to store catalog objects in the cache not only
by ID, but also by CATALOGINFOTYPE::NAME or something like that. Then we
can use the cache in all of the get[CATALOGINFOTYPE]ByName methods.

Are there any objections and/or suggestions to this proposal?

Right now there is a cache keyed by immutables (ids); you are adding a
cache keyed by mutables (I believe you need to cache by qualified name,
ws:name). So how are you going to spread the word across cluster nodes when
a layer or workspace name changes, or gets removed? Is the hz-cluster
covering this already?

Cheers
Andrea

--

GeoServer Professional Services from the experts! Visit
http://goo.gl/it488V for more information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via di Montramito 3/A
55054 Massarosa (LU)
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it

*AVVERTENZE AI SENSI DEL D.Lgs. 196/2003*

Le informazioni contenute in questo messaggio di posta elettronica e/o
nel/i file/s allegato/i sono da considerarsi strettamente riservate. Il
loro utilizzo è consentito esclusivamente al destinatario del messaggio,
per le finalità indicate nel messaggio stesso. Qualora riceviate questo
messaggio senza esserne il destinatario, Vi preghiamo cortesemente di
darcene notizia via e-mail e di procedere alla distruzione del messaggio
stesso, cancellandolo dal Vostro sistema. Conservare il messaggio stesso,
divulgarlo anche in parte, distribuirlo ad altri soggetti, copiarlo, od
utilizzarlo per finalità diverse, costituisce comportamento contrario ai
principi dettati dal D.Lgs. 196/2003.

The information in this message and/or attachments, is intended solely for
the attention and use of the named addressee(s) and may be confidential or
proprietary in nature or covered by the provisions of privacy act
(Legislative Decree June, 30 2003, no.196 - Italy's New Data Protection
Code).Any use not in accord with its purpose, any disclosure, reproduction,
copying, distribution, or either dissemination, either whole or partial, is
strictly forbidden except previous formal approval of the named
addressee(s). If you are not the intended recipient, please contact
immediately the sender by telephone, fax or e-mail and delete the
information in this message that has been received in error. The sender
does not give any warranty or accept liability as the content, accuracy or
completeness of sent messages and accepts no responsibility for changes
made after they were sent or for other risks which arise as a result of
e-mail transmission, viruses, etc.

-------------------------------------------------------

Hello Andrea,

Indeed, I missed this while searching for past discussions about jdbc performance. In the meantime I have read your proposal and the email discussion, and had a look at the PR. That looks like great stuff. But it is not entirely clear to me what effect your changes have on the number of name queries per request with jdbcconfig: are they significantly reduced? Perhaps my suggestion can still make a difference.

Yes, hz-cluster already covers a joint cache. Upon a change, the item would be invalidated based on the old name and recached with the new name.

Regards
Niels
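The invalidate-and-recache step could look roughly like the sketch below. It is illustrative only: the class and method names are hypothetical, and nothing here describes how hz-cluster actually propagates events. The assumption is that a change notification carries the immutable id, so each node can find and drop whatever name it had cached for that id.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: keeping a name -> id index consistent across renames and removals.
// Not thread-safe; shown single-threaded for clarity.
class RenameAwareNameIndex {
    private final Map<String, String> idByName = new HashMap<>();
    // Reverse map, so the old name can be found from the immutable id alone.
    private final Map<String, String> nameById = new HashMap<>();

    /** Caches (or recaches) an id under its current qualified name, e.g. "ws:name". */
    void cache(String id, String qualifiedName) {
        String oldName = nameById.put(id, qualifiedName);
        if (oldName != null && !oldName.equals(qualifiedName)) {
            // Rename: drop the stale entry, but only if it still points at this id.
            idByName.remove(oldName, id);
        }
        idByName.put(qualifiedName, id);
    }

    /** Evicts an id, for example after the object was removed from the catalog. */
    void evict(String id) {
        String oldName = nameById.remove(id);
        if (oldName != null) {
            idByName.remove(oldName, id);
        }
    }

    /** Returns the cached id for a qualified name, or null on a miss. */
    String idFor(String qualifiedName) {
        return idByName.get(qualifiedName);
    }
}
```

The two-argument Map.remove(key, value) guards against a name that has since been reused by a different object: the entry is removed only when it still maps to the id being invalidated.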

On Thu, Jun 15, 2017 at 4:23 PM, Niels Charlier <niels@anonymised.com> wrote:

Hello Andrea,

Indeed, I missed this while searching for past discussions about jdbc
performance. In the meantime I have read your proposal and the email
discussion, and had a look at the PR. That looks like great stuff. But it is
not entirely clear to me what effect your changes have on the number of name
queries per request with jdbcconfig: are they significantly reduced?
Perhaps my suggestion can still make a difference.

I just investigated why JDBCConfig was so much slower than the latest
DefaultCatalogFacade when doing OGC requests, and found the same (too many
queries to go from name to ID).
But the GSIP 155 changes did not touch jdbcconfig at all.

On 15-06-17 16:02, Andrea Aime wrote:

Right now there is a cache keyed by immutables (ids); you are adding a
cache keyed by mutables (I believe you need to cache by qualified name,
ws:name). So how are you going to spread the word across cluster nodes when
a layer or workspace name changes, or gets removed? Is the hz-cluster
covering this already?

Yes, hz-cluster already covers a joint cache. Upon a change, the item
would be invalidated based on the old name and recached with the new name.

Then I guess you're good to go

Cheers
Andrea

-------------------------------------------------------

I understand that, but looking at the PR I thought maybe by reducing queries to the catalog in general, you had also reduced queries to the jdbcconfig database to some extent.

In any case, I didn’t mean to imply I was the first to discover this problem in jdbcconfig; I just happened to come across it as well while analysing the performance of a particular deployment.

Regards
Niels
