[Geoserver-users] jdbcconfig - performance issue with 11K layers

Hi,

I’ve been trying out the jdbcconfig module with a data catalog of about 11,500 layers (~1000 geotiffs, a few hundred shapefiles, and the rest in a PostGIS datastore). Immediately after the initial import of the data catalog, the responsiveness of Geoserver seems fine, for instance when accessing the list of layers from the administration interface and paging through results. However, after Geoserver is restarted, or if the configuration/catalog is reloaded from the admin interface, then it becomes far slower - taking over 2 minutes to display each page of layers from the admin interface.

I connected to the code running in my Tomcat instance remotely via Eclipse and it seems that this line in the ConfigDatabase class is where it takes so long to process:
https://github.com/geoserver/geoserver/blob/master/src/community/jdbcconfig/src/main/java/org/geoserver/jdbcconfig/internal/ConfigDatabase.java#232

The same line returns quickly immediately after the initial catalog import. But I haven’t yet been able to figure out why. Could it be that I simply have some setting misconfigured, or that there’s something wrong with my data catalog?

-Matt

This looks more like a limitation of the implementation. Just before the line you mention, the code has:

LOGGER.fine("Filter is not fully supported, doing scan of supported part to return the number of matches");

So I expect work is required to ensure that the database can handle the query. Are you using the default H2 database or PostgreSQL?

You can see where a similar issue was previously fixed in GEOS-5968 (http://jira.codehaus.org/browse/GEOS-5968).

Jody Garnett
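
To make the log message above concrete, here is a minimal sketch of why a count becomes expensive once part of a filter cannot be pushed down to the database. It is not the jdbcconfig code: `LayerRow`, `PartialFilterCount` and `CountResult` are hypothetical names, and an in-memory list stands in for the database table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for one stored catalog entry; not a GeoServer class. */
class LayerRow {
    final String name;
    final boolean advertised;

    LayerRow(String name, boolean advertised) {
        this.name = name;
        this.advertised = advertised;
    }
}

/** Illustrative only: an in-memory list plays the role of the database table. */
class PartialFilterCount {

    /** A count plus the number of rows that had to be loaded into Java to get it. */
    static final class CountResult {
        final int matches;
        final int rowsLoaded;

        CountResult(int matches, int rowsLoaded) {
            this.matches = matches;
            this.rowsLoaded = rowsLoaded;
        }
    }

    /**
     * @param sqlPart      the part of the filter the database can evaluate (a WHERE clause)
     * @param inMemoryPart the part that cannot be turned into SQL, or null if there is none
     */
    static CountResult count(List<LayerRow> table,
                             Predicate<LayerRow> sqlPart,
                             Predicate<LayerRow> inMemoryPart) {
        if (inMemoryPart == null) {
            // Fully supported filter: comparable to SELECT COUNT(*) ... WHERE ...
            // The database returns one number and no rows reach the application.
            int matches = 0;
            for (LayerRow row : table) {
                if (sqlPart.test(row)) {
                    matches++;
                }
            }
            return new CountResult(matches, 0);
        }

        // Partially supported filter: fetch every row matching the supported part,
        // then evaluate the remainder one object at a time in Java.
        List<LayerRow> candidates = new ArrayList<>();
        for (LayerRow row : table) {
            if (sqlPart.test(row)) {
                candidates.add(row);
            }
        }
        int matches = 0;
        for (LayerRow row : candidates) {
            if (inMemoryPart.test(row)) {
                matches++;
            }
        }
        return new CountResult(matches, candidates.size());
    }
}
```

In this model, a count over 11,500 entries whose only restriction sits in the unsupported part would load all 11,500 entries just to produce one number. The sketch only illustrates the cost of the fallback path; it does not explain why the behaviour would differ before and after a restart.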

Geoserver-users mailing list
Geoserver-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-users

Thanks Jody,

I have tried both the PostgreSQL and H2 databases, with the same result. What's odd is that the same line of code is executed both when Geoserver initially responds quickly and when it later responds slowly. Maybe it's a caching issue?

-Matt

Well if you are into looking at code - perhaps stop it in a debugger and see what filter it is having trouble optimising?

At the very least we can use that information to report a new JIRA issue.

Jody Garnett

It's an "IsEqualsToImpl" filter, the same type of filter before and after the slowdown. I created a new JIRA issue for this: https://jira.codehaus.org/browse/GEOS-6398. I just wanted to run the problem by this list first, though, to see if it might be a known issue caused by a misconfigured setting somewhere.

-Matt

On Thu, Mar 13, 2014 at 4:56 PM, Matt Bertrand <mbertrand@anonymised.com> wrote:

It's an "IsEqualsToImpl" filter - same type of filter before and after
the slowdown. I created a new JIRA issue for this -
https://jira.codehaus.org/browse/GEOS-6398, just wanted to run the
problem by this list first though to see if it might be a known issue
caused by a misconfigured setting somewhere.

It's an equals, but inside it's using a custom function, so I'm not surprised it
cannot be optimized. That function needs to be transformed into some SQL
(which I believe might be doable; the function internally is just checking
whether the layer is in a certain workspace, or not?)

Cheers
Andrea

--

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it
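
As a rough illustration of the idea of turning a function-based equality into SQL, the sketch below maps known function names to parameterised SQL fragments and returns nothing for functions it does not know, so a caller could fall back to a scan. Everything in it is an assumption made for illustration: the types are not GeoServer or GeoTools classes, and the function name `inWorkspace` and column `workspace_name` are invented, following the guess above about what the function checks.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrative only; none of these types exist in GeoServer or GeoTools. */
class FunctionToSqlSketch {

    /** Hypothetical filter of the form: functionName(argument) = expected. */
    static final class FunctionEquals {
        final String functionName;
        final String argument;
        final boolean expected;

        FunctionEquals(String functionName, String argument, boolean expected) {
            this.functionName = functionName;
            this.argument = argument;
            this.expected = expected;
        }
    }

    /** A parameterised SQL fragment plus the values to bind to its placeholders. */
    static final class SqlPredicate {
        final String sql;
        final List<Object> parameters;

        SqlPredicate(String sql, List<Object> parameters) {
            this.sql = sql;
            this.parameters = parameters;
        }
    }

    /** Function name -> SQL template containing a single '?' placeholder. */
    private final Map<String, String> templates = new HashMap<>();

    void register(String functionName, String sqlTemplate) {
        templates.put(functionName, sqlTemplate);
    }

    /** Returns SQL when the function is known, or empty so the caller can fall back to a scan. */
    Optional<SqlPredicate> encode(FunctionEquals filter) {
        String template = templates.get(filter.functionName);
        if (template == null) {
            return Optional.empty();
        }
        // "function(...) = false" becomes the negation of the same predicate.
        String sql = filter.expected ? template : "NOT (" + template + ")";
        return Optional.of(
                new SqlPredicate(sql, Collections.<Object>singletonList(filter.argument)));
    }
}
```

With `register("inWorkspace", "workspace_name = ?")`, encoding `inWorkspace("topp") = true` yields the fragment `workspace_name = ?` with `topp` as its bound value, while an unregistered function yields an empty result. Binding the argument as a parameter rather than concatenating it keeps the fragment safe to embed in a larger query.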

-------------------------------------------------------

The function checks whether the Advertised and Enabled flags are set on the layer or its resource. For layer groups, it either always returns true or checks each sublayer, depending on the specified visibility policy. It also checks whether the request is a GetCapabilities request, and only does anything when it is. I've posted the details on the dev list. The particular page at issue here is an easy fix, but some other pages are much more complicated.

Kevin Smith

Junior Software Engineer | Boundless

ksmith@anonymised.com

+1-778-785-7459

@boundlessgeo
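
The check described in the reply above can be sketched as a plain Java predicate. This is one possible reading, not the GeoServer implementation: the classes and the `GroupPolicy` names are hypothetical, the layer check is assumed to require the flags on both the layer and its resource, and a group under the sublayer-checking policy is assumed to be visible when at least one sublayer is.

```java
import java.util.List;

/** Illustrative only; these are not the GeoServer catalog classes. */
class AdvertisedCheckSketch {

    static final class Resource {
        final boolean enabled;
        final boolean advertised;

        Resource(boolean enabled, boolean advertised) {
            this.enabled = enabled;
            this.advertised = advertised;
        }
    }

    static final class Layer {
        final boolean enabled;
        final boolean advertised;
        final Resource resource;

        Layer(boolean enabled, boolean advertised, Resource resource) {
            this.enabled = enabled;
            this.advertised = advertised;
            this.resource = resource;
        }
    }

    /** Hypothetical policy names for how a layer group is treated. */
    enum GroupPolicy { ALWAYS_VISIBLE, CHECK_SUBLAYERS }

    /** Assumption: both the layer and its resource must be enabled and advertised. */
    static boolean layerVisible(Layer layer, boolean isGetCapabilities) {
        if (!isGetCapabilities) {
            return true; // the check only applies to GetCapabilities requests
        }
        return layer.enabled && layer.advertised
                && layer.resource.enabled && layer.resource.advertised;
    }

    /** Assumption: under CHECK_SUBLAYERS a group is visible if any sublayer is. */
    static boolean groupVisible(List<Layer> sublayers, GroupPolicy policy,
                                boolean isGetCapabilities) {
        if (!isGetCapabilities || policy == GroupPolicy.ALWAYS_VISIBLE) {
            return true;
        }
        for (Layer sublayer : sublayers) {
            if (layerVisible(sublayer, true)) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions the layer branch is a conjunction of boolean flags, which is the kind of expression that could in principle be written as column comparisons in SQL, while the group branch has to walk the sublayers.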
