#1105: postgis query too large
---------------------+------------------------------------------------------
Reporter: plcking | Owner: geonetwork-devel@…
Type: defect | Status: new
Priority: major | Milestone: v2.9.0
Component: General | Version: v2.6.4
Keywords: |
---------------------+------------------------------------------------------
The query that is being sent to postgres/postgis from geonetwork(as a
result of a csw spatial query) has over 50,000 "OR" clauses(the query
just "hangs") - here is the postgres log :
LOG: execute fetch from S_4/C_5: SELECT "fid", asText("the_geom") FROM
"public"."spatialindex" WHERE ("fid" = '10') OR ("fid" = '100') OR ("fid"
= '1000') OR ("fid" = '10001') OR ("fid" = '10005') OR ("fid" = '10011')
OR ("fid" = '10014') OR ("fid" = '10015') OR ("fid" = '10018') OR ("fid" =
'10020') OR ("fid" = '10021') OR ("fid" = '10023') OR ("fid" = '10025') OR
("fid" = '10027') OR ("fid" = '10032') OR ("fid" = '10038') OR ("fid" =
'10041') OR ("fid" = '10042') OR ("fid" = '10044') OR ("fid" = '10045') OR
("fid" = '10047') OR ("fid" = '10048') OR ("fid" = '1005') OR ("fid" =
'10050') OR ("fid" = '10051') OR ("fid" = '10052') OR ("fid" = '10053') OR
("fid" = '10054') OR ("fid" = '10055') OR ("fid" = '10058') OR ("fid" =
'10061') OR ("fid" = '10062') OR ("fid" = '10063') OR ("fid" = '10066') OR
("fid" = '10067') OR ("fid" = '1007') OR ("fid" = '10070') OR ("fid" =
'10076') OR ("fid" = '10080')........................
What can be done to fix the issue ?
Postgres seems to have a limit of 50,000 on the number of clauses(and this
makes sense from an efficiency standpoint) - I tested the query when there
were 49K clauses and it was successful.
Pat
--
Ticket URL: <http://trac.osgeo.org/geonetwork/ticket/1105>
GeoNetwork opensource Developer website <http://sourceforge.net/projects/geonetwork/>
GeoNetwork opensource is a standards based, Free and Open Source catalog application to manage spatially referenced resources through the web. It provides powerful metadata editing and search functions as well as an embedded interactive web map viewer. This website contains information related to the development of the software.
#1105: postgis query too large
---------------------+------------------------------------------------------
Reporter: plcking | Owner: geonetwork-devel@…
Type: defect | Status: new
Priority: major | Milestone: v2.9.0
Component: General | Version: v2.6.4
Keywords: |
---------------------+------------------------------------------------------
Comment(by jeichar):
Do you have a stack trace so I can see what series of calls are causing
this error?
--
Ticket URL: <http://trac.osgeo.org/geonetwork/ticket/1105#comment:1>
GeoNetwork opensource Developer website <http://sourceforge.net/projects/geonetwork/>
GeoNetwork opensource is a standards based, Free and Open Source catalog application to manage spatially referenced resources through the web. It provides powerful metadata editing and search functions as well as an embedded interactive web map viewer. This website contains information related to the development of the software.
#1105: postgis query too large
---------------------+------------------------------------------------------
Reporter: plcking | Owner: geonetwork-devel@…
Type: defect | Status: new
Priority: major | Milestone: v2.9.0
Component: General | Version: v2.6.4
Keywords: |
---------------------+------------------------------------------------------
Comment(by plcking):
Replying to [comment:1 jeichar]:
> Do you have a stack trace so I can see what series of calls are causing
this error?
The query jusy "hangs". Here is the output from geonetwork.log :
2012-10-09 10:55:58,036 INFO [jeeves.request] -
2012-10-09 10:55:58,036 INFO [jeeves.request] - HTML Request (from
132.156.10.107) : /geonetwork/srv/en/csw
2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Method : POST
2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Content type : text/xml
2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Accept : */*
2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Session id is
3BB6476D1ABB2409C67968E8F52F185D
2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Session created for
client : 132.156.10.107
2012-10-09 10:55:58,038 INFO [jeeves.service] - Dispatching : csw
2012-10-09 10:55:58,038 DEBUG [jeeves.service] - -> parameters are :
<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
xmlns:ogc="http://www.opengis.net/ogc"
xmlns:gml="http://www.opengis.net/gml
" service="CSW" version="2.0.2" resultType="results"
outputSchema="http://www.isotc211.org/2005/gmd" maxRecords="100">
<csw:Query typeNames="gmd:MD_Metadata">
<csw:ElementSetName>full</csw:ElementSetName>
<csw:Constraint version="1.1.0">
<ogc:Filter>
<ogc:Intersects>
<ogc:PropertyName>BoundingBox</ogc:PropertyName>
<gml:Envelope>
<gml:lowerCorner>-44.00 -8.38</gml:lowerCorner>
<gml:upperCorner>-40.00 1.00</gml:upperCorner>
</gml:Envelope>
</ogc:Intersects>
</ogc:Filter>
</csw:Constraint>
</csw:Query>
</csw:GetRecords>
2012-10-09 10:55:58,039 INFO [jeeves.webapp.csw] - Received:
<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
xmlns:ogc="http://www.opengis.net/ogc"
xmlns:gml="http://www.opengis.net/gml
" service="CSW" version="2.0.2" resultType="results"
outputSchema="http://www.isotc211.org/2005/gmd" maxRecords="100">
<csw:Query typeNames="gmd:MD_Metadata">
<csw:ElementSetName>full</csw:ElementSetName>
<csw:Constraint version="1.1.0">
<ogc:Filter>
<ogc:Intersects>
<ogc:PropertyName>BoundingBox</ogc:PropertyName>
<gml:Envelope>
<gml:lowerCorner>-44.00 -8.38</gml:lowerCorner>
<gml:upperCorner>-40.00 1.00</gml:upperCorner>
</gml:Envelope>
</ogc:Intersects>
</ogc:Filter>
</csw:Constraint>
</csw:Query>
</csw:GetRecords>
2012-10-09 10:55:58,049 DEBUG [jeeves.engine] - TransformerFactoryFactory:
de.fzi.dbs.xml.transform.CachingTransformerFactory
2012-10-09 10:55:58,064 DEBUG [jeeves.engine] - TransformerFactoryFactory:
produces transformer implementation net.sf.saxon.IdentityTransform
er
Pat
--
Ticket URL: <http://trac.osgeo.org/geonetwork/ticket/1105#comment:2>
GeoNetwork opensource Developer website <http://sourceforge.net/projects/geonetwork/>
GeoNetwork opensource is a standards based, Free and Open Source catalog application to manage spatially referenced resources through the web. It provides powerful metadata editing and search functions as well as an embedded interactive web map viewer. This website contains information related to the development of the software.
#1105: postgis query too large
---------------------+------------------------------------------------------
Reporter: plcking | Owner: geonetwork-devel@…
Type: defect | Status: new
Priority: major | Milestone: v2.9.0
Component: General | Version: v2.6.4
Keywords: |
---------------------+------------------------------------------------------
Comment(by plcking):
Replying to [comment:2 plcking]:
> Replying to [comment:1 jeichar]:
> > Do you have a stack trace so I can see what series of calls are
causing this error?
>
> The query jusy "hangs". Here is the output from geonetwork.log :
>
>
> 2012-10-09 10:55:58,036 INFO [jeeves.request] -
> 2012-10-09 10:55:58,036 INFO [jeeves.request] - HTML Request (from
132.156.10.107) : /geonetwork/srv/en/csw
> 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Method : POST
> 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Content type : text/xml
> 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Accept : */*
> 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Session id is
3BB6476D1ABB2409C67968E8F52F185D
> 2012-10-09 10:55:58,036 DEBUG [jeeves.request] - Session created for
client : 132.156.10.107
> 2012-10-09 10:55:58,038 INFO [jeeves.service] - Dispatching : csw
> 2012-10-09 10:55:58,038 DEBUG [jeeves.service] - -> parameters are :
> <csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
xmlns:ogc="http://www.opengis.net/ogc"
xmlns:gml="http://www.opengis.net/gml
> " service="CSW" version="2.0.2" resultType="results"
outputSchema="http://www.isotc211.org/2005/gmd" maxRecords="100">
> <csw:Query typeNames="gmd:MD_Metadata">
> <csw:ElementSetName>full</csw:ElementSetName>
> <csw:Constraint version="1.1.0">
> <ogc:Filter>
> <ogc:Intersects>
> <ogc:PropertyName>BoundingBox</ogc:PropertyName>
> <gml:Envelope>
> <gml:lowerCorner>-44.00 -8.38</gml:lowerCorner>
> <gml:upperCorner>-40.00 1.00</gml:upperCorner>
> </gml:Envelope>
> </ogc:Intersects>
> </ogc:Filter>
> </csw:Constraint>
> </csw:Query>
> </csw:GetRecords>
> 2012-10-09 10:55:58,039 INFO [jeeves.webapp.csw] - Received:
> <csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
xmlns:ogc="http://www.opengis.net/ogc"
xmlns:gml="http://www.opengis.net/gml
> " service="CSW" version="2.0.2" resultType="results"
outputSchema="http://www.isotc211.org/2005/gmd" maxRecords="100">
> <csw:Query typeNames="gmd:MD_Metadata">
> <csw:ElementSetName>full</csw:ElementSetName>
> <csw:Constraint version="1.1.0">
> <ogc:Filter>
> <ogc:Intersects>
> <ogc:PropertyName>BoundingBox</ogc:PropertyName>
> <gml:Envelope>
> <gml:lowerCorner>-44.00 -8.38</gml:lowerCorner>
> <gml:upperCorner>-40.00 1.00</gml:upperCorner>
> </gml:Envelope>
> </ogc:Intersects>
> </ogc:Filter>
> </csw:Constraint>
> </csw:Query>
> </csw:GetRecords>
> 2012-10-09 10:55:58,049 DEBUG [jeeves.engine] -
TransformerFactoryFactory:
de.fzi.dbs.xml.transform.CachingTransformerFactory
> 2012-10-09 10:55:58,064 DEBUG [jeeves.engine] -
TransformerFactoryFactory: produces transformer implementation
net.sf.saxon.IdentityTransform
> er
>
>
> Pat
How does geonetwork construct the query ? Using so many feature Id's ?
Pat
--
Ticket URL: <http://trac.osgeo.org/geonetwork/ticket/1105#comment:3>
GeoNetwork opensource Developer website <http://sourceforge.net/projects/geonetwork/>
GeoNetwork opensource is a standards based, Free and Open Source catalog application to manage spatially referenced resources through the web. It provides powerful metadata editing and search functions as well as an embedded interactive web map viewer. This website contains information related to the development of the software.
#1105: postgis query too large
---------------------+------------------------------------------------------
Reporter: plcking | Owner: geonetwork-devel@…
Type: defect | Status: new
Priority: major | Milestone: v2.9.0
Component: General | Version: v2.6.4
Keywords: |
---------------------+------------------------------------------------------
Comment(by jeichar):
My understanding of what is happening is the following:
The envelope query is converted to a lucene query which returns the ID of
all the metadata in question. Then the ids together form a database
query. Obviously some checks on the number of ids need to be added and
multiple queries will be made.
I would have expected the maxRecords to reduce the number of clauses.
Sounds like it is not taken into account at the right time.
Are you using version 2.6.4 as the bug report indicates?
--
Ticket URL: <http://trac.osgeo.org/geonetwork/ticket/1105#comment:4>
GeoNetwork opensource Developer website <http://sourceforge.net/projects/geonetwork/>
GeoNetwork opensource is a standards based, Free and Open Source catalog application to manage spatially referenced resources through the web. It provides powerful metadata editing and search functions as well as an embedded interactive web map viewer. This website contains information related to the development of the software.
#1105: postgis query too large
---------------------+------------------------------------------------------
Reporter: plcking | Owner: geonetwork-devel@…
Type: defect | Status: new
Priority: major | Milestone: v2.9.0
Component: General | Version: v2.6.4
Keywords: |
---------------------+------------------------------------------------------
Comment(by plcking):
Replying to [comment:4 jeichar]:
> My understanding of what is happening is the following:
>
> The envelope query is converted to a lucene query which returns the ID
of all the metadata in question. Then the ids together form a database
query. Obviously some checks on the number of ids need to be added and
multiple queries will be made.
>
> I would have expected the maxRecords to reduce the number of clauses.
Sounds like it is not taken into account at the right time.
>
> Are you using version 2.6.4 as the bug report indicates?
>
Hi :
First of all, I'd like to thank you for your help so far. I really
appreciate it !
The number of records that I have imported is in excess of 400,000. I
never had the problem when the db had 330,000 records.
I am definitely using geonetwork v2.6.4.
Pat
--
Ticket URL: <http://trac.osgeo.org/geonetwork/ticket/1105#comment:5>
GeoNetwork opensource Developer website <http://sourceforge.net/projects/geonetwork/>
GeoNetwork opensource is a standards based, Free and Open Source catalog application to manage spatially referenced resources through the web. It provides powerful metadata editing and search functions as well as an embedded interactive web map viewer. This website contains information related to the development of the software.
We did come across this as well in building catalogs with a couple of million records. A work around is to increase the operating system stack size resources and max_stack_depth in postgis config as explained in the section of the users manual on advanced configuration. The section of the manual that explains how to do this is available online at:
http://geonetwork-opensource.org/manuals/trunk/eng/users/admin/advanced-configuration/index.html#advanced-configuration-for-larger-catalogs
Cheers,
Simon
________________________________________
From: GeoNetwork opensource Developer website [trac_osgeo@anonymised.com]
Sent: Wednesday, 10 October 2012 5:26 AM
Subject: Re: [GeoNetwork-devel] [GeoNetwork opensource Developer website] #1105: postgis query too large
#1105: postgis query too large
---------------------+------------------------------------------------------
Reporter: plcking | Owner: geonetwork-devel@…
Type: defect | Status: new
Priority: major | Milestone: v2.9.0
Component: General | Version: v2.6.4
Keywords: |
---------------------+------------------------------------------------------
Comment(by plcking):
Replying to [comment:4 jeichar]:
> My understanding of what is happening is the following:
>
> The envelope query is converted to a lucene query which returns the ID
of all the metadata in question. Then the ids together form a database
query. Obviously some checks on the number of ids need to be added and
multiple queries will be made.
>
> I would have expected the maxRecords to reduce the number of clauses.
Sounds like it is not taken into account at the right time.
>
> Are you using version 2.6.4 as the bug report indicates?
>
Hi :
First of all, I'd like to thank you for your help so far. I really
appreciate it !
The number of records that I have imported is in excess of 400,000. I
never had the problem when the db had 330,000 records.
I am definitely using geonetwork v2.6.4.
Pat
--
Ticket URL: <#1105 (postgis query too large) – GeoNetwork opensource Developer website;
GeoNetwork opensource Developer website <http://sourceforge.net/projects/geonetwork/>
GeoNetwork opensource is a standards based, Free and Open Source catalog application to manage spatially referenced resources through the web. It provides powerful metadata editing and search functions as well as an embedded interactive web map viewer. This website contains information related to the development of the software.
------------------------------------------------------------------------------
Don't let slow site performance ruin your business. Deploy New Relic APM
Deploy New Relic app performance management and know exactly
what is happening inside your Ruby, Python, PHP, Java, and .NET app
Try New Relic at no cost today and get our sweet Data Nerd shirt too!
Best Open Source Mac Front-Ends 2024
_______________________________________________
GeoNetwork-devel mailing list
GeoNetwork-devel@lists.sourceforge.net
geonetwork-devel List Signup and Options
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork
How large should I set BooleanQuery.setMaxClauseCount to be?
Jesse
On Wed, Oct 10, 2012 at 2:16 PM, GeoNetwork opensource Developer website <trac_osgeo@anonymised.com> wrote:
#1105: postgis query too large
----------------------±----------------------------------------------------
Reporter: plcking | Owner: jeichar
Type: defect | Status: reopened
Priority: major | Milestone: v2.9.0
Component: General | Version: v2.6.4
Resolution: | Keywords:
----------------------±----------------------------------------------------
Comment(by plcking):
Replying to [comment:13 jeichar]:
If I make a new war for you can you test it to see if my fix works for
you?
Thank you for your help !!!
I think a war file would help, and I will re-edit(after tomcat war
expansion) the ./web/geonetwork/WEB-INF/config.xml to point to my postgis
db. Of special note is that I had to modify(when I had 300k records)
./WEB_INF/classes/org/fao/geonet/kernel/search/LuceneSearcher.java
(BooleanQuery.setMaxClauseCount had to be set to a large value due to a
“too many clauses” error). If you don’t include the change for this, I
have my CLASSPATH env var set to recompile the code “tomcat in-situ” (I
couldn’t build the distribution with maven due to numerous errors).
Pat
–
Ticket URL: <http://trac.osgeo.org/geonetwork/ticket/1105#comment:14>
GeoNetwork opensource Developer website <http://sourceforge.net/projects/geonetwork/>
GeoNetwork opensource is a standards based, Free and Open Source catalog application to manage spatially referenced resources through the web. It provides powerful metadata editing and search functions as well as an embedded interactive web map viewer. This website contains information related to the development of the software.