[GeoNetwork-users] CSW filters with gmd:denominator

Dear list,

I have been having problems (in GN 2.6.1 Linux) with CSW queries with the queryable parameter "Denominator". I have metadata for 100+ maps which I want to display; some of them cover the whole area of the Antarctic continent, so I want to be able to use a range-type query on the scale denominator in order to filter these out. Here is the XML I am using:

<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
service="CSW" version="2.0.2" maxRecords="150" startPosition="1"
resultType="results" outputSchema="csw:IsoRecord">
  <csw:Query typeNames="gmd:MD_Metadata">
    <csw:ElementSetName>full</csw:ElementSetName>
    <csw:Constraint version="1.1.0">
      <ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
          <ogc:PropertyIsLessThan>
            <ogc:PropertyName>Denominator</ogc:PropertyName>
            <ogc:Literal>200000</ogc:Literal>
          </ogc:PropertyIsLessThan>
      </ogc:Filter>
    </csw:Constraint>
    <ogc:SortBy xmlns:ogc="http://www.opengis.net/ogc">
      <ogc:SortProperty>
        <ogc:PropertyName>Denominator</ogc:PropertyName>
        <ogc:SortOrder>ASC</ogc:SortOrder>
      </ogc:SortProperty>
    </ogc:SortBy>
  </csw:Query>
</csw:GetRecords>

It seems like "Denominator" and "gmd:denominator" are interchangeable BTW - result is the same. A typical snippet of my metadata looks like:

<gmd:spatialResolution>
        <gmd:MD_Resolution>
          <gmd:equivalentScale>
            <gmd:MD_RepresentativeFraction>
              <gmd:denominator>
                <gco:Integer>200000</gco:Integer>
              </gmd:denominator>
            </gmd:MD_RepresentativeFraction>
          </gmd:equivalentScale>
        </gmd:MD_Resolution>
      </gmd:spatialResolution>

What I observe is that when Lucene makes the query, it seems to compare the values as strings rather than as numbers, even though the denominator is clearly a number (it is wrapped in <gco:Integer>). So the above query (denominator < 200000) returns records with denominator values such as 2000, 10000, 100000, 1000000 and 10000000, which is consistent with a string comparison (lexicographic "less than", not numeric) being performed. Below is the debug output I get from GN and Lucene. I can't see anything wrong in how the query is constructed.

2011-05-11 13:01:29,500 INFO [jeeves.webapp.csw] - Received:
<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" service="CSW" version="2.0.2" maxRecords="150" startPosition="1" resultType="results" outputSchema="csw:IsoRecord">
  <csw:Query typeNames="gmd:MD_Metadata">
    <csw:ElementSetName>full</csw:ElementSetName>
    <csw:Constraint version="1.1.0">
      <ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
        <ogc:PropertyIsLessThan>
          <ogc:PropertyName>Denominator</ogc:PropertyName>
          <ogc:Literal>200000</ogc:Literal>
        </ogc:PropertyIsLessThan>
      </ogc:Filter>
    </csw:Constraint>
    <ogc:SortBy xmlns:ogc="http://www.opengis.net/ogc">
      <ogc:SortProperty>
        <ogc:PropertyName>Denominator</ogc:PropertyName>
        <ogc:SortOrder>ASC</ogc:SortOrder>
      </ogc:SortProperty>
    </ogc:SortBy>
  </csw:Query>
</csw:GetRecords>
2011-05-11 13:01:29,500 INFO [geonetwork.csw] - Dispatching operation : GetRecords
2011-05-11 13:01:29,500 INFO [geonetwork.csw] - Dispatching operation : GetRecords
2011-05-11 13:01:29,501 DEBUG [geonetwork.search] - Sorting by : [denominator,false]
2011-05-11 13:01:29,501 DEBUG [geonetwork.search] - Sorting by : [denominator,false]
2011-05-11 13:01:29,507 INFO [geonetwork.csw.search] - filterToLucene result:
<BooleanQuery>
  <BooleanClause required="true" prohibited="false">
    <RangeQuery fld="Denominator" upperTxt="200000" inclusive="false" />
  </BooleanClause>
  <BooleanClause required="true" prohibited="false">
    <TermQuery fld="_isTemplate" txt="n" />
  </BooleanClause>
</BooleanQuery>
2011-05-11 13:01:29,507 INFO [geonetwork.csw.search] - filterToLucene result:
<BooleanQuery>
  <BooleanClause required="true" prohibited="false">
    <RangeQuery fld="Denominator" upperTxt="200000" inclusive="false" />
  </BooleanClause>
  <BooleanClause required="true" prohibited="false">
    <TermQuery fld="_isTemplate" txt="n" />
  </BooleanClause>
</BooleanQuery>
2011-05-11 13:01:29,507 INFO [geonetwork.csw.search] - Unknown queryable field : _isTemplate
2011-05-11 13:01:29,507 INFO [geonetwork.csw.search] - Unknown queryable field : _isTemplate
2011-05-11 13:01:29,507 DEBUG [geonetwork.csw.search] - Query changed, reopening IndexReader
2011-05-11 13:01:29,507 DEBUG [geonetwork.csw.search] - Query changed, reopening IndexReader
2011-05-11 13:01:29,529 DEBUG [geonetwork.csw.search] - Search criteria:
<BooleanQuery>
  <BooleanClause required="true" prohibited="false">
    <RangeQuery fld="denominator" upperTxt="200000" inclusive="false" />
  </BooleanClause>
  <BooleanClause required="true" prohibited="false">
    <TermQuery fld="_isTemplate" txt="n" />
  </BooleanClause>
</BooleanQuery>
2011-05-11 13:01:29,529 DEBUG [geonetwork.csw.search] - Search criteria:
<BooleanQuery>
  <BooleanClause required="true" prohibited="false">
    <RangeQuery fld="denominator" upperTxt="200000" inclusive="false" />
  </BooleanClause>
  <BooleanClause required="true" prohibited="false">
    <TermQuery fld="_isTemplate" txt="n" />
  </BooleanClause>
</BooleanQuery>
2011-05-11 13:01:29,529 DEBUG [geonetwork.csw.search] - Search criteria:
<BooleanQuery>
  <BooleanClause required="true" prohibited="false">
    <RangeQuery fld="denominator" upperTxt="200000" inclusive="false" />
  </BooleanClause>
  <BooleanClause required="true" prohibited="false">
    <TermQuery fld="_isTemplate" txt="n" />
  </BooleanClause>
</BooleanQuery>
2011-05-11 13:01:29,530 DEBUG [geonetwork.search] - Analyze field denominator : 200000
2011-05-11 13:01:29,530 DEBUG [geonetwork.search] - Analyze field denominator : 200000
2011-05-11 13:01:29,530 DEBUG [geonetwork.search] - Lucene Query: denominator:{* TO 200000}
2011-05-11 13:01:29,530 DEBUG [geonetwork.search] - Lucene Query: denominator:{* TO 200000}
2011-05-11 13:01:29,530 DEBUG [geonetwork.search] - Analyze field _isTemplate : n
2011-05-11 13:01:29,530 DEBUG [geonetwork.search] - Analyze field _isTemplate : n
2011-05-11 13:01:29,530 DEBUG [geonetwork.search] - Lucene Query: _isTemplate:n
2011-05-11 13:01:29,530 DEBUG [geonetwork.search] - Lucene Query: _isTemplate:n
2011-05-11 13:01:29,530 DEBUG [geonetwork.search] - Lucene Query: +denominator:{* TO 200000} +_isTemplate:n
2011-05-11 13:01:29,530 DEBUG [geonetwork.search] - Lucene Query: +denominator:{* TO 200000} +_isTemplate:n
2011-05-11 13:01:29,530 INFO [geonetwork.csw.search] - LuceneSearcher made query:
+denominator:{* TO 200000} +_isTemplate:n
2011-05-11 13:01:29,530 INFO [geonetwork.csw.search] - LuceneSearcher made query:
+denominator:{* TO 200000} +_isTemplate:n
2011-05-11 13:01:29,530 ERROR [jeeves.dbmspool] - reconnecting: 3573403>=900000 ms since last connection
2011-05-11 13:01:29,561 DEBUG [jeeves.dbms.select] - Query : SELECT id FROM Groups
2011-05-11 13:01:29,568 DEBUG [jeeves.dbms.select] - Found 7 records in 0.0070 secs
2011-05-11 13:01:29,568 DEBUG [geonetwork.csw.search] - Lucene query: +(+denominator:{* TO 200000} +_isTemplate:n) +(_op0:3 _op0:1 _op0:0 _op0:6 _op0:-1 _op0:5 _op0:4 _owner:1)
2011-05-11 13:01:29,568 DEBUG [geonetwork.csw.search] - Lucene query: +(+denominator:{* TO 200000} +_isTemplate:n) +(_op0:3 _op0:1 _op0:0 _op0:6 _op0:-1 _op0:5 _op0:4 _owner:1)
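
For illustration only, the result set described above can be reproduced with a few lines of plain Python. This is a minimal sketch, not GeoNetwork or Lucene code: it assumes the denominators are compared as text, and the list of values is simply the set mentioned in this message plus the 200000 from the metadata snippet.

```python
# Denominator values taken from the message (illustrative sample, not a real index).
denominators = [2000, 10000, 100000, 200000, 1000000, 10000000]
upper = 200000  # the literal used in PropertyIsLessThan

# Comparing the values as text: ordering is character by character.
as_strings = [d for d in denominators if str(d) < str(upper)]

# Comparing the values as integers: the ordering the query was meant to use.
as_numbers = [d for d in denominators if d < upper]

print("string comparison :", as_strings)
print("numeric comparison:", as_numbers)
```

The text comparison lets 1000000 and 10000000 through because they start with "1", which sorts before "2"; the numeric comparison keeps only 2000, 10000 and 100000.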

I am running GN 2.6.1, and I notice from the release notes that #422 addresses an issue with PropertyIsLessThan. I could upgrade if this issue has been fixed in 2.6.2/3. I see the same kind of behaviour with PropertyIsGreaterThan and PropertyIsBetween though, so what I'd like to know is: how do I force GN (or Lucene) to make this filter comparison by treating the <gco:Integer> quantity as a number rather than a string?

Thanks in advance for any help,

David Herbert
British Antarctic Survey.

--
This message (and any attachments) is for the recipient only. NERC
is subject to the Freedom of Information Act 2000 and the contents
of this email and any reply you make may be disclosed by NERC unless
it is exempt from release under the Act. Any material supplied to
NERC may be stored in an electronic records management system.

Hello David,
Numeric indexing has been added to trunk for the 2.7.x release (including
the denominator search field).
See http://trac.osgeo.org/geonetwork/ticket/382

Cheers.

Francois
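
As background on why a numeric-aware index matters here, the sketch below shows one generic way to make text ordering agree with numeric ordering: left-padding every value to a fixed width before comparing. It is an illustration of the general idea only. It is not taken from the change referenced in the ticket and is not a GeoNetwork or Lucene setting; the helper name pad and the width of 10 digits are assumptions made for the example.

```python
WIDTH = 10  # assumed fixed width, wide enough for any denominator in the sample


def pad(value: int, width: int = WIDTH) -> str:
    """Left-pad a non-negative integer with zeros so text order matches numeric order."""
    return str(value).zfill(width)


# Same illustrative sample of denominators as in the original message.
denominators = [2000, 10000, 100000, 200000, 1000000, 10000000]
upper = 200000

# The comparison is still textual, but on padded terms it behaves numerically.
matches = [d for d in denominators if pad(d) < pad(upper)]
print(matches)
```

With padded terms, "0001000000" no longer sorts before "0000200000", so the range filter keeps only the denominators that are numerically below 200000.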
