Hi list, regarding the bug with the "-" character in queries, one solution could be to use FuzzyQuery in Lucene. FuzzyQuery is less strict than TermQuery: it also matches indexed terms that are merely close to the search term.
I ran some tests after adding 2 parameters to the interface:
- fuzzy (on/off)
- similarity: float, default 0.8
When querying the demo data:
- "Hydrological" + fuzzy off returns 1 result: "Hydrological basins in Africa (SAMPLE DATA!)"
- "Hydrological" + fuzzy on returns 1 result: "Hydrological basins in Africa (SAMPLE DATA!)"
- "Hidrological" + fuzzy off returns 0 results
- "Hidrological" or "Hidrologicàl" + fuzzy on returns 1 result: "Hydrological basins in Africa (SAMPLE DATA!)"
- "Hidrological" + fuzzy on + similarity = 0.2 returns 2 results: "Hydrological basins in Africa (SAMPLE DATA!)" and "Forests and Drylands Programme: Forests Homepage (SAMPLE DATA)". I don't know why the second one matches, but this is "fuzzy".
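The results above are consistent with a similarity score based on edit distance. Here is a minimal sketch, assuming the score is one minus the Levenshtein distance divided by the length of the shorter term. That is my understanding of how FuzzyQuery scored terms in the Lucene versions of that time, not something checked against the source. The class and method names are made up for illustration.

```java
class FuzzySimilaritySketch {

    // Classic dynamic-programming Levenshtein distance, keeping two rows.
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // the row just computed becomes the previous row
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // ASSUMPTION: similarity = 1 - distance / length of the shorter term.
    public static float similarity(String queryTerm, String indexedTerm) {
        int shorter = Math.min(queryTerm.length(), indexedTerm.length());
        if (shorter == 0) {
            return queryTerm.length() == indexedTerm.length() ? 1.0f : 0.0f;
        }
        return 1.0f - (float) editDistance(queryTerm, indexedTerm) / shorter;
    }

    public static void main(String[] args) {
        // One substitution over 12 characters, so the score is 11/12.
        System.out.println(similarity("hidrological", "hydrological"));
    }
}
```

Under that assumption, "hidrological" scores about 0.92 against "hydrological", which is above the default of 0.8. With a threshold as low as 0.2, a term only needs to share a small part of its letters with the query to pass, which may be why an unrelated record shows up.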
FuzzyQuery could be relevant for special characters ("éàèôï...") and could be easier than searching for special characters in Java and putting "?" into the TermQuery to find something.
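For comparison, the "?" alternative could look roughly like the sketch below. It is only an illustration: the class and method names are hypothetical, it treats every character outside a-z and 0-9 as "special", and the resulting pattern would have to go to a wildcard-aware query type (Lucene's WildcardQuery uses "?" for a single character), since a plain TermQuery does not interpret wildcards.

```java
import java.util.Locale;

class WildcardPatternSketch {

    // Lowercase the text and replace every character outside a-z / 0-9 with '?'.
    // ASSUMPTION: anything that is not a plain ASCII letter or digit is "special".
    public static String toWildcardPattern(String text) {
        String lower = text.toLowerCase(Locale.ROOT);
        StringBuilder sb = new StringBuilder(lower.length());
        for (int i = 0; i < lower.length(); i++) {
            char c = lower.charAt(i);
            boolean plain = (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
            sb.append(plain ? c : '?');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00e0 is "a with grave accent", written as an escape to avoid encoding issues.
        System.out.println(toWildcardPattern("Hidrologic\u00e0l"));
    }
}
```

This only covers accents and similar characters. It does nothing for a misspelling such as "Hidrological", which the fuzzy approach does handle.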
Any comments on that point?
Francois.
PS: Changes made for testing:
_________________________________________________________________
Add 2 form elements to the main page / Main-page.xsl :
Fuzzy : <input type="checkbox" class="content" name="fuzzy"/><br/>
Similarity : <input class="content" name="similarity" size="2" value=".8"/><br/>
_________________________________________________________________
Add a FuzzyQuery type to the Lucene.xsl and use it when fuzzy is on:
<xsl:variable name="fuzzy" select="string(/request/fuzzy)"/>
<xsl:variable name="similarity" select="/request/similarity"/>
<!-- simple string -->
<xsl:otherwise>
    <xsl:choose>
        <xsl:when test="$fuzzy='on'">
            <FuzzyQuery fld="{$field}" txt="{$expr/@text}" sim="{$similarity}"/>
        </xsl:when>
        <xsl:otherwise>
            <TermQuery fld="{$field}" txt="{$expr/@text}"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:otherwise>
_________________________________________________________________
Then build the FuzzyQuery in Java / LuceneSearcher.java, line 199:
else if (name.equals("FuzzyQuery"))
{
    // Field name, minimum similarity and search text come from the XML query element.
    String fld = xmlQuery.getAttributeValue("fld");
    float sim = Float.parseFloat(xmlQuery.getAttributeValue("sim"));
    // Lowercase the text so that it matches the lowercased terms in the index.
    String txt = xmlQuery.getAttributeValue("txt").toLowerCase();
    return new FuzzyQuery(new Term(fld, txt), sim);
}
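Note that the test code above trusts the "sim" attribute: an empty or non-numeric similarity field would throw a NumberFormatException, and a missing attribute a NullPointerException. A possible guard is sketched below. The helper name is hypothetical, the 0.8 fallback is the default mentioned earlier, and the accepted range of 0 (inclusive) to 1 (exclusive) is what I believe FuzzyQuery expects, so treat that bound as an assumption.

```java
class SimilarityParamSketch {

    static final float DEFAULT_SIMILARITY = 0.8f;

    // Parse the raw request value, falling back to the default when it is
    // missing, not a number, or outside the assumed valid range [0, 1).
    public static float parseSimilarity(String raw) {
        if (raw == null) {
            return DEFAULT_SIMILARITY;
        }
        try {
            float value = Float.parseFloat(raw.trim());
            if (Float.isNaN(value) || value < 0.0f || value >= 1.0f) {
                return DEFAULT_SIMILARITY;
            }
            return value;
        } catch (NumberFormatException e) {
            return DEFAULT_SIMILARITY;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseSimilarity(".8"));  // value from the form default
        System.out.println(parseSimilarity(""));    // empty field, falls back
        System.out.println(parseSimilarity("1.5")); // out of range, falls back
    }
}
```

In the snippet above, the parsed value would then replace the direct Float conversion of the "sim" attribute.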
-----Original Message-----
From: geonetwork-users-admin@lists.sourceforge.net [mailto:geonetwork-users-admin@lists.sourceforge.net] On behalf of Jeroen Ticheler
Sent: Friday 17 February 2006 15:03
To: Giaccio Roberto; François Prunayre
Cc: geonetwork-users@lists.sourceforge.net
Subject: Re: [GeoNetwork-users] Search problem when having "-" character in query
I filed a bug report for this.
Jeroen
On 1 Feb 2006, at 12:41, Roberto Giaccio wrote:
Ciao Francois,
I think that the string containing "-" is split into words by Lucene
when the metadata is indexed, but not when it is used as a search
term.
I have to check and see how to solve this.
Roberto
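To illustrate the mismatch described here: if the indexer breaks "Eure-et-Loir" into separate words, the index holds no term containing "-", so a query for the unsplit string finds nothing. One possible direction is to split the search text with the same rule before building the query. The sketch below assumes a simplified rule (split on anything that is not a letter or digit, then lowercase), which is not necessarily what the analyzer used by GeoNetwork does. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class QueryTokenSketch {

    // ASSUMPTION: the indexer splits on every run of non-letter, non-digit
    // characters and lowercases the pieces. A real analyzer may differ,
    // for example by dropping stop words.
    public static List<String> splitLikeIndexer(String query) {
        List<String> terms = new ArrayList<String>();
        String[] parts = query.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String part : parts) {
            if (!part.isEmpty()) { // a leading separator yields an empty piece
                terms.add(part);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(splitLikeIndexer("Eure-et-Loir"));
    }
}
```

Each resulting word could then be searched as its own term, all of them required, instead of one term that can never match.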
On 31 Jan 2006, at 15:16, François Prunayre wrote:
Hi list, I noticed a problem when the query contains a "-" character.
Searching for Eure loir gets 70 results:
http://sandre.eaufrance.fr/geonetwork/srv/fr/main.search?extended=off&remote=off&attrset=geo&any=Eure+loir&hitsPerPage=10
Searching for Eure-et-Loir gets 0 results:
http://sandre.eaufrance.fr/geonetwork/srv/fr/main.search?extended=off&remote=off&attrset=geo&any=Eure-et-Loir&hitsPerPage=10
Searching for "Eure-et-Loir" gets 0 results:
http://sandre.eaufrance.fr/geonetwork/srv/fr/main.search?extended=off&remote=off&attrset=geo&any=%22Eure-et-Loir%22&hitsPerPage=10
Any ideas on what's wrong?
Thanks for your help. Francois
--
This message has been checked by MailScanner for viruses and spam, and nothing suspicious was found.
Any data and information contained in this electronic mail is personal, confidential and private. Any total or partial publication, use or distribution must be authorized.
_______________________________________________
GeoNetwork-users mailing list
GeoNetwork-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-users
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork