Hello Pierre,
GeoNetwork already implements fuzzy search: in the advanced search parameters you can set the fuzziness level. By default the search is not fuzzy, but you could easily change that default to your desired level of fuzziness in your implementation.
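To make the idea of a "fuzziness level" concrete, here is a minimal sketch in Python. It assumes the level acts as a minimum similarity between 0 and 1 derived from the Levenshtein edit distance, with 1.0 meaning exact matching. The names `edit_distance`, `similarity` and `fuzzy_match`, the 0.8 default and the scoring formula are all hypothetical; this is not GeoNetwork's or Lucene's actual code.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Simplified score in [0, 1]; an assumption, not Lucene's exact formula."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def fuzzy_match(query: str, term: str, min_similarity: float = 0.8) -> bool:
    """True when term is close enough to query; 0.8 is a hypothetical default."""
    return similarity(query, term) >= min_similarity
```

With these definitions, `fuzzy_match("metadata", "metadta")` accepts the misspelling, while passing `min_similarity=1.0` rejects it, which corresponds to a non-fuzzy default.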
I think the search is already case-insensitive.
To ignore accented characters, the Lucene ISOLatin1AccentFilter must be applied to both the indexed values and the search values. I’m sure this has been done in some branches, but I’m not sure whether it is currently in the trunk or the GN2.4.x branches.
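The reason the filter has to sit on both sides can be sketched in a few lines of Python. This is only an illustration of the principle, not the Lucene filter itself: the `fold`, `build_index` and `search` helpers are hypothetical, and the folding uses Unicode decomposition, which may treat some characters (for example ligatures) differently from ISOLatin1AccentFilter.

```python
import unicodedata


def fold(text: str) -> str:
    """Strip combining accents and lowercase, e.g. 'Bâti' becomes 'bati'."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def build_index(keywords):
    """Index side: store each keyword under its folded form."""
    index = {}
    for keyword in keywords:
        index.setdefault(fold(keyword), []).append(keyword)
    return index


def search(index, query):
    """Search side: fold the query the same way before looking it up."""
    return index.get(fold(query), [])


index = build_index(["Bati", "batî", "batiment"])
print(search(index, "bâti"))  # ['Bati', 'batî']
```

If the folding were applied on only one side, "bâti" would be compared against the unfolded "batî" (or the other way round) and the lookup would miss.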
To also support your parts-of-word search, it is sufficient to change the search logic so that it appends a wildcard to the end of each search term.
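As a rough sketch of that change, assuming search terms are separated by whitespace and `*` is the wildcard character: the helper names `add_trailing_wildcards` and `matching_terms` are hypothetical, and the matching is simulated with Python's `fnmatch` rather than a real Lucene index.

```python
from fnmatch import fnmatchcase


def add_trailing_wildcards(query: str) -> str:
    """Append '*' to every whitespace-separated term that lacks one."""
    terms = query.split()
    return " ".join(t if t.endswith("*") else t + "*" for t in terms)


def matching_terms(pattern: str, indexed_terms):
    """Simulate a wildcard lookup against a list of indexed terms."""
    return [t for t in indexed_terms if fnmatchcase(t, pattern)]


print(add_trailing_wildcards("bati carte"))  # bati* carte*
print(matching_terms("bati*", ["bati", "batiment", "carte"]))  # ['bati', 'batiment']
```

Note that a trailing wildcard only matches terms that start with the typed text, so "bati" would find "batiment" but "ment" would not.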
Hope this helps,
Heikki Doeleman
On Thu, Mar 11, 2010 at 8:46 PM, Pierre Mauduit <pierre.mauduit@anonymised.com> wrote:
Hello,
A client recently asked us whether it was possible to implement a “fuzzy”
search in GeoNetwork. Since I am pretty new to GeoNetwork
development, I don’t know whether any work has already been done on
this in some branches. The idea would be to implement a
case-insensitive as well as accent-insensitive search. In
addition, our client wants us to add the capability to search for parts
of words.
For example, the search “bâti” should return all metadata records
matching the following keywords: “Bati”, “batî”, “batiment”, …
Does anyone know if something in this direction has been tried in one
of the branches, or do you have some pointers on where I could look in
order to achieve this kind of development? I have heard of the
“token” parameter in Lucene’s configuration files (in order to allow
“incomplete keywords” search), but don’t know much about it.
Thanks in advance for the answers,
–
Pierre Mauduit
Camptocamp France SAS
Savoie Technolac, BP 352
73377 Le Bourget du Lac Cedex
Tel : + 33 (0)4 79 44 44 92
http://www.camptocamp.com
pierre.mauduit@anonymised.com
GeoNetwork-devel mailing list
GeoNetwork-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geonetwork-devel
GeoNetwork OpenSource is maintained at http://sourceforge.net/projects/geonetwork