Hi list, here is a first demo adding thesaurus support to Geonetwork
(based on the alpha2 version - thanks Andrea :).
Thesaurus support allows import/export of "external" thesauri such as
Agrovoc, Gemet or others in RDF/SKOS format, and creation of "local" thesauri
which are entirely managed by the administrator of the Geonetwork node.
A local thesaurus can be basic (id, label, definition and language for each
term), used for thesauri of type discipline, theme or temporal, or of type
place, where you can also define a bounding box for each term. Thesaurus
creation only supports flat thesauri (i.e. you cannot define narrower,
broader or related terms).
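
To make the description of a local thesaurus concrete, here is a minimal
sketch of the data it holds. All class and field names (Term, LocalThesaurus,
bbox, kind) are assumptions made for illustration and are not taken from the
Geonetwork code; the bounding box order (west, south, east, north) is also
an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical layout: (west, south, east, north) in decimal degrees.
BoundingBox = Tuple[float, float, float, float]


@dataclass
class Term:
    """One entry of a flat thesaurus (no narrower/broader/related links)."""
    id: str
    label: str
    definition: str
    lang: str
    bbox: Optional[BoundingBox] = None  # only used by "place" thesauri


@dataclass
class LocalThesaurus:
    name: str
    kind: str  # "discipline", "theme", "temporal" or "place"
    terms: List[Term] = field(default_factory=list)

    def add(self, term: Term) -> None:
        # Assumption: a bounding box only makes sense for place thesauri.
        if term.bbox is not None and self.kind != "place":
            raise ValueError("bounding boxes are only allowed in place thesauri")
        self.terms.append(term)
```

For example, a place thesaurus could hold
Term("1", "France", "Metropolitan France", "en", (-5.2, 41.3, 9.6, 51.1)),
while the same term would be rejected by a thesaurus of kind "theme".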
When editing metadata, Geonetwork provides an autocompletion field for
keyword searching across all thesauri available in the current node.
When searching for metadata, autocompletion is also available in the
keyword search field.
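
As an illustration of the autocompletion idea, here is a small sketch that
looks up a typed prefix across several thesauri. It assumes case-insensitive
prefix matching and an in-memory dictionary of (label, language) pairs; the
function name and data layout are hypothetical and the demo may well match
differently.

```python
def autocomplete(thesauri, prefix, lang, limit=10):
    """Return up to `limit` (thesaurus name, label) pairs for a typed prefix.

    `thesauri` maps a thesaurus name to a list of (label, lang) tuples.
    Only labels in `lang` that start with `prefix` are returned.
    """
    needle = prefix.casefold()
    matches = []
    for name, terms in thesauri.items():
        for label, term_lang in terms:
            if term_lang == lang and label.casefold().startswith(needle):
                matches.append((name, label))
    # Sort alphabetically by label so results from all thesauri are mixed.
    matches.sort(key=lambda match: match[1].casefold())
    return matches[:limit]


thesauri = {
    "agrovoc": [("water", "en"), ("watershed", "en"), ("eau", "fr")],
    "local-places": [("Waterloo", "en")],
}
suggestions = autocomplete(thesauri, "wat", "en")
```

Here suggestions contains the three English labels starting with "wat",
each paired with the name of the thesaurus it comes from.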
The demo is at http://dev.sandre.eaufrance.fr/gnthesaurus/ (login admin /
password admin). Go to Administration > Manage thesauri. The demo is set up with 3
external thesauri (agrovoc, a thesaurus for the water domain, and the
Geonetwork list of regions) and 2 local ones.
Main features available for now:
- Administration:
  - thesaurus import/export in RDF/SKOS format
  - thesaurus creation/editing for "flat" thesauri
  - support for multilingual thesauri
- Edit interface:
  - autocompletion for the "keyword" field in DC, FGDC and ISO19115 (to be
    done for ISO19139)
- Search interface:
  - autocompletion for the "keyword" search criterion
For the time being, the user interface queries all thesauri in the current
webpage language. But this is not really the best option. For a first
implementation, I propose 2 options for the editing part:
- if a language parameter is defined in the current metadata, query the
thesaurus in that language
- if not, use the webpage language, or we could also define a "metadata
default language" parameter? This would allow users whose language has no
Geonetwork translation (for now en, fr, es, cn) to use their own language
for thesauri.
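
The two options above amount to a fallback chain, which could be sketched
as follows. The function and parameter names are hypothetical, and the order
in which the "metadata default language" and the webpage language are tried
is an assumption, since the proposal leaves that open.

```python
def thesaurus_language(metadata_lang, default_metadata_lang, ui_lang):
    """Pick the language used to query thesauri while editing.

    Assumed order: language declared in the metadata record, then the
    node-wide "metadata default language", then the webpage language.
    Empty strings and None both count as "not set".
    """
    for candidate in (metadata_lang, default_metadata_lang, ui_lang):
        if candidate:
            return candidate
    raise ValueError("no language available")
```

With this order, a record declaring "fr" is always queried in French, and a
node configured with a default of "pt" can use Portuguese thesauri even
though the interface itself is only shown in English.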
Search is more complex. For now, if the user interface is in Spanish, it
will look for Spanish terms in each thesaurus. If the metadata are in
English, the results will not be relevant. Here, as a first implementation,
we could at least use a "metadata default language" parameter. Other, more
complex options for the keyword criterion could be:
- search for a term in the user interface language, then query using the
keyword and its translations. This requires multilingual thesauri.
- propose the list of keywords that Lucene found in metadata during
indexing. That way, Geonetwork proposes only keywords that are used in at
least one metadata record.
- investigate Lucene support for thesauri ...
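
The first of these options (query with the keyword and its translations)
could look roughly like the sketch below. It assumes a multilingual
thesaurus held as a dictionary from concept id to labels per language; the
function name and data layout are hypothetical and nothing here touches
Lucene itself.

```python
def expand_keyword(concepts, keyword, ui_lang):
    """Return the keyword plus its translations from a multilingual thesaurus.

    `concepts` maps a concept id to a {lang: label} dictionary. The keyword
    is matched (case-insensitively) against labels in `ui_lang`; every label
    of a matching concept is returned. If nothing matches, the keyword is
    returned unchanged so the search still runs.
    """
    needle = keyword.casefold()
    expanded = []
    for labels in concepts.values():
        label = labels.get(ui_lang)
        if label is not None and label.casefold() == needle:
            for translated in labels.values():
                if translated not in expanded:
                    expanded.append(translated)
    return expanded or [keyword]


concepts = {
    "c1": {"es": "agua", "en": "water", "fr": "eau"},
    "c2": {"es": "suelo", "en": "soil"},
}
terms = expand_keyword(concepts, "Agua", "es")
```

The resulting list of terms would then be combined with OR in the keyword
query, so that a Spanish-speaking user also finds metadata indexed with the
English or French keyword.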
The thesauri are stored in RDF/SKOS format in the /xml/codelist/thesauri
directory of Geonetwork. They are loaded on startup, and RDF manipulation is
done with Sesame, an open source Java library for RDF
(http://www.openrdf.org/).
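
For readers not familiar with SKOS, here is what a multilingual concept can
look like in RDF/XML, together with a short script that reads its labels.
This is only an illustration: the sample concept and its example.org URI
are made up, and the script uses Python's standard XML parser rather than
Sesame, so it does not show how Geonetwork actually loads the files.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML = "http://www.w3.org/XML/1998/namespace"

# Made-up sample; a real thesaurus file would hold many concepts.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/thesaurus/local#1">
    <skos:prefLabel xml:lang="en">river</skos:prefLabel>
    <skos:prefLabel xml:lang="fr">fleuve</skos:prefLabel>
    <skos:definition xml:lang="en">A natural flowing watercourse.</skos:definition>
  </skos:Concept>
</rdf:RDF>
"""


def pref_labels(rdf_xml):
    """Return {concept URI: {lang: preferred label}} for a SKOS RDF/XML string."""
    root = ET.fromstring(rdf_xml.strip())
    labels = {}
    for concept in root.findall(f"{{{SKOS}}}Concept"):
        uri = concept.get(f"{{{RDF}}}about")
        labels[uri] = {
            el.get(f"{{{XML}}}lang"): el.text
            for el in concept.findall(f"{{{SKOS}}}prefLabel")
        }
    return labels


labels = pref_labels(SAMPLE)
```

A flat thesaurus, as created in the administration interface, would simply
contain a list of such concepts with no skos:broader, skos:narrower or
skos:related elements.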
Thanks to Laurent Magnien, Arnaud Dupuis and Jean-Pascal Boignard for their
first ("french") implementation.
Any comments, ideas, bugs? If that sounds good to the list, I could put it
in the alpha 2 version.
Cheers. Francois