#612: XML files loaded by GeoNetwork may not be in UTF-8 charset and may fail to
load
---------------------+------------------------------------------------------
Reporter: simonp | Owner: geonetwork-devel@…
Type: defect | Status: new
Priority: major | Milestone: v2.7.0
Component: General | Version:
Keywords: |
---------------------+------------------------------------------------------
Some XML files contain characters that are not UTF-8 encoded, usually
WINDOWS-1252, because text is often pasted into metadata fields from
Microsoft apps. The user doesn't realize that the XML file is then no
longer valid UTF-8 and receives strange errors such as 'Error on line 118
of document file:/home/simon/bioreg-test/caab37020028.xml: Invalid byte 2
of 3-byte UTF-8 sequence' when they try to load the XML file into
GeoNetwork.
GeoNetwork needs to detect the character set of the file content and
convert it to UTF-8 before attempting to load the file. This can be done
with the kind of charset detector often used in browsers. The attached
patch adds the Mozilla juniversalcharsetdetector to the loadFile method in
the Jeeves utils/Xml.java class. A system property
(jeeves.filecharsetdetectandconvert) must be set to enable charset
detection and conversion; it is disabled by default, including when the
property is missing.
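The detect-and-convert step could look roughly like the sketch below. This
is not the attached patch: instead of the Mozilla detector it uses only the
JDK, checking whether the bytes are well-formed UTF-8 and otherwise assuming
WINDOWS-1252, which is a much cruder guess. The class name Utf8Normalizer,
the method names, and the reading of the property as a boolean ("true"
enables it) are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Jeeves or of the attached patch.
final class Utf8Normalizer {
    // Assumed fallback charset; a real detector would guess this from the content.
    static final Charset FALLBACK = Charset.forName("windows-1252");

    // Property name taken from the ticket; treating it as a boolean is an assumption.
    static final String ENABLE_PROPERTY = "jeeves.filecharsetdetectandconvert";

    /** Returns true if the bytes decode as well-formed UTF-8. */
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /**
     * Returns the input unchanged when conversion is disabled or the bytes
     * are already valid UTF-8; otherwise re-encodes them from the fallback
     * charset to UTF-8.
     */
    static byte[] toUtf8(byte[] data) {
        if (!Boolean.getBoolean(ENABLE_PROPERTY) || isValidUtf8(data)) {
            return data;
        }
        return new String(data, FALLBACK).getBytes(StandardCharsets.UTF_8);
    }
}
```

With the property set (for example -Djeeves.filecharsetdetectandconvert=true),
the file bytes would be passed through toUtf8 before being handed to the XML
parser. A file that declares a non-UTF-8 encoding in its XML declaration
would need that declaration handled separately; the sketch does not do so.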
--
Ticket URL: <http://trac.osgeo.org/geonetwork/ticket/612>
GeoNetwork opensource Developer website <http://sourceforge.net/projects/geonetwork/>
GeoNetwork opensource is a standards based, Free and Open Source catalog application to manage spatially referenced resources through the web. It provides powerful metadata editing and search functions as well as an embedded interactive web map viewer. This website contains information related to the development of the software.