Hi there,
When inserting/updating features via WFS, we deliver ISO-8859-1 encoded transaction XML to GeoServer
(we need ISO-8859-1 in order to have special Danish characters handled properly).
As it appears, GeoServer delivers UTF-8 encoded attribute string data to GeoTools, which in turn delivers that data as-is to JDBC.
Our PostGIS database is in Latin1 (ISO-8859-1).
When PostGIS receives the UTF-8 encoded string data, it tries to convert it to ISO-8859-1 (Latin1), and this conversion fails.
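
One way such a conversion failure can arise is sketched below in Java. It assumes that the ISO-8859-1 request bytes get decoded as UTF-8 somewhere along the way; the thread does not confirm where, or whether, that happens. The class and method names are made up for illustration and are not part of GeoServer or GeoTools.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not GeoServer code.
class CharsetMismatchDemo {

    // Encode the text as ISO-8859-1, then (wrongly) decode those bytes as UTF-8.
    // Malformed byte sequences are replaced with U+FFFD by the String constructor.
    static String misreadAsUtf8(String text) {
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1Bytes, StandardCharsets.UTF_8);
    }

    // True if every character of s has an ISO-8859-1 representation.
    static boolean fitsLatin1(String s) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        String original = "Sm\u00f8rrebr\u00f8d"; // contains the Danish letter o-slash
        String misread = misreadAsUtf8(original);

        // The original text fits into Latin1; the misread text does not,
        // because U+FFFD cannot be encoded as ISO-8859-1.
        System.out.println(fitsLatin1(original));
        System.out.println(fitsLatin1(misread));
    }
}
```

If this is what happens, the strings that reach the Latin1 database already contain replacement characters, and no later conversion can recover the original letters.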
We had a closer look at what was going on in GeoServer, and were very surprised to discover how the WfsDispatcher makes use of the somewhat “obscure” XmlCharSetDetector class. Does GeoServer really need to detect the charset itself? I should think those matters would be handled by built-in functionality in Xalan.
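
To illustrate the point about built-in handling: a standard XML parser reads the encoding from the XML declaration when it is handed raw bytes rather than an already decoded character stream. The sketch below uses the JDK's JAXP DOM parser, not Xalan and not GeoServer's own request-reading code; the class name and the sample element are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class, not GeoServer code.
class DeclaredEncodingDemo {

    // Parse raw bytes and return the text of the root element.
    // The parser picks the charset from the XML declaration itself.
    static String parseRootText(byte[] xmlBytes) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<name>S\u00f8ren</name>";
        // Serialize with the same charset that the declaration announces.
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(parseRootText(bytes).equals("S\u00f8ren"));
    }
}
```

This only works when the parser receives an InputStream; if the bytes are first wrapped in a Reader with a fixed charset, the declaration is no longer consulted for decoding.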
Anyway, GeoServer is not able to handle ISO-8859-1. It always defaults to UTF-8 when receiving ISO-8859-1.
We would very much like it to default its charset for transactions to the default charset defined in the GeoServer admin, and we need that functionality desperately.
Right now we are considering making a patched version of the WfsDispatcher class that bypasses the XmlCharSetDetector.
Are we on the right track, and when may we expect a version of GeoServer capable of handling ISO-8859-1?
Sincerely,
Aron Olsen.
Hmmm... The XmlCharSetDetector was actually put in place to improve the handling of different character sets. It was a contribution, so I don't know the details, but I haven't heard of anyone having problems with it before.
But it may not have been tested with Transactions. The original discussion is here: http://jira.codehaus.org/browse/GEOS-323
> Are we on the right track and when may we expect a version of Geoserver
> capable of handling ISO-8859-1 ?
Sounds like you're on the right track: the first step is to just disable it, the second is perhaps to use it and have it internally output the GeoServer admin default in transactions. If you provide a patch we'll be glad to include it in GeoServer. I know of no one else working on charset issues at the moment, so unless you submit the patch you probably can't expect it any time soon.
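
A rough sketch of that second step, in Java: use the encoding named in the XML declaration when there is one, and otherwise fall back to a configured default. This is not the existing XmlCharSetDetector logic; `TransactionCharsetResolver`, `resolve`, and the `configuredDefault` parameter (standing in for the admin-configured charset) are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not GeoServer code.
class TransactionCharsetResolver {

    // Matches the encoding pseudo-attribute of an XML declaration at the very
    // start of the request. Only ASCII-compatible encodings without a byte
    // order mark are covered by this simplified pattern.
    private static final Pattern ENCODING = Pattern.compile(
            "^<\\?xml[^>]*?\\bencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    // Return the declared charset if present and supported by the JVM,
    // otherwise the configured default.
    static Charset resolve(byte[] requestBytes, Charset configuredDefault) {
        int n = Math.min(requestBytes.length, 200);
        String head = new String(requestBytes, 0, n, StandardCharsets.US_ASCII);
        Matcher m = ENCODING.matcher(head);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        return configuredDefault;
    }

    public static void main(String[] args) {
        byte[] declared = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a/>"
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] undeclared = "<a/>".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(resolve(declared, StandardCharsets.UTF_8));
        System.out.println(resolve(undeclared, StandardCharsets.ISO_8859_1));
    }
}
```

A real patch would also have to deal with UTF-16 input and byte order marks, which this sketch deliberately leaves out.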
best regards,
Chris
--
Chris Holmes
The Open Planning Project
http://topp.openplans.org