The “charset” parameter is used with some media types to define the character set (section 3.4) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the “text” type are defined to have a default charset value of “ISO-8859-1” when received via HTTP. Data in character sets other than “ISO-8859-1” or its subsets MUST be labeled with an appropriate charset value.
It looks like Geoserver should label the character set whenever it sends anything other than ISO-8859-1. We have sometimes had trouble with Finnish characters and uDig even though we used ISO-8859-1, the default character set.
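To make the quoted rule concrete, here is a minimal sketch of how a client could work out the charset of a response from its Content-Type header. The function name `effective_charset` is made up for illustration, and the ISO-8859-1 fallback simply follows the paragraph quoted above; it is not taken from Geoserver, Geonetwork or uDig.

```python
from email.message import Message
from typing import Optional


def effective_charset(content_type: str) -> Optional[str]:
    """Return the charset a client would assume for a Content-Type value.

    Hypothetical helper: an explicit charset parameter wins; otherwise
    "text/*" types fall back to ISO-8859-1, as in the quoted rule.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")  # None when the parameter is absent
    if charset:
        return str(charset)
    if msg.get_content_maintype() == "text":
        return "ISO-8859-1"
    return None  # no default is defined here for non-text types


print(effective_charset("text/plain"))
print(effective_charset("text/plain; charset=UTF-8"))
```

Under this reading, an unlabeled text/plain response is treated as ISO-8859-1 no matter what bytes the server actually wrote.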
In both cases I use Tomcat. I have checked the response with wfetch, and the behavior of Geoserver 1.7.2 differs from that of Geoserver 2.3.0. The problem only occurs when the GetFeatureInfo format is text/plain; otherwise the charset is declared in the HTML or XML header. Unfortunately, Geonetwork seems to use text/plain. However, the root of the problem is the ambiguous output of Geoserver.
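As an illustration of the symptom, the following sketch shows what happens when a body is written as UTF-8 but the client falls back to ISO-8859-1 because no charset is declared. The sample string is just an example with Finnish characters, and the assumption that the server writes UTF-8 is mine; it is not an actual Geoserver response.

```python
# Example text with Finnish characters (illustrative only).
text = "Jyväskylä"

# Assumption: the server encodes the text/plain body as UTF-8
# but sends no charset parameter in the Content-Type header.
body = text.encode("utf-8")

# A client applying the ISO-8859-1 default misreads every "ä",
# because its two UTF-8 bytes are decoded as two separate characters.
wrong = body.decode("iso-8859-1")

# A client that knows (or guesses) UTF-8 recovers the original text.
right = body.decode("utf-8")

print(wrong)
print(right)
```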
If there is no way to make Geonetwork assume that an undefined charset is UTF-8, is there any workaround for the problem in Geoserver or Tomcat?
Not that I know of. As I said before, I have a hard time considering this a bug too; I don’t remember OGC demanding the charset to be added to the
The "charset" parameter is used with some media types to define the
character set (section 3.4) of the data. When no explicit charset parameter
is provided by the sender, media subtypes of the "text" type are defined to
have a default charset value of "ISO-8859-1" when received via HTTP. Data
in character sets other than "ISO-8859-1" or its subsets MUST be labeled
with an appropriate charset value.

It looks like Geoserver should label the character set whenever it sends
anything other than ISO-8859-1. We have sometimes had trouble with Finnish
characters and uDig even though we used ISO-8859-1, the default character set.
Interesting. Please open a bug report then.
To anyone reading the message: patches gladly accepted :-p
Cheers
Andrea