[Geoserver-devel] [jira] Created: (GEOS-418) Character set encoding for wfs: html content-type differe

Character set encoding for wfs: html content-type differe
---------------------------------------------------------

         Key: GEOS-418
         URL: http://jira.codehaus.org/browse/GEOS-418
     Project: GeoServer
        Type: Bug
  Components: WFS
    Versions: 1.3-rc3
 Environment: Linux Suse 9.3, Tomcat 5.5, geoserver 1.3RC2 with several Oracle patches, Oracle 10gR2, Java 1.5
    Reporter: Hans-Ulrich Otto
 Assigned to: dblasby

Could you report this as a bug? I'm a bit confused by our code, since
it appears to me that the mimeType should be right and the header
should be wrong, but I think it's that the GML delegate is not
reporting things right. See:
https://svn.codehaus.org/geoserver/trunk/geoserver/src/org/vfny/geoserver/wfs/responses/GML2FeatureResponseDelegate.java
There is getContentType and getContentEncoding, and the encoding one
just returns null. Though this looks to be because the ContentType one
passes in the GeoServer config object, while the ContentEncoding
doesn't. I'm not too sure about the logic behind this, like exactly
when they should be different. I do know that in many cases the
response will return differently according to what was requested (jpeg
vs. png in wms). Could try just having the AbstractServlet set the
header and the mimetype both with getContentType - but that may easily
mess up other responses... This stuff needs a fine tooth comb to make
sure it works right.
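
The idea of setting the header and the mimetype from one place can be pictured as a single helper that derives the header value from the MIME type and the configured charset, so the two cannot drift apart. This is only a minimal Python sketch, not GeoServer code: `content_type_header` and its parameters are hypothetical names, and the real change would live in the Java servlet layer.

```python
from typing import Optional


def content_type_header(mime_type: str, charset: Optional[str]) -> str:
    """Build a Content-Type header value (hypothetical helper, not GeoServer API).

    The charset parameter is appended only when one is configured, so
    binary responses such as image/png stay untouched.
    """
    if not charset:
        return mime_type
    return f"{mime_type}; charset={charset}"


print(content_type_header("text/xml", "UTF-8"))
print(content_type_header("image/png", None))
```

Keeping the charset optional reflects the concern that other responses (for example WMS images) should not pick up a text charset by accident.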

C

Quoting Hans-Ulrich Otto <hans.ulrich.otto@anonymised.com>:

Hi all,

we have done some further testing of the character set encoding and
found an interesting effect. As said, we are using an Oracle 10g
datastore, where our data are encoded in ISO Latin 1, 2, ... character
sets. We are still using geoserver 1.3RC2 (with some patches).

When setting the character set encoding in services.xml to UTF-8, the
wfs seems to deliver wrongly encoded documents; at least Mozilla, IE
and Netscape 7.1 do not display them properly:

<?xml version="1.0" encoding="UTF-8"?>
<wfs:FeatureCollection xmlns:wfs="http://www.opengis.net/wfs"
xmlns:ta="http://www.teleatlas.com"
xmlns:gml="http://www.opengis.net/gml"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.teleatlas.com
http://highway.teleatlas.com:8891/geoserver/wfs/DescribeFeatureType?typeName=ta:MN_PI
http://www.opengis.net/wfs
http://highway.teleatlas.com:8891/geoserver/data/capabilities/wfs/1.0.0/WFS-basic.xsd">

  <gml:boundedBy>
    <gml:Box srsName="http://www.opengis.net/gml/srs/epsg.xml#4326">
      <gml:coordinates xmlns:gml="http://www.opengis.net/gml"
decimal="." cs="," ts=" ">22.2264244,60.4312667
22.2540969,60.4493704</gml:coordinates>
    </gml:Box>
  </gml:boundedBy>
  <gml:featureMember>
    <ta:MN_PI fid="MN_PI.12460300028589">
      <ta:FEATTYP>7315</ta:FEATTYP>
      <ta:IMPORT>0</ta:IMPORT>
      <ta:ARNAMELC>SWE</ta:ARNAMELC>
      <ta:NAME>Pizzeria Napoli</ta:NAME>
      <ta:STNAME>TrädgÃ¥rdsgatan</ta:STNAME>
      <ta:STNAMELC>SWE</ta:STNAMELC>
      <ta:POSTCODE>20100</ta:POSTCODE>
      <ta:MUNID>12460058000416</ta:MUNID>
      <ta:MUNCD>853</ta:MUNCD>
      <ta:MUNNAME>Ã...bo</ta:MUNNAME>
      <ta:BUANAME>Ã...bo</ta:BUANAME>
      <ta:TELNUM>+(358)-(2)-2511650</ta:TELNUM>
      <ta:CLTRPELID>12460000575794</ta:CLTRPELID>
      <ta:RELPOS>72</ta:RELPOS>
      <ta:ADDRPID>12460000011421</ta:ADDRPID>
      <ta:GEOM>
        <gml:Point
srsName="http://www.opengis.net/gml/srs/epsg.xml#4326">
          <gml:coordinates xmlns:gml="http://www.opengis.net/gml"
decimal="." cs="," ts=" ">22.2525536,60.4493704</gml:coordinates>
        </gml:Point>

      </ta:GEOM>
    </ta:MN_PI>
  </gml:featureMember>
[...]

When setting the charset encoding in services.xml to ISO-8859-1, the
result is displayed properly in the mentioned browsers:

<?xml version="1.0" encoding="ISO-8859-1"?>
<wfs:FeatureCollection xmlns:wfs="http://www.opengis.net/wfs"
xmlns:ta="http://www.teleatlas.com"
xmlns:gml="http://www.opengis.net/gml"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.teleatlas.com
http://highway.teleatlas.com:8892/geoserver/wfs/DescribeFeatureType?typeName=ta:MN_PI
http://www.opengis.net/wfs
http://highway.teleatlas.com:8892/geoserver/data/capabilities/wfs/1.0.0/WFS-basic.xsd">

  <gml:boundedBy>
    <gml:Box srsName="http://www.opengis.net/gml/srs/epsg.xml#4326">
      <gml:coordinates xmlns:gml="http://www.opengis.net/gml"
decimal="." cs="," ts=" ">22.2264244,60.4312667
22.2540969,60.4493704</gml:coordinates>
    </gml:Box>
  </gml:boundedBy>
  <gml:featureMember>
    <ta:MN_PI fid="MN_PI.12460300028589">
      <ta:FEATTYP>7315</ta:FEATTYP>
      <ta:IMPORT>0</ta:IMPORT>
      <ta:ARNAMELC>SWE</ta:ARNAMELC>
      <ta:NAME>Pizzeria Napoli</ta:NAME>
      <ta:STNAME>Trädgårdsgatan</ta:STNAME>
      <ta:STNAMELC>SWE</ta:STNAMELC>
      <ta:POSTCODE>20100</ta:POSTCODE>
      <ta:MUNID>12460058000416</ta:MUNID>
      <ta:MUNCD>853</ta:MUNCD>
      <ta:MUNNAME>Åbo</ta:MUNNAME>
      <ta:BUANAME>Åbo</ta:BUANAME>
      <ta:TELNUM>+(358)-(2)-2511650</ta:TELNUM>
      <ta:CLTRPELID>12460000575794</ta:CLTRPELID>
      <ta:RELPOS>72</ta:RELPOS>
      <ta:ADDRPID>12460000011421</ta:ADDRPID>
      <ta:GEOM>
        <gml:Point
srsName="http://www.opengis.net/gml/srs/epsg.xml#4326">
          <gml:coordinates xmlns:gml="http://www.opengis.net/gml"
decimal="." cs="," ts=" ">22.2525536,60.4493704</gml:coordinates>
        </gml:Point>
      </ta:GEOM>
    </ta:MN_PI>
  </gml:featureMember>
[...]

Interestingly, the wms service always renders the names properly,
irrespective of the encoding specified in services.xml.
The question now is what goes wrong here. We have a clue: the xml
document header indicates the proper encoding. However, when looking
into the properties of the page in the browser, surprisingly the
content-type is: text/html;charset=ISO-8859-1.

It seems that the wfs service *does* encode the document properly, but
the http server always indicates the ISO-8859-1 encoding for the
document received.
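
The effect described here can be reproduced outside GeoServer. The following Python sketch encodes a name as UTF-8 and then decodes the bytes with a different charset, which is what a client does if it trusts a mismatched header. The helper name `misread_as` is hypothetical, and the use of cp1252 for the second case is an assumption about how a browser might treat the ISO-8859-1 label; it may explain the "Ã...bo" rendering in the sample above.

```python
def misread_as(text: str, wrong_charset: str) -> str:
    """Encode text as UTF-8, then decode the bytes with another charset.

    This imitates a client that receives UTF-8 bytes but is told, for
    example by the HTTP header, that they are in wrong_charset.
    """
    return text.encode("utf-8").decode(wrong_charset)


# Each non-ASCII letter becomes two characters when misread.
print(misread_as("Trädgårdsgatan", "iso-8859-1"))  # TrÃ¤dgÃ¥rdsgatan

# Assumption: some clients treat the ISO-8859-1 label as Windows-1252.
print(misread_as("Åbo", "cp1252"))

# Reversing the wrong step recovers the original text.
print("TrÃ¤dgÃ¥rdsgatan".encode("iso-8859-1").decode("utf-8"))
```

The last line shows that the bytes themselves are intact; only the label used to interpret them is wrong.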

Does anybody have an idea how this can be fixed? Perhaps the http
content-type should be set dynamically to the same value as the charset
encoding in services.xml?
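
Until the header is set dynamically, a response can at least be checked for this mismatch. The sketch below, in Python, compares the charset in a Content-Type header value with the encoding named in the XML declaration. All function names are hypothetical, the parsing is deliberately simple (no BOM handling, no full header grammar), and treating a missing value as "no conflict" is an assumption.

```python
import re
from typing import Optional


def declared_xml_encoding(body: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, if any."""
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', body)
    return match.group(1).decode("ascii") if match else None


def header_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, if any."""
    match = re.search(r'charset=["\']?([A-Za-z0-9._-]+)', content_type, re.IGNORECASE)
    return match.group(1) if match else None


def encodings_agree(content_type: str, body: bytes) -> bool:
    """Compare header charset and XML declaration, ignoring case.

    Assumption: if either side is missing there is nothing to compare,
    so the pair is reported as agreeing.
    """
    xml_enc = declared_xml_encoding(body)
    http_enc = header_charset(content_type)
    if xml_enc is None or http_enc is None:
        return True
    return xml_enc.lower() == http_enc.lower()


body = b'<?xml version="1.0" encoding="UTF-8"?><wfs:FeatureCollection/>'
print(encodings_agree("text/html;charset=ISO-8859-1", body))
print(encodings_agree("text/xml; charset=UTF-8", body))
```

The first call uses the header value observed in the browser, paired with the UTF-8 declaration from the first sample response.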

Thanks a lot in advance,

Uli.

--
This message is automatically generated by JIRA.