Hi,
The characters being flagged as not UTF-8 compliant are in some of the
language names, the copyright chars, and some of the space characters.
Attached is the XML file that produces these errors.
2008-10-23 14:45:35,803 WARN [jeeves] - Non UTF-8 char found at Line: 191 col: 15 char (decimal): 241
2008-10-23 14:45:35,803 WARN [jeeves] - Non UTF-8 char found at Line: 204 col: 15 char (decimal): 231
2008-10-23 14:45:35,803 WARN [jeeves] - Non UTF-8 char found at Line: 238 col: 24 char (decimal): 160
2008-10-23 14:45:35,803 WARN [jeeves] - Non UTF-8 char found at Line: 238 col: 28 char (decimal): 160
2008-10-23 14:45:35,803 WARN [jeeves] - Non UTF-8 char found at Line: 254 col: 91 char (decimal): 174
2008-10-23 14:45:35,803 WARN [jeeves] - Non UTF-8 char found at Line: 256 col: 70 char (decimal): 174
2008-10-23 14:45:35,818 WARN [jeeves] - Non UTF-8 char found at Line: 5935 col: 64 char (decimal): 146
2008-10-23 14:45:35,818 WARN [jeeves] - Non UTF-8 char found at Line: 6436 col: 85 char (decimal): 150
Once these chars are removed the PDF is produced, and it doesn't seem to
use any of the affected elements. I fixed them by substituting '?' for
each non-UTF-8 char, which isn't the most elegant fix. We could follow
through the style sheet to see which elements the FOP wants, and perhaps
strip those out into a new element before transformation, but even
those could contain non-UTF-8 chars. So at some point either the
non-UTF-8 chars get stripped/swapped, or a different (e.g. double byte)
encoding is used. Does Saxon let us use another encoding? I'll track
down some doco on Saxon and have a read.
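
A thought on the '?' substitution: the decimal values in the log (241,
231, 160, 174, 146, 150) look like single-byte Latin-1/Windows-1252
characters (n-tilde, c-cedilla, non-breaking space, registered sign,
right single quote, en dash). That would also fit the "4-byte UTF-8
sequence" error, since byte 241 (0xF1) is a lead byte for a 4-byte
sequence in UTF-8. If that guess is right, transcoding would keep the
characters instead of losing them. A minimal sketch, assuming the raw
bytes really are windows-1252; the class and method names are
hypothetical and not part of Jeeves:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Jeeves or GeoNetwork.
final class Transcode {

    // Reinterprets the input as windows-1252 and re-encodes it as UTF-8.
    // Assumption: the input is NOT already UTF-8. Running this on valid
    // UTF-8 would double-encode every non-ASCII character.
    static byte[] cp1252ToUtf8(byte[] raw) {
        String text = new String(raw, Charset.forName("windows-1252"));
        return text.getBytes(StandardCharsets.UTF_8);
    }
}
```

This only helps if the bad data is consistently in one legacy encoding;
a mix of encodings would still need a per-character decision.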
Cheers,
Kevin
-----Original Message-----
From: Simon Pigot [mailto:Simon.Pigot@anonymised.com]
Sent: Thursday, 23 October 2008 10:04 AM
To: Kevin Gunn
Cc: Stephen.Davies@anonymised.com; geonetwork-devel@lists.sourceforge.net
Subject: Re: [GeoNetwork-devel] PDFPrint fails with
exception[SEC=UNCLASSIFIED]
Kevin,
Before doing that, could you check and see whether there is anything
more specific in jetty/logs/output.log? Occasionally saxon puts more
info about the problem including line/column numbers in there.
Cheers and thanks,
Simon
Kevin Gunn wrote:
Hi Steve,
Thx for the response. I don't feed it any doc, it takes the entire
Jeeves request and puts it through the XSLT FOP transformation. I'll
try to track down exactly where in the XML it's having issues. It
could be DB related as the md sub-xml sections come from the DB. The
records I'm testing with are straight copies of the default
ISO19139.mcp template with a new title. We're using oracle as the DB
and the default driver that comes with the latest GN libs.
I'll whack a little method into Xml.java to check the chars in the XML
being transformed; perhaps something like this could be added to the
current impl so that it fails nicely.
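
A rough sketch of the kind of check method described above, assuming
the XML is already in hand as a Java String. The class and method names
(CharCheck, findNonAscii) are made up for illustration and are not in
Xml.java. Note that it flags anything above 127, so strictly speaking
it reports non-ASCII chars rather than invalid UTF-8:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of jeeves.utils.Xml.
final class CharCheck {

    // Returns one warning per non-ASCII char, with 1-based line and column.
    static List<String> findNonAscii(String xml) {
        List<String> warnings = new ArrayList<>();
        int line = 1;
        int col = 0;
        for (int i = 0; i < xml.length(); i++) {
            char c = xml.charAt(i);
            if (c == '\n') {
                // New line: bump the line counter and restart the column.
                line++;
                col = 0;
                continue;
            }
            col++;
            if (c > 127) {
                warnings.add("Non-ASCII char found at Line: " + line
                        + " col: " + col + " char (decimal): " + (int) c);
            }
        }
        return warnings;
    }
}
```

The caller could log the warnings and carry on, or throw once the list
is non-empty so the service fails with a readable message.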
Are you guys using this latest source as your Geonetwork production
version?
Cheers,
Kevin
------------------------------------------------------------------------
*From:* Stephen.Davies@anonymised.com [mailto:Stephen.Davies@anonymised.com]
*Sent:* Wednesday, 22 October 2008 15:44 PM
*To:* geonetwork-devel@lists.sourceforge.net
*Subject:* Re: [GeoNetwork-devel] PDFPrint fails with
exception[SEC=UNCLASSIFIED]
Hi Kevin,
I've encountered UTF encoding issues in the past. Generally my first
port of call is to check that the data is actually valid UTF-8. I've
attached a utility that simply dumps details of non-ascii characters
to the console. This will give you a starting point. I generally use
textpad to view the file in hex once the potential problem characters
have been found. Not all programs behave correctly when 'special'
characters exist; Windows Notepad, for example, does not (nor does it
handle XML encoding declarations).
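
For the "is it actually valid UTF-8" check, a minimal sketch using the
standard java.nio.charset decoder in strict mode. This is not the
attached utility, just an illustration; the class and method names are
hypothetical. It returns the byte offset of the first malformed
sequence, or -1 if the whole input decodes cleanly:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class Utf8Validator {

    static int firstInvalidOffset(byte[] data) {
        // REPORT makes the decoder stop at bad input instead of
        // silently replacing it.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        CharBuffer out = CharBuffer.allocate(1024);
        while (true) {
            CoderResult result = decoder.decode(in, out, true);
            if (result.isError()) {
                // The input position is left at the start of the bad bytes.
                return in.position();
            }
            if (result.isOverflow()) {
                // Output buffer full: discard decoded chars and keep going.
                out.clear();
                continue;
            }
            return -1; // underflow: all input consumed without error
        }
    }
}
```

The offset can then be looked up in a hex view of the file to see which
bytes are at fault.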
You may also like to consider the database settings and support for
UTF-8. I think MySQL requires additional settings in the JDBC connect
string.
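
As an illustration of the MySQL case only (not relevant to an Oracle
setup): MySQL Connector/J accepts the useUnicode and characterEncoding
properties on the connect string. A small sketch; the host and database
names passed in are placeholders, and the helper itself is hypothetical:

```java
// Hypothetical helper; host and database values are placeholders.
final class MySqlUrl {

    // Builds a Connector/J URL that asks the driver to talk UTF-8.
    static String utf8Url(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }
}
```

If the URL lives in an XML config file, the '&' needs to be written as
'&amp;' there.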
Cheers,
Steve
-----Original Message-----
*From:* Kevin Gunn [mailto:k.gunn@anonymised.com]
*Sent:* Wednesday, 22 October 2008 4:20
*To:* geonetwork-devel@lists.sourceforge.net
*Subject:* [GeoNetwork-devel] PDFPrint fails with exception
Hi,
The latest BlueNetMEST's PDFPrint fails with an exception for me. Is
there a known solution for this?
C:\_work\geonetwork\BlueNet_MEST_SVN\web\geonetwork\/xsl/portal-present-fop.xsl
2008-10-22 13:53:44,755 ERROR [jeeves.service] - -> (C) message : org.xml.sax.SAXParseException: Invalid byte 2 of 4-byte UTF-8 sequence.
2008-10-22 13:53:44,755 ERROR [jeeves.service] - -> (C) exception : XPathException
2008-10-22 13:53:44,755 DEBUG [jeeves.service] - Raised exception while executing service
<error id="error">
  <message>org.xml.sax.SAXParseException: Invalid byte 2 of 4-byte UTF-8 sequence.</message>
  <class>XPathException</class>
  <stack>
    <at class="net.sf.saxon.event.Sender" file="Sender.java" line="362" method="sendSAXSource" />
    <at class="net.sf.saxon.event.Sender" file="Sender.java" line="184" method="send" />
    <at class="net.sf.saxon.event.Sender" file="Sender.java" line="49" method="send" />
    <at class="net.sf.saxon.Controller" file="Controller.java" line="1550" method="transform" />
    <at class="jeeves.utils.Xml" file="Xml.java" line="265" method="transformFOP" />
    <at class="jeeves.server.dispatchers.ServiceManager" file="ServiceManager.java" line="580" method="dispatchOutput" />
    <at class="jeeves.server.dispatchers.ServiceManager" file="ServiceManager.java" line="383" method="dispatch" />
    <at class="jeeves.server.JeevesEngine" file="JeevesEngine.java" line="621" method="dispatch" />
    <at class="jeeves.server.sources.http.JeevesServlet" file="JeevesServlet.java" line="163" method="execute" />
    <at class="jeeves.server.sources.http.JeevesServlet" file="JeevesServlet.java" line="88" method="doGet" />
  </stack>
  <request>
    <language>en</language>
    <service>pdf.search</service>
  </request>
  <response>
    <summary count="1" type="local">
      <keywords />
      <categories />
      <sources>
        <source count="1" name="87aa46b0-a57f-4f33-8087-effe4c4dfcc5" />
      </sources>
    </summary>
  </response>
</error>
Thx,
Kevin Gunn
Software Engineer
Australian Institute of Marine Science
Ph: (07) 47534305
Fax: (07) 4772 5852
E-mail: k.gunn@anonymised.com
------------------------------------------------------------------------
The information contained in this communication is for the use of the
individual or entity to whom it is addressed, and may contain
information which is the subject of legal privilege and/or copyright.
If you have received this communication in error, please notify the
sender by return E-Mail and delete the transmission, together with any
attachments, from your system. Thank you.
------------------------------------------------------------------------
(attachments)
PDFPrint_XML.xml (598 KB)