[GeoNetwork-devel] More (better?) XSD validation errors and XSLT2.0 support [SEC=UNCLASSIFIED]

Hi Simon and all,

I have used Saxon9 to transform my XML documents very successfully. Michael
Kay is the editor of the XSLT 2.0 specification. He also writes Saxon, so I
would expect him to make sure that his software meets the XSLT 2.0
specification as closely as he can.

I have used other tools like XMLSpy and found that they do not fully
implement the specifications of some XML standards. Hence, wherever
possible, I have gone to the software written by the specification writer,
i.e. Michael Kay, James Clark.

I strongly support the use of Saxon as the validation tool. Also, the more
information a user has for finding where the errors occur in their
metadata, the better.

My two cents worth.

Thanks.

John

-----Original Message-----
From: geonetwork-devel-bounces@lists.sourceforge.net
[mailto:geonetwork-devel-bounces@lists.sourceforge.net] On Behalf Of Simon Pigot
Sent: Sunday, 24 February 2008 3:54 AM
To: geonetwork-devel@lists.sourceforge.net
Cc: anzlicmet-l@anonymised.com
Subject: [GeoNetwork-devel] More (better?) XSD validation errors and XSLT2.0 support

Hi,

I've been doing a bit of investigation into JAXP in GN with a view to
getting XSLT 2.0 support and improving the XSD validation error
messages. It turns out that the current approach, where Jeeves throws an
exception on the first validation error returned from the JAXP validator
class, loses additional messages that can provide really useful context
for some errors. For example, leaving a gco:Distance element blank causes
a fairly cryptic error to be returned by the current validator in
GN/Jeeves:

org.xml.sax.SAXParseException: cvc-datatype-valid.1.2.1: '' is not a
valid value for 'double'.

Not terribly helpful because it doesn't tell you which element has the
invalid value. A fairly simple mod is to insert an error handler, collect
all the errors and then throw an exception afterwards. With this in place
the validation messages returned from the above become:

ERROR(1) org.xml.sax.SAXParseException: cvc-datatype-valid.1.2.1: '' is
not a valid value for 'double'.
ERROR(2) org.xml.sax.SAXParseException: cvc-complex-type.2.2: Element
'gco:Distance' must have no element [children], and the value must be
valid.
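
A minimal sketch of the error-handler approach described above, using only
the standard JAXP validation API. The class and method names are
hypothetical and are not taken from GeoNetwork or Jeeves. It takes the
schema and document as strings for brevity, and it formats each entry from
getMessage(), so the exact text may differ from the listing above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of GeoNetwork or Jeeves.
class CollectingValidation {

    /** Records every recoverable error instead of stopping at the first one. */
    static class Collector implements ErrorHandler {
        final List<SAXParseException> errors = new ArrayList<>();

        @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
        @Override public void error(SAXParseException e) { errors.add(e); }
        @Override public void fatalError(SAXParseException e) throws SAXParseException {
            errors.add(e);
            throw e; // a fatal error (e.g. malformed XML) ends the run
        }
    }

    /** Validates xml against xsd and returns one numbered message per error. */
    static List<String> validate(String xsd, String xml) {
        Collector collector = new Collector();
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(collector);
            try {
                validator.validate(new StreamSource(new StringReader(xml)));
            } catch (SAXParseException fatal) {
                // Already recorded by the collector; report it with the rest.
            }
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("Validation could not be run", e);
        }
        List<String> messages = new ArrayList<>();
        int n = 0;
        for (SAXParseException e : collector.errors) {
            messages.add("ERROR(" + (++n) + ") " + e.getMessage());
        }
        return messages;
    }

    public static void main(String[] args) {
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
                   + "<xs:element name='distance' type='xs:double'/></xs:schema>";
        // An empty value for a double-typed element, as in the gco:Distance case.
        for (String message : validate(xsd, "<distance></distance>")) {
            System.out.println(message);
        }
    }
}
```

The caller can then decide whether to throw a single exception carrying the
whole list, which is the "throw an exception afterwards" step.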

Ok - still a bit cryptic - but at least we can guess that the element
that doesn't have a valid value is gco:Distance, and that makes it a lot
easier to find the problem area in the XML view. Possibly useful too is
that we can also get more validation errors if there are other problems
in the document, like this one which has a blank gco:Distance and also
doesn't have a topicCategory code:

ERROR(1) org.xml.sax.SAXParseException: cvc-datatype-valid.1.2.1: '' is
not a valid value for 'double'.
ERROR(2) org.xml.sax.SAXParseException: cvc-complex-type.2.2: Element
'gco:Distance' must have no element [children], and the value must be
valid.
ERROR(3) org.xml.sax.SAXParseException: cvc-enumeration-valid: Value ''
is not facet-valid with respect to enumeration '[farming, biota,
boundaries, climatologyMeteorologyAtmosphere, economy, elevation,
environment, geoscientificInformation, health,
imageryBaseMapsEarthCover, intelligenceMilitary, inlandWaters, location,
oceans, planningCadastre, society, structure, transportation,
utilitiesCommunication]'. It must be a value from the enumeration.
ERROR(4) org.xml.sax.SAXParseException: cvc-type.3.1.3: The value '' of
element 'gmd:MD_TopicCategoryCode' is not valid.

And to make things just a little bit nicer we could let the user select
the number of validation messages they want to see as part of their
session. Might speed up the process of fixing these problems a bit...

It's still a pretty cryptic interface but at least there is a bit more
info to help with context, and maybe it begins to open the door for
connecting up these error messages with the XML/advanced editor views.
(It's a shame that these errors return a line number and column number of
-1. More investigation is required, but it might be related to the
in-memory source we're using.)
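
One way to test the in-memory-source guess is to validate the same document
twice, once from a character stream and once from an in-memory tree, and
compare the line numbers the error handler receives. The sketch below uses
a W3C DOMSource as a stand-in for the in-memory source. That is an
assumption, since the source type Jeeves actually passes to the validator
is not shown here. The class name, schema and document are made up for the
example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical experiment class, not part of GeoNetwork or Jeeves.
class LineNumberCheck {
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='distance' type='xs:double'/></xs:schema>";
    // The invalid (empty) element sits on the second line of the document.
    static final String XML = "<!-- test document -->\n<distance></distance>";

    /** Same document read straight from its text form. */
    static Source asStream() {
        return new StreamSource(new StringReader(XML));
    }

    /** Same document parsed into an in-memory DOM tree first. */
    static Source asDom() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // schema validation needs namespaces
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(XML)));
            return new DOMSource(doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the line number of the first validation error, or null if none. */
    static Integer firstErrorLine(Source document) {
        final List<Integer> lines = new ArrayList<>();
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                                         .newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) { lines.add(e.getLineNumber()); }
                @Override public void fatalError(SAXParseException e) { lines.add(e.getLineNumber()); }
            });
            validator.validate(document);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return lines.isEmpty() ? null : lines.get(0);
    }

    public static void main(String[] args) {
        System.out.println("stream source line: " + firstErrorLine(asStream()));
        System.out.println("DOM source line:    " + firstErrorLine(asDom()));
    }
}
```

If the stream run reports a real line and the tree run does not, the
location information is being lost before validation starts, and the fix
would have to come from validating the serialized text or carrying
positions through some other way.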

XSLT 2.0 support - we have a few quite large XSLT converters being
written (e.g. for GCMD DIF to ISO 19115/19139) that we'd like to use in
GN. The writers have chosen to use XSLT 2.0 and it does kind of work
within GN now, so long as the writers use xalan constructs. However, it
seems nice to have XSLT 2.0 support (I think?) and most of the converter
writers seem to be using SAXON, so I've done a trial by switching over
our GN 2.2 working copy to use SAXON in JAXP. GN and Intermap seem to
work ok once you convert the xalan functions in a few of the xslts
(easy) - but is this something people want? And I'm sure someone has
probably tried this before, so maybe there are reasons why we stick with
xalan and XSLT 1.0?
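
For anyone wanting to try the same switch, here is a rough sketch of
selecting the XSLT engine through JAXP. It assumes the Saxon jar is on the
classpath and uses Saxon's JAXP factory class,
net.sf.saxon.TransformerFactoryImpl. The helper class and its methods are
hypothetical and are not how GN or Jeeves is actually wired up. If Saxon is
missing, the sketch falls back to whatever factory JAXP finds by default.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of GeoNetwork or Jeeves.
class XsltEngineSelector {
    static final String SAXON_FACTORY = "net.sf.saxon.TransformerFactoryImpl";

    /** Returns a Saxon factory when requested and available, else the JAXP default. */
    static TransformerFactory newFactory(boolean preferSaxon) {
        if (preferSaxon) {
            try {
                // A null class loader means the context class loader is used.
                return TransformerFactory.newInstance(SAXON_FACTORY, null);
            } catch (TransformerFactoryConfigurationError saxonNotAvailable) {
                // Saxon is not on the classpath; fall through to the default.
            }
        }
        return TransformerFactory.newInstance();
    }

    /** Applies a stylesheet to a document, both given as strings. */
    static String transform(TransformerFactory factory, String xslt, String xml) {
        try {
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        TransformerFactory factory = newFactory(true);
        // Shows which engine was actually picked up.
        System.out.println(factory.getClass().getName());
    }
}
```

The same selection can be made for a whole JVM, without code changes, by
setting the javax.xml.transform.TransformerFactory system property to the
factory class name.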

Cheers,
Simon
