[Geoserver-devel] An old problem with gml 2.1.2 schema validation

Hey all,

I think I've written about this before, but I'll recap and see if there's any objection to the fix.

The GML 2.1.2 schemas (as found at http://schemas.opengis.net/gml/2.1.2/) are not valid. They are *almost* valid, and of the many validation engines out there (MSXML, the W3C validator, Xerces-J, etc.) only a few catch the error. It hinges on an obscure part of the XML Schema spec, but I asked about it a long time ago on the w3c-schema-dev list, and they concluded that the GML 2.1.2 schemas are indeed invalid and that Xerces-J is right to report the error.

To solve this, I tweaked the geometry.xsd file just a tiny bit to make it valid. I don't think the tweak changes the content model at all; it just makes the schema valid.

I think fixing the schema is a good thing, and will help the many people attempting to validate WFS responses with strict validators (for example, Xerces-J).

Shall I update the geometry.xsd file in geoserver-trunk to fix this issue? This will allow me to actually validate WFS responses using Xerces-J-based parsers (which covers much of the open-source Java world).

I can make a JIRA issue and put comments on it if it's a big enough deal.

--saul

Saul Farber wrote:

> I think fixing the schema is a good thing, and will help the many people attempting to validate WFS responses with strict validators (for example, Xerces-J).

I agree that not making users of Xerces-J go crazy is a good thing... yet this is not the kind of issue we hear about every other week, and I'm a bit wary of changing otherwise standard OGC files (though we'll have to do the same for the versioned datastore anyway, adding a few more elements to it).

Can we have a look at the patch before it hits trunk?

Cheers
Andrea

Andrea,

Sorry this one took me so long to get to.

I'm attaching the patch to this email and to the bug, but here it is in human-instructions format:

On lines 273, 290 and 307 of trunk/web/src/main/webapp/schemas/gml/2.1.2/geometry.xsd (and ../../../sld/geometry.xsd) there are lines like this:

<sequence>
<element ref="gml:SOMETHINGMember" maxOccurs="unbounded"/>

Change these lines to:
<sequence maxOccurs="unbounded">
<element ref="gml:SOMETHINGMember" maxOccurs="1"/>

For whatever esoteric reason, the former is invalid, and the latter is valid. Or at least it tricks Xerces-j.
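For anyone who wants to apply the same change by script rather than by hand, here is a minimal sketch of the edit described above. It is not the attached patch: the function name `apply_fix` and the `member_refs` parameter are made up for illustration, and the caller has to supply the real element refs, since the message only gives them as "gml:SOMETHINGMember". It assumes the `<sequence>` and `<element>` tags are written exactly as shown above, separated only by whitespace.

```python
import re


def apply_fix(xsd_text, member_refs):
    """Move maxOccurs="unbounded" from the element up to its enclosing sequence.

    xsd_text:    contents of geometry.xsd as a string.
    member_refs: element refs to rewrite (hypothetical parameter; the real
                 names are whatever stands in for "gml:SOMETHINGMember").
    """
    for ref in member_refs:
        # A bare <sequence> followed directly by the unbounded element ref.
        pattern = re.compile(
            r'<sequence>(\s*)<element ref="'
            + re.escape(ref)
            + r'" maxOccurs="unbounded"/>'
        )

        def repl(match, ref=ref):
            # Keep the original whitespace between the two tags.
            return (
                '<sequence maxOccurs="unbounded">'
                + match.group(1)
                + '<element ref="' + ref + '" maxOccurs="1"/>'
            )

        xsd_text = pattern.sub(repl, xsd_text)
    return xsd_text


if __name__ == "__main__":
    # Placeholder ref, not a real GML element name.
    sample = (
        '<sequence>\n'
        '  <element ref="gml:fooMember" maxOccurs="unbounded"/>\n'
        '</sequence>'
    )
    print(apply_fix(sample, ["gml:fooMember"]))
```

Only the refs passed in are touched, so other sequences with the same shape are left alone. Both forms allow one or more occurrences of the member element, which is why the content model should come out unchanged.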

--saul

(attachments)

geometry.xsd.patch (1.47 KB)