[Geoserver-devel] XMLSAXHandler questions

Hi,

I’m looking at an issue with parsing of some poorly formed WFS GetFeature XML responses. We’re finding a “gml:metaDataProperty” element in a WFS-1.0.0/GML-2.1.2 response. “gml:metaDataProperty” was not introduced until GML 3 so the response will not validate. This is a GeoServer generated response, but I am more concerned about the parsing for now…

An exception is being generated in org.geotools.xml.XMLSAXHandler.startElement(…) on the call to obtain the element handler for “gml:metaDataProperty”; the log entry is as follows:

org.geotools.XMLSAXHandler processException
SEVERE: Could not find element handler for http://www.opengis.net/gml : metaDataProperty as a child of FeatureCollectionType.
org.geotools.xml.handlers.ComplexElementHandler.getHandler(ComplexElementHandler.java:572)
org.geotools.xml.XMLSAXHandler.startElement(XMLSAXHandler.java:411)
org.apache.xerces…

The code that’s generating the exception is as follows:

    logger.finest("This Node = " + localName + " :: " + namespaceURI);
    URI uri = new URI(namespaceURI);
    XMLElementHandler eh = parent.getHandler(uri, localName, hints);

    if (eh == null) {
        eh = new IgnoreHandler();
    }

    logger.finest("This Node = " + eh.getClass().getName());

The issue here is that, instead of a null element handler being returned, an exception is thrown and document parsing stops.

Would it make sense to wrap the getHandler(…) call in a try/catch, log the error state, and then let the logic fall through to the use of the IgnoreHandler for parsing that element? Thoughts?

I don’t know the history of the code, but this appears to be the intent of the original design.
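
To make the proposal concrete, here is a rough sketch of the fallback. It does not use the real GeoTools classes: ElementHandler, ParentHandler and the local IgnoreHandler are hypothetical stand-ins, and the sketch assumes the lookup reports a missing handler by throwing org.xml.sax.SAXException.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.xml.sax.SAXException;

public class HandlerFallbackSketch {
    private static final Logger LOGGER = Logger.getLogger("HandlerFallbackSketch");

    /** Stand-in for the real element handler type (hypothetical). */
    interface ElementHandler {
        String name();
    }

    /** Stand-in for IgnoreHandler: marks an element to be skipped. */
    static final class IgnoreHandler implements ElementHandler {
        @Override
        public String name() {
            return "ignore";
        }
    }

    /** Stand-in parent whose lookup throws when no child handler is known. */
    static final class ParentHandler {
        private final Map<String, ElementHandler> children = new HashMap<>();

        void register(URI namespace, String localName, ElementHandler handler) {
            children.put(namespace + "#" + localName, handler);
        }

        ElementHandler getHandler(URI namespace, String localName) throws SAXException {
            ElementHandler handler = children.get(namespace + "#" + localName);
            if (handler == null) {
                throw new SAXException("Could not find element handler for "
                        + namespace + " : " + localName);
            }
            return handler;
        }
    }

    /** Looks up a child handler; on failure, logs and falls back to IgnoreHandler. */
    static ElementHandler lookupOrIgnore(ParentHandler parent, URI namespace, String localName) {
        try {
            ElementHandler handler = parent.getHandler(namespace, localName);
            return handler != null ? handler : new IgnoreHandler();
        } catch (SAXException e) {
            // Log the error state, then let parsing continue with the ignore handler.
            LOGGER.log(Level.WARNING, "Ignoring unexpected element: " + e.getMessage());
            return new IgnoreHandler();
        }
    }

    public static void main(String[] args) {
        URI gml = URI.create("http://www.opengis.net/gml");
        ParentHandler parent = new ParentHandler();
        parent.register(gml, "featureMember", () -> "featureMember");

        System.out.println(lookupOrIgnore(parent, gml, "featureMember").name());
        System.out.println(lookupOrIgnore(parent, gml, "metaDataProperty").name());
    }
}
```

The fallback here is unconditional, which is the simplest form of the idea; whether it should be on by default is a separate question.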

Thanks,

Tom Kunicki
Software Engineer | Boundless
tkunicki@anonymised.com
917-460-7212
@boundless

Hey Tom,


On Fri, Dec 20, 2013 at 3:37 PM, Tom Kunicki <tkunicki@anonymised.com> wrote:

Hi,

I’m looking at an issue with parsing of some poorly formed WFS GetFeature XML responses. We’re finding a “gml:metaDataProperty” element in a WFS-1.0.0/GML-2.1.2 response. “gml:metaDataProperty” was not introduced until GML 3 so the response will not validate. This is a GeoServer generated response, but I am more concerned about the parsing for now…

This is odd… if it’s there then that is definitely a bug, one worth

Would it make sense to wrap the getHandler(…) call in a try/catch, log the error state, and then let the logic fall through to the use of the IgnoreHandler for parsing that element? Thoughts?

I don’t know the history of the code but this appears to be the intent of the original design.

This one is tough. I am not that familiar with this code, so I might be off here, but from what I remember the intent of the ignore handler is to handle elements that are valid with respect to the schema but that we don’t actually care about. For example, there are things in GML that we don’t really represent in the GeoTools feature model, so rather than error out we just ignore them. This is different, IMO, from being lax about an invalid GML document.

In the next generation of the GeoTools GML parser (which was inspired by this code) we take the stance of being explicitly lax about stuff like this, but offer a flag to force the parser to be strict. Maybe something like that is needed here.


Justin Deoliveira
Vice President, Engineering | Boundless
jdeolive@anonymised.com
@j_deolive

On Mon, Jan 6, 2014 at 12:18 PM, Justin Deoliveira <
jdeolive@anonymised.com> wrote:

Hey Tom,

On Fri, Dec 20, 2013 at 3:37 PM, Tom Kunicki <tkunicki@anonymised.com> wrote:

Hi,

I’m looking at an issue with parsing of some poorly formed WFS GetFeature
XML responses. We’re finding a “gml:metaDataProperty” element in a
WFS-1.0.0/GML-2.1.2 response. “gml:metaDataProperty” was not introduced
until GML 3 so the response will not validate. This is a GeoServer
generated response, but I am more concerned about the parsing for now...

This is odd... if it's there then that is definitely a bug, one worth

After lots of digging it looks like there's some app-schema configuration
that's allowing GML 3 elements to sneak into the GML 2 response. But this
situation where a service produces a response that won't validate isn't as
rare as one would hope. This is a 2.1.x server instance that can't be
updated.

*Would it make sense to wrap the getHandler(…) call in a try/catch, log
the error state, and then let the logic fall through to the use of the
IgnoreHandler for parsing that element? Thoughts?*

I don’t know the history of the code but this appears to be the intent of
the original design.

This one is tough. I am not that familiar with this code, so I might be off
here, but from what I remember the intent of the ignore handler is to handle
elements that are valid with respect to the schema but that we don't
actually care about. For example, there are things in GML that we don't
really represent in the GeoTools feature model, so rather than error out we
just ignore them. This is different, IMO, from being lax about an invalid
GML document.

In the next generation of the GeoTools GML parser (which was inspired by
this code) we take the stance of being explicitly lax about stuff like
this, but offer a flag to force the parser to be strict. Maybe something
like that is needed here.

A flag is a great idea. In this case maybe it should be used to enable lax
parsing, since strict is the current behavior. I'll look at prior
geotools/geoserver flag usage and submit a PR.
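
A rough sketch of how an opt-in lax flag could gate the fallback, leaving strict as the default. The hint key LAX_PARSING and the HandlerLookup interface are hypothetical names made up for illustration; they are not existing GeoTools constants or types, and handlers are reduced to plain strings to keep the sketch short.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.SAXException;

public class LaxParsingFlagSketch {
    /** Hypothetical hint key; not an existing GeoTools constant. */
    static final String LAX_PARSING = "example.laxParsing";

    /** Stand-in for the handler lookup, which may throw for unknown elements. */
    interface HandlerLookup {
        String find(String localName) throws SAXException;
    }

    /** Strict unless the hint is explicitly set to Boolean.TRUE. */
    static boolean isLax(Map<String, Object> hints) {
        return hints != null && Boolean.TRUE.equals(hints.get(LAX_PARSING));
    }

    /** Resolves a handler name, ignoring unknown elements only in lax mode. */
    static String resolve(HandlerLookup lookup, String localName, Map<String, Object> hints)
            throws SAXException {
        try {
            String handler = lookup.find(localName);
            return handler != null ? handler : "ignore";
        } catch (SAXException e) {
            if (isLax(hints)) {
                return "ignore"; // lax: skip the unknown element
            }
            throw e; // strict (default): keep the current behaviour
        }
    }

    public static void main(String[] args) {
        HandlerLookup lookup = name -> {
            if ("featureMember".equals(name)) {
                return "featureMember";
            }
            throw new SAXException("Could not find element handler for " + name);
        };

        Map<String, Object> lax = new HashMap<>();
        lax.put(LAX_PARSING, Boolean.TRUE);

        try {
            System.out.println(resolve(lookup, "metaDataProperty", lax));
            resolve(lookup, "metaDataProperty", Collections.emptyMap());
        } catch (SAXException e) {
            System.out.println("strict: " + e.getMessage());
        }
    }
}
```

Treating a missing or non-TRUE hint as strict means existing callers see no change unless they opt in.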

Thanks,

--
Tom Kunicki
Software Engineer | Boundless
tkunicki@anonymised.com
917-460-7212
@tomkunicki

On Mon, Jan 6, 2014 at 12:43 PM, Tom Kunicki <tkunicki@anonymised.com> wrote:

On Mon, Jan 6, 2014 at 12:18 PM, Justin Deoliveira <
jdeolive@anonymised.com> wrote:

Hey Tom,

On Fri, Dec 20, 2013 at 3:37 PM, Tom Kunicki <tkunicki@anonymised.com> wrote:

Hi,

I’m looking at an issue with parsing of some poorly formed WFS
GetFeature XML responses. We’re finding a “gml:metaDataProperty” element
in a WFS-1.0.0/GML-2.1.2 response. “gml:metaDataProperty” was not
introduced until GML 3 so the response will not validate. This is a
GeoServer generated response, but I am more concerned about the parsing for
now...

This is odd... if it's there then that is definitely a bug, one worth

After lots of digging it looks like there's some app-schema configuration
that's allowing GML 3 elements to sneak into the GML 2 response. But this
situation where a service produces a response that won't validate isn't as
rare as one would hope. This is a 2.1.x server instance that can't be
updated.

Hmmm... this is still odd, because metaDataProperty on a feature in GML2 is
just wrong. And if GML3 output is creeping in, the GML2 parser is going to
have a heap of problems.

@Ben: any thoughts here?

*Would it make sense to wrap the getHandler(…) call in a try/catch, log
the error state, and then let the logic fall through to the use of the
IgnoreHandler for parsing that element? Thoughts?*

I don’t know the history of the code but this appears to be the intent
of the original design.

This one is tough. I am not that familiar with this code, so I might be off
here, but from what I remember the intent of the ignore handler is to handle
elements that are valid with respect to the schema but that we don't
actually care about. For example, there are things in GML that we don't
really represent in the GeoTools feature model, so rather than error out we
just ignore them. This is different, IMO, from being lax about an invalid
GML document.

In the next generation of the GeoTools GML parser (which was inspired by
this code) we take the stance of being explicitly lax about stuff like
this, but offer a flag to force the parser to be strict. Maybe something
like that is needed here.

A flag is a great idea. In this case maybe it should be used to enable lax
parsing, since strict is the current behavior. I'll look at prior
geotools/geoserver flag usage and submit a PR.

Yeah, my thinking too was to leave the default behaviour as is for
backwards compatibility purposes.

Thanks,

--
Tom Kunicki
Software Engineer | Boundless
tkunicki@anonymised.com
917-460-7212
@tomkunicki

--
*Justin Deoliveira*
Vice President, Engineering | Boundless
jdeolive@anonymised.com
@j_deolive <https://twitter.com/j_deolive>

Hey,

To bring everyone up to speed on the current status: Justin and I looked into the samples our client gave us, and it appears the upstream (cascaded) WFS server is a MapServer instance. So effectively we’re seeing a WFS 1.1 DescribeFeatureType call in the xsi:schemaLocation attribute of a MapServer WFS 1.0 GetFeature response. I incorrectly stated that there was an issue with GeoServer’s app-schema response, based on the information I had at the time.

The GeoTools GML parser still needs some loosening to make it a little more resilient to poorly formed responses.
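
For anyone wanting to spot this kind of mismatch in a response, a small sketch follows. It relies only on the fact that xsi:schemaLocation holds whitespace-separated namespace/location pairs. The sample attribute value, the example.org URLs and the helper names are made up for illustration and are not taken from the client's data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SchemaLocationSketch {

    /** Splits an xsi:schemaLocation value into namespace -> location pairs. */
    static Map<String, String> parse(String schemaLocation) {
        Map<String, String> pairs = new LinkedHashMap<>();
        String[] tokens = schemaLocation.trim().split("\\s+");
        // Tokens alternate: namespace URI, then the location of its schema.
        for (int i = 0; i + 1 < tokens.length; i += 2) {
            pairs.put(tokens[i], tokens[i + 1]);
        }
        return pairs;
    }

    /** Returns the value of a query parameter (name matched case-insensitively), or null. */
    static String queryParam(String url, String name) {
        int q = url.indexOf('?');
        if (q < 0) {
            return null;
        }
        for (String kv : url.substring(q + 1).split("&")) {
            int eq = kv.indexOf('=');
            if (eq > 0 && kv.substring(0, eq).equalsIgnoreCase(name)) {
                return kv.substring(eq + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical attribute value, as it would look after XML entity decoding.
        String value = "http://example.org/ns "
                + "http://example.org/wfs?SERVICE=WFS&VERSION=1.1.0"
                + "&REQUEST=DescribeFeatureType&TYPENAME=ns:roads";

        String location = parse(value).get("http://example.org/ns");
        System.out.println("schema request version: " + queryParam(location, "version"));
    }
}
```

Comparing that version against the version of the GetFeature response itself would flag the WFS 1.1 schema reference inside a WFS 1.0 response.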

Tom Kunicki
Software Engineer | Boundless
tkunicki@anonymised.com
917-460-7212
@boundless


Ouch!

That could work if you have

outputFormat="text/xml; subtype=gml/2.1.2"

to get a GML2 application schema in the MapServer WFS 1.1 DescribeFeatureType response. Legal but not moral!

Kind regards,
Ben.

On 08/01/14 03:07, Tom Kunicki wrote:

So effectively we’re seeing a WFS 1.1 DescribeFeatureType call in the
xsi:schemaLocation attribute of a MapServer WFS 1.0 GetFeature response.

--
Ben Caradoc-Davies <Ben.Caradoc-Davies@anonymised.com>
Software Engineer
CSIRO Earth Science and Resource Engineering
Australian Resources Research Centre