[Geoserver-users] Modifying geoserver config kills gwc seed threads

Since upgrading from geoserver 1.7.3 to 1.7.4 I've noticed something
strange. If I make any changes to the configuration in geoserver, such
as adding a new feature type or editing a style, any seeding threads I
have running in gwc will die with an error:

2009-05-07 20:38:58,709 ERROR [seed.MTSeeder] - Empty metatile, error
message: MimeType mismatch, expected image/png but got image/png8 from
http://localhost:8080/geoserver/wms?SERVICE=WMS&REQUEST=GetMap&VERSION=1.1.0&LAYERS=llmap:basemap&EXCEPTIONS=application/vnd.ogc.se_inimage&STYLES=&TRANSPARENT=FALSE&BGCOLOR=0x112266&format_options=antialias:full&FORMAT=image/png8&SRS=EPSG:4326&WIDTH=768&HEIGHT=768&BBOX=26.71875,-11.953125,28.828125,-9.84375

The layer in question was being seeded in image/png8 format. It
*looks* like gwc suddenly got confused and thought it was expecting an
image/png tile back and got upset when it got back the image/png8 tile
it asked for. Note that I am not editing any data related to the
layer being seeded. In fact just hitting "Apply" without actually
changing anything is enough to trigger it.

--
This message brought to you by Speed of Light Beer
When you're approaching infinite mass
It's Speed of Light time!

Quick explanation first:

image/png is actually the correct response. The format= parameter is not a MIME type; image/png8 is a hint that we want image/png with 8 bits. The Content-Type in the HTTP response, however, has to be a MIME type, and image/png8 is wrong there because no browser has heard of this format.
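
To make the distinction concrete, here is a minimal Python sketch of how a client could map a WMS FORMAT value to the Content-Type it should expect back. The lookup table and the names `expected_content_type` and `is_mismatch` are assumptions made for illustration; this is not GeoWebCache or GeoServer code.

```python
# Hypothetical lookup: FORMAT hints that are not real MIME types,
# mapped to the MIME type the HTTP response should carry.
FORMAT_HINTS = {
    "image/png8": "image/png",  # 8-bit PNG is still served as image/png
}


def expected_content_type(wms_format: str) -> str:
    """Return the Content-Type a client should expect for a FORMAT value."""
    fmt = wms_format.strip().lower()
    return FORMAT_HINTS.get(fmt, fmt)


def is_mismatch(wms_format: str, content_type: str) -> bool:
    """True if the response Content-Type does not match the requested format."""
    # Drop any parameters after the MIME type, e.g. "; charset=UTF-8".
    actual = content_type.split(";", 1)[0].strip().lower()
    return actual != expected_content_type(wms_format)
```

Under these assumptions, a response labelled image/png8 counts as a mismatch for a FORMAT=image/png8 request, while one labelled image/png does not.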

I wasn't able to reproduce the problem exactly as described. But I got lucky and appended the 8 to the layername first (-> nonexistent layer), and then I got it too.

What appears to happen is that we render exceptions ( EXCEPTIONS=application/vnd.ogc.se_inimage ) in the same format as requested, but return those with the mimetype set to the format of the request.

Can you try requesting the same URL in a browser, but change it to image/png, and see if you get a sensible error message? Please feel free to open a JIRA issue, but I can also do it after I hear back from you.
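
For anyone who would rather script that check than edit the URL by hand, the following Python sketch swaps the FORMAT parameter using only the standard library. The helper name `with_format` is hypothetical, and the URL is abbreviated from the log line above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_format(url: str, new_format: str) -> str:
    """Return url with its FORMAT query parameter replaced."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Match the key case-insensitively; format_options is left untouched.
    replaced = [(k, new_format if k.upper() == "FORMAT" else v) for k, v in pairs]
    # safe="/:," keeps WMS values readable instead of percent-encoding them.
    return urlunsplit(parts._replace(query=urlencode(replaced, safe="/:,")))


url = ("http://localhost:8080/geoserver/wms?SERVICE=WMS&REQUEST=GetMap"
       "&LAYERS=llmap:basemap&STYLES=&FORMAT=image/png8&SRS=EPSG:4326"
       "&WIDTH=768&HEIGHT=768&BBOX=26.71875,-11.953125,28.828125,-9.84375")
print(with_format(url, "image/png"))
```

Opening the resulting URL with `urllib.request.urlopen` and reading `response.headers.get_content_type()` would then show which Content-Type the server actually returns.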

Thank you,
-Arne

--
Arne Kepp
OpenGeo - http://opengeo.org
Expert service straight from the developers

On Sat, May 9, 2009 at 10:59 PM, Arne Kepp <ak@anonymised.com> wrote:

> Can you try requesting the same URL in a browser, but change it to
> image/png, and see if you get a sensible error message? Please feel free
> to open a JIRA issue, but I can also do it after I hear back from you.

Well the URL is normally valid, so loading it in a browser just
produces the 768x768 metatile. The error only occurs if the server
configuration was changed in any way while the request was being
processed. I tried to make something happen manually by switching tabs
and hitting Apply as fast as I could, and I never got an error back.
What I *did* get however is an image with one of the two layers
missing. Specifically, the shapefile layer appears to draw fine but
the oracle_ng based layer disappears. The response headers appeared to
be completely normal. This actually makes me glad gwc bombs out rather
than writing bad tiles into the cache.

--
This message brought to you by Speed of Light Beer
When you're approaching infinite mass
It's Speed of Light time!

Hm, looks like we have a heisenbug :(

The GeoTools renderer has this unfortunate* behavior that it renders on a best effort basis and returns whatever it has when an error occurs. GeoServer can't tell when this has happened, and consequently GeoWebCache cannot either, so the fact that it has not cached any partially rendered images should be attributed to luck.
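
As a rough illustration of why the caller cannot tell, here is a small Python sketch of best-effort rendering. It only mimics the behavior described above and is not the GeoTools renderer; `render_best_effort` and the layer functions are made-up names.

```python
def render_best_effort(layers):
    """Draw each layer in order; on the first failure return what was drawn.

    `layers` is a list of (name, draw) pairs, where draw() returns the
    layer's content or raises an exception.
    """
    drawn = []
    for name, draw in layers:
        try:
            drawn.append((name, draw()))
        except Exception:
            # The error is swallowed here, so the caller only ever sees a
            # normal-looking result with fewer layers in it.
            break
    return drawn


def shapefile_layer():
    return "roads"


def oracle_layer():
    raise ConnectionError("connection closed")


result = render_best_effort([("shapefile", shapefile_layer),
                             ("oracle", oracle_layer)])
print(result)
```

The call returns normally with only the shapefile layer, which matches the symptom of an image with one layer missing and otherwise ordinary response headers.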

My theory: The oracle_ng datastore or server is having connection problems or running into a bug. Depending on the circumstances, this can result in a partially rendered image, or in an outright exception (inimage exception -> image/png8). I would be surprised if we indeed lack synchronization to protect configuration changes, so how fast you click it is probably not important. There is some discussion of jdbc_ng and events on the GeoTools list right now, but it's probably unrelated.

Can you please carefully check the logs to look for jdbc_ng related exceptions? If there are none, can you enable verbose exceptions in the configuration, switch to the GeoServer developer mode, and then try to recreate the problem? I'm CCing Justin and Andrea, who know more about jdbc_ng and Oracle than I do, and I've created http://jira.codehaus.org/browse/GEOS-3018

-Arne

*: My personal opinion, other projects find it beneficial and see this as a feature.

--
Arne Kepp
OpenGeo - http://opengeo.org
Expert service straight from the developers

Arne Kepp wrote:

> Hm, looks like we have a heisenbug :(
>
> The GeoTools renderer has this unfortunate* behavior that it renders on a best effort basis and returns whatever it has when an error occurs. GeoServer can't tell when this has happened, and consequently GeoWebCache cannot either, so the fact that it has not cached any partially rendered images should be attributed to luck.

Yep, it has been set up this way to render (unfortunately very common)
crap data, or at least data that contains a certain amount of
invalid geometries.

> My theory: The oracle_ng datastore or server is having connection problems or running into a bug. Depending on the circumstances, this can result in a partially rendered image, or in an outright exception (inimage exception -> image/png8). I would be surprised if we indeed lack synchronization to protect configuration changes, so how fast you click it is probably not important. There is some discussion of jdbc_ng and events on the GeoTools list right now, but it's probably unrelated.
>
> Can you please carefully check the logs to look for jdbc_ng related exceptions? If there are none, can you enable verbose exceptions in the configuration, switch to the GeoServer developer mode, and then try to recreate the problem? I'm CCing Justin and Andrea, who know more about jdbc_ng and Oracle than I do, and I've created http://jira.codehaus.org/browse/GEOS-3018

The issue is known. When the configuration is applied, the datastores
are destroyed and fully recreated, since in 1.7.x there is no way to
tell _what_ changed in the configuration (we only know something
changed).
When that happens, the connection pools are also closed, so the
renderer is suddenly working against a connection that has been
closed. As a result, all requests against a database that
still need to load data from the db will fail in
some way or another.
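
The sequence above can be simulated in a few lines of Python. This is a toy model under stated assumptions: `ConnectionPool`, `DataStore`, `Catalog` and `apply_config` are hypothetical stand-ins, not GeoServer or GeoTools classes.

```python
class ConnectionPool:
    def __init__(self):
        self.closed = False

    def get_connection(self):
        if self.closed:
            raise RuntimeError("connection pool is closed")
        return object()  # placeholder for a real database connection

    def close(self):
        self.closed = True


class DataStore:
    def __init__(self):
        self.pool = ConnectionPool()

    def dispose(self):
        self.pool.close()


class Catalog:
    def __init__(self, names):
        self.stores = {name: DataStore() for name in names}

    def apply_config(self):
        # 1.7.x-style reload as described above: nothing says what
        # changed, so every store is rebuilt and the old ones disposed.
        old = self.stores
        self.stores = {name: DataStore() for name in old}
        for store in old.values():
            store.dispose()


catalog = Catalog(["oracle"])
in_flight = catalog.stores["oracle"]  # a running render holds this store
catalog.apply_config()                # someone hits "Apply"
try:
    in_flight.pool.get_connection()
    outcome = "ok"
except RuntimeError as exc:
    outcome = str(exc)
print(outcome)
```

A request that starts after the reload picks up the new store and works, which is why the failure only shows up for work that was already in progress.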

In 2.0.x only the parts of the config that changed are saved on
submit, so unless you're touching the datastore configuration
itself, this won't happen.
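
The idea of touching only what changed can be sketched as a simple comparison of the old and new configuration. This is an assumption-laden illustration in Python, not how GeoServer 2.0.x is implemented; `changed_stores` and the config layout are invented for the example.

```python
def changed_stores(old_config: dict, new_config: dict) -> set:
    """Names of stores that were added, removed, or whose settings differ."""
    names = old_config.keys() | new_config.keys()
    return {n for n in names if old_config.get(n) != new_config.get(n)}


old = {"oracle": {"host": "db1"}, "shapes": {"path": "/data/shapes"}}
new = {"oracle": {"host": "db1"}, "shapes": {"path": "/data/shapes2"}}
print(changed_stores(old, new))
```

Only the stores named in the result would need to be recreated, so a pool belonging to an untouched store stays open.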

Moreover, even with 1.7.x this _might_ not happen if we used
connection pools stored in JNDI, which are not closed along with the
datastore during the configuration reload (I say might because
closing down the datastore wipes out other data structures
as well, there is no guarantee those are not going to
be needed, and I never actually tried).
This is apparently going to be resolved soon enough:
http://jira.codehaus.org/browse/GEOT-2475
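
The difference between a pool owned by the datastore and one owned elsewhere (as a JNDI-provided pool would be) can be shown with a toy Python model. The classes and the `owns_pool` flag are hypothetical and only illustrate the ownership idea; they do not reflect the GeoTools API.

```python
class Pool:
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class DataStore:
    def __init__(self, pool, owns_pool):
        self.pool = pool
        self.owns_pool = owns_pool

    def dispose(self):
        # Close the pool only if this store created it. An externally
        # managed pool is left for its owner to shut down.
        if self.owns_pool:
            self.pool.close()


shared = Pool()    # stands in for a pool looked up from JNDI
private = Pool()   # stands in for a pool created by the store itself

DataStore(shared, owns_pool=False).dispose()
DataStore(private, owns_pool=True).dispose()
print(shared.closed, private.closed)
```

As noted above, this covers only the connection pool; other structures wiped out when the datastore is disposed could still break an in-flight request.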

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.