[Geoserver-devel] GWC, trunk and mini resource/publish split

Hi,
GWC integration on trunk is not working anymore as a result
of the mini resource/publish split changes. I want to fix this,
but I think I need some context on the how.

The issue happens when GWC calls into WMS via the dispatcher:
GWC builds a fake servlet environment for the dispatcher to work in,
and the following exception is thrown:

  at org.geowebcache.layer.wms.FakeHttpServletRequest.getRequestURI(FakeHttpServletRequest.java:112)
  at org.geoserver.ows.Dispatcher.init(Dispatcher.java:293)
  at org.geoserver.ows.Dispatcher.handleRequestInternal(Dispatcher.java:207)
  at org.springframework.web.servlet.mvc.AbstractController.handleRequest(AbstractController.java:153)
  at org.geoserver.ows.Dispatcher.handleRequest(Dispatcher.java:50001)
  at org.geowebcache.layer.wms.WMSGeoServerHelper.makeRequest(WMSGeoServerHelper.java:35)
  at org.geowebcache.layer.wms.WMSSourceHelper.makeRequest(WMSSourceHelper.java:62)
  at org.geowebcache.layer.wms.WMSGeoServerHelper.makeRequest(WMSGeoServerHelper.java:50001)

This happens as the dispatcher runs these lines:

         // parse the request path into two components. (1) the 'path' which
         // is the string after the last '/', and the 'context' which is the
         // string before the last '/'
         String ctxPath = request.httpRequest.getContextPath();
         String reqPath = request.httpRequest.getRequestURI();
         reqPath = reqPath.substring(ctxPath.length());

Now, those are used to get the context and, I guess, eventually to
decide which layers are visible?
GWC's FakeHttpServletRequest throws an exception when getContextURI()
is called. Fixing that is easy; the real question is, what is the
appropriate return value?
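
For reference, a minimal sketch of what those three dispatcher lines compute, assuming the values the servlet API documents for a webapp deployed at /geoserver: getContextPath() returns the webapp prefix (or an empty string for the root context) and getRequestURI() returns the path without the query string. The class and method names are made up for illustration and are not GeoServer or GWC code.

```java
// Hypothetical helper mirroring the three dispatcher lines quoted above.
final class RequestPathSketch {

    /** Strips the servlet context path from the request URI, as the dispatcher does. */
    static String pathWithinContext(String contextPath, String requestUri) {
        return requestUri.substring(contextPath.length());
    }

    public static void main(String[] args) {
        // Global service: nothing between the context path and the service name.
        System.out.println(pathWithinContext("/geoserver", "/geoserver/wms"));      // /wms
        // Workspace-qualified service: the workspace is the first path segment.
        System.out.println(pathWithinContext("/geoserver", "/geoserver/nurc/wms")); // /nurc/wms
        // A webapp deployed at the root context has an empty context path.
        System.out.println(pathWithinContext("", "/wms"));                          // /wms
    }
}
```

So whatever the fake request returns, the request URI has to start with the context path, and anything between the context path and the service name is what would carry a workspace.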

GWC does not have the concept of workspaces, so when a request
comes in, what happens? Should GWC build a fake request that does not
have a workspace in it? If so, how will that affect GWC's ability to
work with a GeoServer that actively uses workspaces for access control?
Or should the GWC integration somehow mirror the workspace behavior of
the OWS services?

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

getContextURI():
Put a breakpoint in the code that invokes the method, then make a GetMap request manually and see what value Jetty returns for that method. (Or we can check the servlet documentation.) Will probably have to parameterize something, but should be easy to figure out. (Sorry, don't have a GS trunk build on my netbook.)

Workspaces:
It could be tempting to have a special workspace for GWC, as a way to select layers, but then I guess you lose the benefit of per-workspace access control. And it doesn't let you specify important stuff like formats / projections anyway... so let's not go there, I guess.

We could introduce workspaces in GWC, or at least write a wrapper for the TileLayerDispatcher (class in GWC) that inspects the HttpServletRequest for a workspace and refuses to return anything that the user is not supposed to get to.

-Arne

--
Arne Kepp
OpenGeo - http://opengeo.org
Expert service straight from the developers

Interesting... and indeed something I did not take into account when doing the virtual service work. Shame on me for not testing gwc and shame on all of us for not having better gwc test coverage.

That said, if we consider gwc another type of service it probably makes sense to have it fall under the virtual service banner, and something that could be restricted to workspace. But for the time being it is considered something global and afaik something that will be affected even if the user decides to disable global services.

I would say in the short term simply set the parameter and ensure that gwc does not assume any particular uri structure. But as you say as soon as someone disables the global wms... boom!

In the medium term we should wrap gwc in the virtual service goodness. The first step to doing this is to use an OWSUrlHandlerMapping instead of the spring SimpleUrlHandlerMapping one for the dispatcher mapping. And then it is just a matter of following a request down, seeing where things fail, patching any reflective urls so they include the specified workspace, etc...

I think if we can avoid introducing the concept of workspace into gwc it would probably be a good idea. But I am not sure if it will be possible for gwc to pass the first part of the uri path through without having to make assumptions about its structure.

-Justin

On 3/1/10 1:41 PM, Andrea Aime wrote:

> GWC integration on trunk is not working anymore as a result
> of the mini resource/publish split changes. I want to fix this,
> but I think I need some context on the how.
> [...]

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Justin Deoliveira wrote:

> Interesting... and indeed something I did not take into account when
> doing the virtual service work. Shame on me for not testing gwc and
> shame on all of us for not having better gwc test coverage.

Well, let's fix it.

> That said, if we consider gwc another type of service it probably
> makes sense to have it fall under the virtual service banner, and
> something that could be restricted to workspace. But for the time
> being it is considered something global and afaik something that will
> be affected even if the user decides to disable global services.
>
> I would say in the short term simply set the parameter and ensure that
> gwc does not assume any particular uri structure. But as you say as
> soon as someone disables the global wms... boom!
>
> In the medium term we should wrap gwc in the virtual service goodness.
> The first step to doing this is to use an OWSUrlHandlerMapping instead
> of the spring SimpleUrlHandlerMapping one for the dispatcher mapping.
> And then it is just a matter of following a request down, seeing where
> things fail, patching any reflective urls so they include the specified
> workspace, etc...

I tried to look into this directly.
The main issue I see is that GWC would have to become aware of the
workspaces. Atm the GeoWebCacheDispatcher (part of the GWC code)
receives a request like:

http://localhost:8080/geoserver/nurc/gwc/demo/nurc:Arc_Sample?gridSet=EPSG:900913&format=image/png

and returns complaining it does not know what /nurc/ is.

I'm not even sure that GWC, as a standalone app, should be concerned about that at all.

Rough idea: register in the GeoServer integration a wrapper around
GWC's own dispatcher that:
- checks whether global services are on, and refuses to answer
   the request if they are off and no workspace is given
- strips the workspace from the URL before it reaches GWC, and stores
   it in a thread local
- makes the GWC->WMS direct path grab the workspace and reintegrate it
   into the request by using the indirections provided by
   FakeHttpServletRequest
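
A rough sketch of the path handling such a wrapper would need, under stated assumptions: it only sees the path after the servlet context path, the set of workspace names is passed in, and a plain ThreadLocal stands in for whatever holder the integration would really use. All names here are hypothetical; this is not GeoServer or GWC code.

```java
import java.util.Set;

// Hypothetical sketch of the wrapper's path handling; not GeoServer or GWC code.
final class WorkspaceStrippingSketch {

    // Holds the workspace stripped from the current request, if any.
    static final ThreadLocal<String> WORKSPACE = new ThreadLocal<>();

    /**
     * Takes the path after the context path (e.g. "/nurc/gwc/demo"). If it starts
     * with a known workspace followed by "/gwc", the workspace is remembered in the
     * thread local and the path GWC normally expects ("/gwc/demo") is returned.
     * Requests without a workspace are refused when global services are off.
     */
    static String stripWorkspace(String path, Set<String> workspaces,
                                 boolean globalServicesEnabled) {
        WORKSPACE.remove();
        String[] parts = path.split("/", 4); // "", first segment, second segment, rest
        if (parts.length >= 3 && "gwc".equals(parts[2]) && workspaces.contains(parts[1])) {
            WORKSPACE.set(parts[1]);
            return path.substring(parts[1].length() + 1);
        }
        if (!globalServicesEnabled) {
            throw new IllegalStateException(
                    "Global services are disabled and no workspace was given");
        }
        return path;
    }

    /** Puts the remembered workspace back in front of a path such as "/wms". */
    static String restoreWorkspace(String path) {
        String workspace = WORKSPACE.get();
        return workspace == null ? path : "/" + workspace + path;
    }
}
```

With "nurc" as a known workspace, "/nurc/gwc/demo/nurc:Arc_Sample" would reach GWC as "/gwc/demo/nurc:Arc_Sample", and the fake request for the GWC->WMS call could then use restoreWorkspace("/wms"), which gives "/nurc/wms".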

How does this sound? Should not be much work and I'd rather do it sooner
rather than later.

One thing that would remain unchanged is the GWC demo page, which would
keep on listing all of the layers no matter what. Not sure we
can do anything about it unless GWC adds a concept of layer filtering
that we can plug into.

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

On 3/2/10 6:38 AM, Andrea Aime wrote:

> I tried to look into this directly.
> The main issue I see is that GWC would have to become aware of the
> workspaces. Atm the GeoWebCacheDispatcher (part of the GWC code)
> receives a request like:
>
> http://localhost:8080/geoserver/nurc/gwc/demo/nurc:Arc_Sample?gridSet=EPSG:900913&format=image/png
>
> and returns complaining it does not know what /nurc/ is.

Just to clarify, the gwc dispatcher never gets hit, right? The spring dispatcher should return a 404 in this case.

> I'm not even sure that GWC, as a standalone app, should be concerned
> about that at all.
>
> Rough idea: register in the GeoServer integration a wrapper around
> GWC's own dispatcher that:
> - checks whether global services are on, and refuses to answer
>   the request if they are off and no workspace is given
> - strips the workspace from the URL before it reaches GWC, and stores
>   it in a thread local

One thing to note is that there is already a thread local workspace variable available called LocalWorkspace which can be used if need be. I am however not sure how the ows dispatcher callback that sets this variable will play with this if it is already set. I assume it probably just blindly overwrites it.

> - makes the GWC->WMS direct path grab the workspace and reintegrate it
>   into the request by using the indirections provided by
>   FakeHttpServletRequest
>
> How does this sound? Should not be much work and I'd rather do it sooner
> instead of later.

I am still not really sure it is necessary to have gwc know about workspaces but I could be wrong here. The approach I would try is this:

- make the gwc dispatcher mapping an OwsUrlHandlerMapping so that it can handle URLs of the form /nurc/gwc/**

- have the gwc dispatcher do its parsing of the request path in a more lax way. Rather than assuming a particular path structure, just pop path components off the end of the path as need be.

- have the dispatcher preserve any initial parts of the request path so that it can properly create the fake HTTP request with the workspace prefix

Now of course this does not handle the case in which gwc should not respond if global services are turned off. So some sort of check (in a wrapper) will be needed.
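
The lax parsing and prefix preservation from the list above could look roughly like the following. This is only a sketch under assumptions: it works on the path after the servlet context path, walks path components from the end until it meets a "gwc" segment, and treats everything before that segment as an opaque prefix. The class and method names are invented for the example.

```java
// Hypothetical sketch of lax path parsing; not the actual GWC dispatcher code.
final class LaxGwcPathSketch {

    /**
     * Walks the path components from the end until a "gwc" segment is found.
     * Returns {prefix, gwcPath}, for example {"/nurc", "/gwc/demo"}, or null
     * when the path contains no "gwc" segment at all.
     */
    static String[] splitAtGwc(String path) {
        int end = path.length();
        while (end > 0) {
            int slash = path.lastIndexOf('/', end - 1);
            if (slash < 0) {
                break;
            }
            if (path.substring(slash + 1, end).equals("gwc")) {
                return new String[] { path.substring(0, slash), path.substring(slash) };
            }
            end = slash; // drop the last component and keep looking
        }
        return null;
    }
}
```

For "/nurc/gwc/demo/nurc:Arc_Sample" this yields the prefix "/nurc" and the GWC path "/gwc/demo/nurc:Arc_Sample"; the prefix can then be prepended when building the fake request, e.g. "/nurc" + "/wms". The prefix is never interpreted, so gwc itself does not need to know it is a workspace.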

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Justin Deoliveira wrote:

> Just to clarify, the gwc dispatcher never gets hit, right? The spring
> dispatcher should return a 404 in this case.

Sorry, I actually tried to use the OWSHandlerMapping so the
gwc dispatcher was being hit, but it did not know what to make
of the incoming request due to /nurc/ being in the path.

That's why I was talking about having to strip it out and so on.

> I am still not really sure it is necessary to have gwc know about
> workspaces but I could be wrong here. The approach I would try is this:
>
> - make the gwc dispatcher mapping an OwsUrlHandlerMapping so that it
>   can handle URLs of the form /nurc/gwc/**
>
> - have the gwc dispatcher do its parsing of the request path in a more
>   lax way. Rather than assuming a particular path structure, just pop
>   path components off the end of the path as need be.

We can try, it's a matter of knowing where the actual path starts.
Be too lax and we might start reporting back strange error messages
when the user makes invalid requests, or even valid requests
in corner cases.
Case in point: what happens if in GeoServer there is a workspace
called gwc, for example? Or one called services, wms, wmts, or
any other of the normal URLs GWC parses?

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

On 3/2/10 8:52 AM, Andrea Aime wrote:

> Sorry, I actually tried to use the OWSHandlerMapping so the
> gwc dispatcher was being hit, but it did not know what to make
> of the incoming request due to /nurc/ being in the path.
>
> That's why I was talking about having to strip it out and so on.
>
>> I am still not really sure it is necessary to have gwc know about
>> workspaces but I could be wrong here. The approach I would try is this:
>>
>> - make the gwc dispatcher mapping an OwsUrlHandlerMapping so that it
>>   can handle URLs of the form /nurc/gwc/**
>>
>> - have the gwc dispatcher do its parsing of the request path in a more
>>   lax way. Rather than assuming a particular path structure, just pop
>>   path components off the end of the path as need be.
>
> We can try, it's a matter of knowing where the actual path starts.
> Be too lax and we might start reporting back strange error messages
> when the user makes invalid requests, or even valid requests
> in corner cases.
> Case in point: what happens if in GeoServer there is a workspace
> called gwc, for example? Or one called services, wms, wmts, or
> any other of the normal URLs GWC parses?

Fair enough. This is an issue with virtual services in general. There is probably a way to trick the system into giving you a misleading error or have it dispatch an improper request when global services are turned off.

I don't think it is unreasonable to tell users to simply "don't do that". If we really want to enforce it we could set up catalog validation constraints to prevent such names from being specified.
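
As a sketch of what such a constraint might check, assuming the reserved names are simply the ones mentioned in this thread (gwc, services, wms, wmts) and that the comparison should ignore case. Both are assumptions, the real list would have to come from the paths the dispatchers actually parse, and the class below is hypothetical rather than an existing GeoServer validation hook.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical name check; the reserved list is taken from the examples in this thread.
final class ReservedWorkspaceNamesSketch {

    static final Set<String> RESERVED =
            new HashSet<>(Arrays.asList("gwc", "services", "wms", "wmts"));

    /** True when the name would clash with a path segment used by a service. */
    static boolean isReserved(String workspaceName) {
        return RESERVED.contains(workspaceName.toLowerCase(Locale.ROOT));
    }

    /** Rejects reserved names; meant to be called before a workspace is saved. */
    static void validate(String workspaceName) {
        if (isReserved(workspaceName)) {
            throw new IllegalArgumentException(
                    "Workspace name '" + workspaceName + "' clashes with a service path segment");
        }
    }
}
```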

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.