[Geoserver-devel] the servlet-mapping problem

Hi all,

First off, a warning that this is a long email :)

In doing the virtual service work (GSIP 44) I have yet again run up against a problem we have had in GeoServer ever since we moved to Spring: we have to register all servlet mappings in web.xml.

To describe the problem in more detail, the following is how a request is routed/dispatched in GeoServer today. In web.xml we register a single servlet (DispatcherServlet), which is the Spring dispatcher servlet. This servlet essentially handles all requests that come into GeoServer. But it can only do so if the proper servlet mappings are registered in web.xml. For example:

     <servlet-mapping>
       <servlet-name>dispatcher</servlet-name>
       <url-pattern>/web/*</url-pattern>
     </servlet-mapping>
     <servlet-mapping>
       <servlet-name>dispatcher</servlet-name>
       <url-pattern>/rest/*</url-pattern>
     </servlet-mapping>
     <servlet-mapping>
       <servlet-name>dispatcher</servlet-name>
       <url-pattern>/wms/*</url-pattern>
     </servlet-mapping>
     <servlet-mapping>
       <servlet-name>dispatcher</servlet-name>
       <url-pattern>/wfs/*</url-pattern>
     </servlet-mapping>

With these mappings, a request of the form "/geoserver/wfs", for example, gets routed to our one servlet to rule them all, the Spring dispatcher.

When handling a request, the Spring dispatcher takes the request and does its own routing. It does this by looking up mappings in the Spring applicationContext.xml. For example, the wfs mappings:

<bean id="wfsURLMapping" ...>
  <property name="alwaysUseFullPath" value="true"/>
  <property name="mappings">
    <props>
      <prop key="/wfs">dispatcher</prop>
      <prop key="/wfs/*">dispatcher</prop>
    </props>
  </property>
</bean>

Which basically says: map all URLs of the form "/wfs" or "/wfs/*" to the OWS dispatcher.

Now there are a couple of problems with this approach:

1) It is redundant in that we have to maintain two sets of mappings.

2) New services are not truly pluggable because any time one adds a new path prefix web.xml has to be updated.

3) Mapping in web.xml is quite limited compared to the corresponding Spring mappings.

(3) is where my problem lies with regard to the virtual services stuff. To sum up, with virtual services the path of a WFS request changes to:

/geoserver/<workspace>/wfs?

Which of course does not have a mapping in web.xml, so the request does not even make it to the Spring dispatcher.

So the obvious solution (one that we have tried) would be to create a single mapping in web.xml that matches all requests to the spring dispatcher. The syntax is quite simple:

     <servlet-mapping>
       <servlet-name>dispatcher</servlet-name>
       <url-pattern>/*</url-pattern>
     </servlet-mapping>

But it does not work. The reason is that when this mapping is applied, some attributes of the resulting HttpServletRequest change, namely the "servletPath" property.

With the existing mappings, when a request comes in the servlet path gets set to the part of the path that matched, for example "web", "rest", "wfs", etc. But with the "catch all" mapping the servlet path becomes an empty string. I should also point out that this is the behavior in Jetty and Tomcat; I have yet to try out other containers.

And this causes problems. Many of the libraries/servlets we use depend on the servlet path being set. Wicket and Restlet pretty much fail outright. So... the evil plan to fix it. I tried a variety of things, but the following is the only one I have had any success with.

Basically the idea is simple: wrap the HttpServletRequest object in one that fakes the servlet path. Since all the existing mappings are simple, in that they match a single path component, the wrapper class simply inspects the entire request URI and uses the first component of the path (the one that occurs after /geoserver) as the servlet path.

So for wicket and restlet things go on working the same as the servlet paths become "web" and "rest" the way they were before. And same goes for the ows services.
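To make the wrapper idea concrete, here is a minimal sketch of just the path computation, written as plain Java so it runs without a servlet container. The class and method names are hypothetical and are not GeoServer code. In a real wrapper this logic would presumably sit behind an override of getServletPath() in a subclass of HttpServletRequestWrapper. The sketch returns the component with a leading slash ("/web" rather than "web"), which is the form the servlet API reports.

```java
// Hypothetical helper, not taken from GeoServer: derives the servlet path a
// request wrapper could report when every request is mapped to "/*".
class ServletPathSketch {

    // Returns the first path component after the context path, with a
    // leading slash, or "" when nothing follows the context path.
    static String servletPathFor(String requestUri, String contextPath) {
        String path = requestUri;

        // Defensive: drop a query string if the caller passed one along.
        int query = path.indexOf('?');
        if (query >= 0) {
            path = path.substring(0, query);
        }

        // Strip the context path ("/geoserver" in the examples above), but
        // only when it ends on a path boundary.
        if (!contextPath.isEmpty() && path.startsWith(contextPath)
                && (path.length() == contextPath.length()
                    || path.charAt(contextPath.length()) == '/')) {
            path = path.substring(contextPath.length());
        }

        if (path.isEmpty() || path.equals("/")) {
            return "";
        }

        // Keep everything up to the second slash: "/wfs/foo" -> "/wfs".
        int next = path.indexOf('/', 1);
        return next < 0 ? path : path.substring(0, next);
    }

    public static void main(String[] args) {
        System.out.println(servletPathFor("/geoserver/web/home", "/geoserver"));
        System.out.println(servletPathFor("/geoserver/rest/workspaces", "/geoserver"));
        System.out.println(servletPathFor("/geoserver/topp/wfs", "/geoserver"));
    }
}
```

Note that for a virtual-service URL such as /geoserver/topp/wfs the computed value is the workspace component, not the service name.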

Things get interesting when we get to the virtual services use case. Since the path is (for example):

/geoserver/<workspace>/wfs

The servlet path becomes "<workspace>". To amend this the "second level mappings" in the application context have to change a bit:

<bean id="wfsURLMapping" ...>
  <property name="alwaysUseFullPath" value="true"/>
  <property name="mappings">
    <props>
      <prop key="/wfs">dispatcher</prop>
      <prop key="/wfs/*">dispatcher</prop>
      <prop key="/*/wfs">dispatcher</prop>
      <prop key="/*/wfs/*">dispatcher</prop>
    </props>
  </property>
</bean>

And with that it all works. So what do we do? I will be the first to admit that this approach is a bit hackish, and I plan to do a great deal more testing in various environments to ensure it is actually viable. But if it is, what do people think about it?

If the idea does get some uptake I was thinking we could gradually introduce it as we have some of the other recent functionality improvements such as advanced rendering functionality, etc... Basically add a parameter (settable as system prop, web.xml context variable, etc..) called "ADVANCED_DISPATCH" or something that would enable the above. We could enable it on trunk and give it time to mature but disable it by default on 2.0.x, making it available only by explicitly setting the parameter.
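As a rough illustration of how such a switch could be resolved, the sketch below checks a system property first and then falls back to a map standing in for web.xml context parameters. Only the name "ADVANCED_DISPATCH" comes from the proposal above. The class, the lookup order, and the use of a plain Map instead of a ServletContext are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of resolving the proposed ADVANCED_DISPATCH switch.
class AdvancedDispatchFlag {

    static final String KEY = "ADVANCED_DISPATCH";

    // contextParams stands in for web.xml context parameters; defaultValue
    // would be true on trunk and false on 2.0.x under the proposal.
    static boolean isEnabled(Map<String, String> contextParams, boolean defaultValue) {
        // Assumed lookup order: system property wins over context parameter.
        String value = System.getProperty(KEY);
        if (value == null) {
            value = contextParams.get(KEY);
        }
        return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put(KEY, "true");
        System.out.println("enabled: " + isEnabled(params, false));
    }
}
```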

Thoughts?

-Justin

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Hi,

interesting problem

As far as I can tell, we're not forced to keep /<context>/wfs, /<context>/wms, etc. The service entry points are really stated in the capabilities and /<context>/wfs is actually a shortcut for /<context>/ows?SERVICE=WFS, right? That is, no application _should_ be hardcoded to hit /<context>/wfs. If at all, the canonical entry point should be /<context>/ows.

So, as the OWS Dispatcher is there precisely to dispatch OWS requests, why don't we invert the mapping for "/<context>/ows" being the single entry point and /<context>/ows/wfs, for example, being the shortcut for /<context>/ows?service=WFS?

That would allow you to do whatever you want from /<context>/ows onwards, like /<context>/ows/<workspace>/wfs, etc.

It sounds good to me, the only catch being possible applications hardcoded to hit /<context>/wfs?

What do you think?

2c.-

Gabriel

Gabriel Roldan wrote:

Hi,

interesting problem

As far as I can tell, we're not forced to keep /<context>/wfs, /<context>/wms, etc. The service entry points are really stated in the capabilities and /<context>/wfs is actually a shortcut for /<context>/ows?SERVICE=WFS, right? That is, no application _should_ be hardcoded to hit /<context>/wfs. If at all, the canonical entry point should be /<context>/ows.

True but I can guarantee that there are many clients and applications that don't do this and go straight to a specific operation.

So, as the OWS Dispatcher is there precisely to dispatch OWS requests, why don't we invert the mapping for "/<context>/ows" being the single entry point and /<context>/ows/wfs, for example, being the shortcut for /<context>/ows?service=WFS?

I am not quite sure I understand what you are getting at. I agree that the "wms", "wfs", and "wcs" mappings can be considered aliases for "ows". But that seems sort of orthogonal to the problem. Even if we disallowed them we would still have the same problem, just one level up: not being able to add services, restlets, or other applications directly under the root context.

That would allow you to do whatever you want from /<context>/ows onwards, like /<context>/ows/<workspace>/wfs, etc.

It sounds good to me, the only catch being possible applications hardcoded to hit /<context>/wfs?

What do you think?

With this approach we would still be hardcoding the set of "root paths". For instance, consider adding a new RESTful service. It has to go under "/rest" or "/api". While this might not seem such a bad restriction to some, to others it would seem strange. Some of the people doing JavaScript work who have to consume a GeoServer RESTful service have commented on this before.

Another example would be someone who wanted to experiment with a new sort of web front end. They would have to create a mapping in web.xml that is there for the sole purpose of the extension.

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

So you're chasing a general solution to avoid servlet mappings in web.xml? I thought it was just about the virtual services being hard to map as conceived, because it would force the /* mapping to ows.

So I meant you can map /ows/* to the dispatcher and from there on introduce the virtual services.

I am not quite sure I understand what you are getting at. I agree that the "wms", "wfs", and "wcs" mappings can be considered aliases for "ows". But that seems sort of orthogonal to the problem. Even if we disallowed them we would still have the same problem, just one level up: not being able to add services, restlets, or other applications directly under the root context.

Hmm, I don't see why mapping /ows/* to the Dispatcher would prevent adding /whatever in the future?

That would allow you to do whatever you want from /<context>/ows onwards, like /<context>/ows/<workspace>/wfs, etc.

It sounds good to me, the only catch being possible applications hardcoded to hit /<context>/wfs?

What do you think?

With this approach we would still be hardcoding the set of "root paths". For instance consider adding a new restful service. It has to go under "/rest" or "/api".

So it looks like I was not aware the virtual services affect REST too? Do you need /<workspace>/rest the same way you need /<workspace>/wfs?

Anyhow, if I just misunderstood, forget about that.

Cheers,
Gabriel

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

The thing is, mapping /*/wfs/* might also mean trouble if any other resource has a "wfs" step?

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Gabriel Roldan wrote:

So you're chasing for a general solution to avoid servlet mappings in web.xml? I thought it was just about the virtual services being hard to map as conceived because it would force the /* mapping to ows.

Yes, I am looking at this in general, although my immediate problem is only with the OWS services.

So I meant you can map /ows/* to the dispatcher and from that on introduce the virtual services

Right, but if you look at the way the proposal was written you will notice the "virtual service"/workspace comes before the ows/wfs/wms/etc...

/geoserver/topp/ows?service=wfs&...

And yes, I agree that for this *specific* problem we could just invert them. But aesthetically I like the URL structure the way it is written in the proposal, so I am interested in pursuing this solution. Inverting the URLs to /geoserver/ows/topp is a fallback I am considering if this proposed solution goes nowhere.

So it looks like I was not aware the virtual services affect rest too? do you need /<workspace>/rest the same you need /<workspace>/wfs?

No. Forget virtual services for the context of this problem :) Apologies for confusing two issues. This problem is:

a) to allow services to plug in under the root context without modifying web.xml

b) to allow for more complex mappings, i.e. /*/ows/** rather than just /ows/*

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Gabriel Roldan wrote:

The thing is, mapping /*/wfs/* might also mean trouble if any other resource has a "wfs" step?

This could be a potential problem, yes. But since all the matching is done with Ant-style patterns (which are quite flexible), we should be able to get around any conflicts.

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

I think the question of overlapping paths is a good one. If we allow administrators to bind stuff to arbitrary path prefixes, what happens when I have an experimental RESTful WMS service at /rest/wms/ and some user decides to create his map of all the rest stops in Kentucky at /rest/wms/ ? Either we have a situation where the user can create a configuration that silently replaces one service with another, or we have to provide some validation on service names that is aware of the path patterns and how they interact. Neither option seems very appealing to me.

So, personally, I am in favor of keeping user-defined path segments separate from developer-defined ones. (i.e., /some-specific-path/<name from config> and /some-specific-path/hard-coded-magic-id should very rarely be part of the same service.) Prefixes for different classes of service (ows, rest plugins, web pages) seem like a pretty manageable way to maintain that.

Just my 2 cents.

Sorry, not sure I 100% follow. So I think you are saying you would like it the way it is today, with a predefined closed set of top-level mappings?

David Winslow wrote:

On 12/17/2009 11:49 AM, Justin Deoliveira wrote:

No. Forget virtual services for the context of this problem :) Apologies for confusing two issues. This problem is:

a) to allow services to plug in under the root context without modifying web.xml

b) to allow for more complex mappings, i.e. /*/ows/** rather than just /ows/*

I think the question of overlapping paths is a good one. If we allow administrators to bind stuff to arbitrary path prefixes, what happens when I have an experimental RESTful WMS service at /rest/wms/ and some user decides to create his map of all the rest stops in Kentucky at /rest/wms/ ? Either we have a situation where the user can create a configuration that silently replaces one service with another, or we have to provide some validation on service names that is aware of the path patterns and how they interact. Neither option seems very appealing to me.

So, personally, I am in favor of keeping user-defined path segments separate from developer-defined ones. (ie, /some-specific-path/<name from config> and /some-specific-path/hard-coded-magic-id should very rarely be part of the same service.) Prefixes for different classes of service (ows, rest plugins, web pages) seem like a pretty manageable way to maintain that.

Just my 2 cents.

--
David Winslow
OpenGeo - http://opengeo.org/

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

No, I agree with you that it would be awesome to avoid requiring modifications to web.xml for extensions (and putting in a bunch of dead mappings that only kick in when certain extensions are present.)

But, I think we could be digging ourselves into a hole with freestyle mapping like what you are proposing. In particular, it gets hairy when there are wildcard expressions that overlap; in the example I brought up earlier there would be wildcards like:

/*/wms
/rest/*

The point that I was trying to get at was that (I think) we should avoid having wildcards/placeholders at the same level as predetermined text values in the route hierarchy. As a more concrete example, right now 'default' is a magic workspace name in the REST API, so when you make a request to
/rest/workspaces/foo you get the description for workspace foo, but when you request
/rest/workspaces/default you get something a little different. (I forget whether it is the representation of the workspace that happens to be default, or a special document that links to the default workspace.)

So we have to special-case 'default' there, and if an admin creates a workspace named "default" then he broke the REST API (or we have to add checks everywhere a workspace can be added/edited to avoid that). What I was trying to say is that we should try to avoid this sort of ambiguous path.
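The overlap can be shown with a small sketch. The matcher below is a deliberately simplified stand-in for Ant-style matching, not Spring's implementation: "*" matches exactly one path segment and every other segment must match literally. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified illustration of overlapping wildcard patterns; not Spring code.
class PatternOverlapSketch {

    // "*" matches any single segment; other segments must be equal.
    static boolean matches(String pattern, String path) {
        String[] patternParts = pattern.split("/");
        String[] pathParts = path.split("/");
        if (patternParts.length != pathParts.length) {
            return false;
        }
        for (int i = 0; i < patternParts.length; i++) {
            if (!patternParts[i].equals("*") && !patternParts[i].equals(pathParts[i])) {
                return false;
            }
        }
        return true;
    }

    // Collects every pattern that claims the given path.
    static List<String> matchingPatterns(List<String> patterns, String path) {
        List<String> hits = new ArrayList<>();
        for (String pattern : patterns) {
            if (matches(pattern, path)) {
                hits.add(pattern);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> patterns = Arrays.asList("/*/wms", "/rest/*");
        for (String path : Arrays.asList("/topp/wms", "/rest/workspaces", "/rest/wms")) {
            System.out.println(path + " -> " + matchingPatterns(patterns, path));
        }
    }
}
```

Here /rest/wms is claimed by both patterns, so which handler wins depends on the matcher's precedence rules rather than on anything visible in the URL.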

--
David Winslow
OpenGeo - http://opengeo.org/

On 12/17/2009 01:45 PM, Justin Deoliveira wrote:

Sorry, not sure I 100% follow. So I think you are saying you would like it the way it is today, with a predefined closed set of top-level mappings?

Cool, thanks for the clarification. And yeah I think you are right that it is just too slippery of a slope when you start having overlapping patterns like this.

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.