[Geoserver-devel] URL construction callbacks

Hi,

A short introduction is probably in order as I just subscribed to the list and to bring my proposal into context. My name is Sampo Savolainen and I work for the Information Centre for the Ministry of Agriculture and Forestry in Finland. We have decided to use WFS and WMS as the protocols to deliver spatial features and map images to our apps. Geoserver was chosen to be the WFS server and we have been working together with OpenGeo to strengthen the integration between Geoserver and Oracle.

Because of the nature of our applications, we need to impose strict authentication and authorization rules on the WFS layers. We are using our own authentication and authorization backend and my job is to integrate this system with Geoserver. My first attempt was a very involved and overly complex ServletFilter which had to analyze the requests and filter responses according to the user's authorization information. This approach works but is clunky, error prone and performs sub-optimally. I have now started work on another approach based on the DispatcherCallback and DataAccessManager mechanisms.

The basic flow is that the user authenticates out-of-band in relation to Geoserver. After successful authentication, the user receives a token which he/she must present with each WFS request. To be compliant with standard WFS clients, I want this token to be presented as a GET parameter to all WFS services. My problem is that Geoserver does not know of this token when it describes its services: the onlineResource fields in getCapabilities response messages, the schema locations in the responses to getFeature, etc. This can of course be handled with a ServletFilter which recognizes URLs in the responses and transforms them accordingly. However, it is not an optimal solution.

What I’m suggesting is a callback system which is called whenever Geoserver creates URLs. Like DataAccessManager is called to check whether the current user can access a certain resource, this URLConstructionCallback would be called whenever a URL is constructed. It can then decide to modify the URL if necessary.

I think this addition could prove useful not only to me but also on a more generic level. In fact, Geoserver already has a need for such a system: “proxified urls”. Currently there’s a utility method to proxify URLs: org.geoserver.ows.util.RequestUtils.proxifiedBaseURL(). This method seems to be called everywhere a URL pointing back to Geoserver is created. I think this mechanism would be much better suited to a callback system. There’s also a ReverseProxyFilter which I’m not that familiar with, but it seems like it’s doing the same thing. This callback system could remove the need for the filter altogether if I’ve understood its purpose correctly.

A rough sketch of the proposed API:

public interface URLConstructionCallback
{
    public enum URLType {
        EXTERNAL, // The link points outside Geoserver
        RESOURCE, // The link points to a static resource (image, ogc schema, etc.)
        SERVICE   // The link points to a dynamic service provided by Geoserver (WFS, WMS, WCS, etc.)
    }

    // May modify the url buffer and/or the kvp map in place.
    public void constructUrl(StringBuffer url, Map kvp, URLType type);
}

The constructUrl() callback method could modify the base url and/or the key-value-pair map. The URLType parameter and Request, Service and Response objects (which are available through ThreadLocals via static methods, right?) would provide information about the function and context of the URL in construction.
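
A minimal sketch of what such a callback could look like for the token use case described above. The class name TokenCallback, the parameter name "token" and the choice to touch only SERVICE links are assumptions made for illustration; generics are added to the proposed interface for type safety.

```java
import java.util.Map;

// The proposed interface, with generics added for type safety.
interface URLConstructionCallback {
    enum URLType { EXTERNAL, RESOURCE, SERVICE }

    void constructUrl(StringBuffer url, Map<String, String> kvp, URLType type);
}

// Hypothetical callback: adds an auth token to links that point back to a service.
class TokenCallback implements URLConstructionCallback {
    // In practice the token would be looked up per request; it is fixed here for brevity.
    private final String token;

    TokenCallback(String token) {
        this.token = token;
    }

    @Override
    public void constructUrl(StringBuffer url, Map<String, String> kvp, URLType type) {
        // Only service links get the token, so it never leaks to external servers.
        if (type == URLType.SERVICE) {
            kvp.put("token", token);
        }
    }
}
```

The callback leaves the URL buffer untouched and only adds a parameter, which is why the kvp map is useful as a separate argument.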

The kvp could be optional and we could allow GET parameters to be already present in the URL string. This could make the use of such API easier.

The utility method for creating URLs would iterate the URL over spring registered callback objects:

public String constructURL(String baseUrl, Map kvp, URLType type)
{
    // No callbacks registered: just append the parameters.
    if (urlConstructionCallbacks == null || urlConstructionCallbacks.size() == 0) {
        return appendGetParameters(baseUrl, kvp);
    }

    StringBuffer tmp = new StringBuffer(baseUrl);

    // Each callback may modify the url buffer and/or the kvp map in place.
    for (URLConstructionCallback callback : urlConstructionCallbacks) {
        callback.constructUrl(tmp, kvp, type);
    }

    return appendGetParameters(tmp.toString(), kvp);
}

// helper methods could be created to shield using classes from importing the URLType enum.
public String constructExternalURL(String baseUrl, Map kvp)
{
    return constructURL(baseUrl, kvp, URLType.EXTERNAL);
}
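
The appendGetParameters() helper used above is not spelled out in the proposal. One possible sketch, assuming the URL string may already carry GET parameters (as suggested above) and that keys and values should be form-encoded as UTF-8; the class name UrlParams is hypothetical:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

class UrlParams {
    // Appends kvp to url as a query string, reusing an existing '?' if there is one.
    static String appendGetParameters(String url, Map<String, String> kvp) {
        if (kvp == null || kvp.isEmpty()) {
            return url;
        }
        StringBuilder sb = new StringBuilder(url);
        // A URL ending in '?' or '&' is already waiting for its next parameter.
        boolean needsSeparator = !(url.endsWith("?") || url.endsWith("&"));
        String separator = url.contains("?") ? "&" : "?";
        for (Map.Entry<String, String> entry : kvp.entrySet()) {
            if (needsSeparator) {
                sb.append(separator);
            }
            sb.append(encode(entry.getKey())).append('=').append(encode(entry.getValue()));
            separator = "&";
            needsSeparator = true;
        }
        return sb.toString();
    }

    // Form-encodes a key or value (spaces become '+').
    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

With this shape, a caller can pass either a bare URL plus a kvp map or a URL that already contains some parameters, and both end up as one well-formed query string.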

Comments? Improvements to the API?

Thanks for reading,
Sampo Savolainen

Savolainen Sampo wrote:

Hi,
A short introduction is probably in order as I just subscribed to the list and to bring my proposal into context. My name is Sampo Savolainen and I work for the Information Centre for the Ministry of Agriculture and Forestry in Finland. We have decided to use WFS and WMS as the protocols to deliver spatial features and map images to our apps. Geoserver was chosen to be the WFS server and we have been working together with OpenGeo to strengthen the integration between Geoserver and Oracle.

Yep. I take the occasion to thank you for sponsoring the Oracle store
improvements as well as the GeoServer changes needed to make it feasible
to work with large amounts of data in Oracle (native maxFeatures -> limit
support, being able to work without native bbox, just to name a couple).

Because of the nature of our applications, we need to impose strict authentication and authorization rules on the WFS layers. We are using our own authentication and authorization backend and my job is to integrate this system with Geoserver. My first attempt was a very involved and overly complex ServletFilter which had to analyze the requests and filter responses according to the user's authorization information. This approach works but is clunky, error prone and performs sub-optimally. I have now started work on another approach based on the DispatcherCallback and DataAccessManager mechanisms.
The basic flow is that the user authenticates out-of-band in relation to Geoserver. After successful authentication, the user receives a token which he/she must present with each WFS request. To be compliant with standard WFS clients, I want this token to be presented as a GET parameter to all WFS services. My problem is that Geoserver does not know of this token when it describes its services: the onlineResource fields in getCapabilities response messages, the schema locations in the responses to getFeature, etc. This can of course be handled with a ServletFilter which recognizes URLs in the responses and transforms them accordingly. However, it is not an optimal solution.

The issue is that it's hard to perform streaming filtering and changing
of those back links, right? I think in a normal situation you end up
caching a DOM in memory.
As I suggested offline, I think this transformation can be done
via XSLT. It seems that StAX can be coupled with XSLT to make the
transformation streaming, though I'm not sure if it can be used
with the XML libraries we bundle with GeoServer (e.g., specific versions
of Xalan and Xerces).

What I'm suggesting is a callback system which is called whenever Geoserver creates URLs. Like DataAccessManager is called to check whether the current user can access a certain resource, this URLConstructionCallback would be called whenever a URL is constructed. It can then decide to modify the URL if necessary.
I think this addition could prove useful not only to me, but it could be useful on a more generic level. In fact, Geoserver already has a need for such a system: "proxified urls". Currently there's a utility method to proxify URLs: org.geoserver.ows.util.RequestUtils.proxifiedBaseURL(). This method seems to be called everywhere a URL pointing back to geoserver is created. I think this mechanism would be much better suited to a callback system.

Having people remember to use the callbacks every time seems a little
unlikely, but we can make a single proxifier method that looks for one or more callbacks in the spring context. Something like:

RequestUtils.buildURL(String baseURL, String path, Map kvp, URLType type) -> String

which internally would first proxify the requests, and then call in
order whatever url construction callbacks are there.

There's also a ReverseProxyFilter which I'm not that familiar with, but it seems like it's doing the same thing. This callback system could remove the need for the filter altogether if I've understood its purpose correctly.

ReverseProxyFilter is there to handle HTML transformation.
The theory behind it (and the reason it's disabled by default) is that if
you have an HTML-aware proxy, GeoServer should not do anything and let
the proxy do its work (an example of such a thing is the libapache2-mod-proxy-html package that ships with recent distributions).
Explicit handling of XML backlinks in code is there because we're
not aware of any XML-aware proxy that would change the XML contents
on the fly, so we do it "built in".

A rough sketch of the proposed API:
public interface URLConstructionCallback
{
    public enum URLType {
        EXTERNAL, // The link points outside Geoserver
        RESOURCE, // The link points to a static resource (image, ogc schema, etc.)
        SERVICE   // The link points to a dynamic service provided by Geoserver (WFS, WMS, WCS, etc.)
    }

    public void constructUrl(StringBuffer url, Map kvp, URLType type);
}
The constructUrl() callback method could modify the base url and/or the key-value-pair map. The URLType parameter and Request, Service and Response objects (which are available through ThreadLocals via static methods, right?) would provide information about the function and context of the URL in construction.

Yeah, in your case if I understood it right you would add a KVP
parameter to the URL that is used to specify the auth token.
The dispatcher callback would put the param into a thread local that
would be then used by the url construction callback to modify the
URL.
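
The thread-local hand-off described here could be sketched roughly as follows. AuthTokenHolder and TokenFromThreadLocal are hypothetical names, and the wiring into the real DispatcherCallback hooks is left out: the assumption is only that something calls set() when the request arrives and clear() when it finishes.

```java
import java.util.Map;

// Hypothetical holder for the token of the request handled by the current thread.
final class AuthTokenHolder {
    private static final ThreadLocal<String> TOKEN = new ThreadLocal<String>();

    private AuthTokenHolder() {
    }

    // Called by the dispatcher callback once the token has been read from the request.
    static void set(String token) {
        TOKEN.set(token);
    }

    static String get() {
        return TOKEN.get();
    }

    // Must be called when the request finishes, because containers reuse threads.
    static void clear() {
        TOKEN.remove();
    }
}

class TokenFromThreadLocal {
    // Stand-in for the body of a URL construction callback.
    static void addToken(Map<String, String> kvp) {
        String token = AuthTokenHolder.get();
        if (token != null) {
            kvp.put("token", token);
        }
    }
}
```

Clearing the holder at the end of each request matters: without it, a pooled thread could carry one user's token into the next user's response.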

The kvp could be optional and we could allow GET parameters to be already present in the URL string. This could make the use of such API easier.

public String constructURL(String baseUrl, Map kvp, URLType type)
{
    if (urlConstructionCallbacks == null || urlConstructionCallbacks.size() == 0) {
        return appendGetParameters(baseUrl, kvp);
    }
    StringBuffer tmp = new StringBuffer(baseUrl);
    for (URLConstructionCallback callback : urlConstructionCallbacks) {
        callback.constructUrl(tmp, kvp, type);
    }
    return appendGetParameters(tmp.toString(), kvp);
}
// helper methods could be created to shield using classes from importing the URLType enum.
public String constructExternalURL(String baseUrl, Map kvp)
{
    return constructURL(baseUrl, kvp, URLType.EXTERNAL);
}

Ok, so the callback is actually allowed to change both the
url stringbuffer and the kvp map contents?
For proxy handling I guess we want to separate the
baseUrl from the path. For example, in a typical backlink you'd have:
base url: http://host:8080/geoserver
path: /wms

Thinking about it a bit more, I'm not sure it's really necessary to
pass around the base url explicitly... now that we have a thread
local representing the request we can access the base url at any
time from any piece of code.
Even the code that is not changing the URL to handle the proxies
could just be implemented as a url callback (one that changes the
url instead of the parameters).
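
To illustrate the idea of proxy handling as just another callback, here is a rough sketch against the interface proposed earlier in the thread (with generics added). The class name ProxyBaseCallback and both example base URLs are assumptions; how the proxy base is configured is not covered.

```java
import java.util.Map;

// The proposed interface, with generics added for type safety.
interface URLConstructionCallback {
    enum URLType { EXTERNAL, RESOURCE, SERVICE }

    void constructUrl(StringBuffer url, Map<String, String> kvp, URLType type);
}

// Hypothetical callback that swaps the local base URL for the public proxy one.
class ProxyBaseCallback implements URLConstructionCallback {
    private final String localBase; // e.g. "http://localhost:8080/geoserver"
    private final String proxyBase; // e.g. "https://maps.example.org/geoserver"

    ProxyBaseCallback(String localBase, String proxyBase) {
        this.localBase = localBase;
        this.proxyBase = proxyBase;
    }

    @Override
    public void constructUrl(StringBuffer url, Map<String, String> kvp, URLType type) {
        // Links leaving the server are not behind the proxy, so leave them alone.
        if (type == URLType.EXTERNAL) {
            return;
        }
        // Rewrite only URLs that actually start with the local base.
        if (url.indexOf(localBase) == 0) {
            url.replace(0, localBase.length(), proxyBase);
        }
    }
}
```

This callback changes the URL and ignores the parameters, the mirror image of a token callback that changes the parameters and ignores the URL.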

Sounds good enough to me... what do others think?
Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Hi All,

so, if I got it right, you want to achieve a configurable means of appending to the URLs GeoServer generates, and the callback mechanism you proposed seems really appropriate to me.

Still, I feel that this URL callback mechanism, which modifies the CGI-like parameters of the generated URLs, and the reverse proxy servlet filter remain orthogonal issues.
That is, provided every programmatically generated URL is built using the callbacks, there will still be URLs in HTML content that are not caught, because they are not generated programmatically but statically. For example, links to GetCapabilities from JavaScript files, demos, etc. are usually static and relative. Moreover, the demos are going to be pluggable, and hence references to service endpoints are generally static. This pluggability may also end up being a way for deployers to supply custom applications with GeoServer.

So, I don't see this as a big concern really. I just think the callback mechanism is really appropriate _and_ should _also_ be used by the HTML URL rewriting filter in order to catch any missed URLs, instead of replacing it.

The only open question then would be what happens when there is at least one URL construction callback configured _and_ the HTML reverse proxy filter is disabled (as in a setup that avoids it in favor of, say, Apache's mod_proxy_html). Maybe the answer is simply that the filter should engage anyway whenever there's a callback configured in the app context?

My 2c.-

Gabriel

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

-----Original message-----
From: Gabriel Roldan [mailto:groldan@anonymised.com]
Sent: 24 August 2009 17:16
To: Andrea Aime
Cc: Savolainen Sampo; geoserver-devel@lists.sourceforge.net
Subject: Re: [Geoserver-devel] URL construction callbacks

Hi All,

so, if I got it right, you want to achieve a configurable
means of appending to the URLs GeoServer generates, and the
callback mechanism you proposed seems really appropriate to me.

Yes. Configurable in the sense that one can add code which can relatively freely mangle said URLs.

Still, I feel that this URL callback mechanism, which modifies the
CGI-like parameters of the generated URLs, and the reverse
proxy servlet filter remain orthogonal issues.

I had misunderstood the function of the reverse proxy filter. You can safely ignore my uninformed ramblings about the filter. :)

So, I don't see this as a big concern really. I just think
the callback mechanism is really appropriate _and_ should
_also_ be used by the HTML URL rewriting filter in order to
catch any missed URLs, instead of replacing it.

The only open question then would be what happens when there
is at least one URL construction callback configured _and_
the HTML reverse proxy filter is disabled (as in a setup that
avoids it in favor of, say, Apache's mod_proxy_html). Maybe
the answer is simply that the filter should engage anyway
whenever there's a callback configured in the app context?

Unfortunately I do not understand enough of the internals of Geoserver to really comment on this. I would only expect the optimum result to be that the default configuration of Geoserver works the same as it does currently, the only difference being a "feature transfer" to a set of default URLConstructionCallback implementations, plus the added ability to attach more URLConstructionCallbacks.

Any comments on the API? The biggest question is whether having a kvp Map is overkill or not. And if not, should it still be possible to have GET parameters appended into the URL without them being in the kvp?

Sampo

Any comments on the API? The biggest question is whether having a kvp
Map is overkill or not. And if not, should it still be possible to
have GET parameters appended into the URL without them being in the
kvp?

Question: what is your intended use for URLType? I see you may want to append or not append your auth token based on whether the URL points outside the server or not... still, I don't think there are programmatically created URLs pointing outside the server? (at least none that I can think of off the top of my head)

API wise, in principle the approach is correct. Now that we settled down the separation of concerns between service response urls (usually schema locations, either static or dynamically generated like DescribeFeatureType), the fact is that all URL constructions in GeoServer go through the RequestUtils.baseUrl(HttpServletRequest) method, so it seems that's where to inject the extension point?

I've cooked up the following patch, which hopefully simplifies the API a bit while still providing enough context to create your callback: <http://pastebin.com/m24ed06a0>

Please tell me if it seems appropriate or am I missing something.

Best regards,

Gabriel

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Gabriel Roldan wrote:

Any comments on the API? The biggest question is whether having a kvp
Map is overkill or not. And if not, should it still be possible to
have GET parameters appended into the URL without them being in the
kvp?

Question: what is your intended use for URLType? I see you may want to append or not append your auth token based on whether the URL points outside the server or not... still, I don't think there are programmatically created URLs pointing outside the server? (at least none that I can think of off the top of my head)

Right, at the moment we don't, but in theory we could, or new
services might have a need to. With REST extensions and the work
David is doing on integrating Javascript and the interest in other
scripting languages I think it's fair to assume we'll see other
people needing to generate links in their responses: we cannot
rule out that some of those will point outwards.

API wise, in principle the approach is correct. Now that we settled down the separation of concerns between service response urls (usually schema locations, either static or dynamically generated like DescribeFeatureType), the fact is that all URL constructions in GeoServer go through the RequestUtils.baseUrl(HttpServletRequest) method, so it seems that's where to inject the extension point?

I've cooked up the following patch, which hopefully simplifies the API a bit while still providing enough context to create your callback: <http://pastebin.com/m24ed06a0>

Please tell me if it seems appropriate or am I missing something.

The spirit of the above is right, but the implementation as is won't
work. Sampo needs to add a kvp parameter to the request, whereas the existing
code uses the proxifier just to change the base url and then
adds an extension path and kvp params.
That needs to be changed so that the code needing to build a URL
passes the base path (http://host:port/geoserver), the extension
to the base (/wms), whatever kvp parameter (request=GetMap) so that
the utility method building the final url can:
- change the base path if needed
- change the URL params if needed
- assemble a final URL

The changes above would be done by calling the URL callbacks (the one that changes the base path would be just another callback).
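
A sketch of how such a utility method could look once base, path and parameters are passed separately. The URLCallback interface, its mangle() method and the UrlBuilder class are illustrative names only, not the patch under discussion; parameter values are assumed to be URL-safe already, and the base is assumed to have no trailing slash.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

enum URLType { EXTERNAL, RESOURCE, SERVICE }

// Hypothetical callback shape: each piece of the URL can be changed independently.
interface URLCallback {
    void mangle(StringBuilder base, StringBuilder path, Map<String, String> kvp, URLType type);
}

class UrlBuilder {
    static String buildURL(String base, String path, Map<String, String> kvp,
                           URLType type, List<URLCallback> callbacks) {
        StringBuilder b = new StringBuilder(base);
        StringBuilder p = new StringBuilder(path);
        // Work on a copy so the caller's map is never modified.
        Map<String, String> params = new LinkedHashMap<String, String>();
        if (kvp != null) {
            params.putAll(kvp);
        }
        // Steps 1 and 2: let every callback change the base, path or parameters.
        for (URLCallback callback : callbacks) {
            callback.mangle(b, p, params, type);
        }
        // Step 3: assemble the final URL.
        StringBuilder url = new StringBuilder(b).append(p);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator).append(entry.getKey()).append('=').append(entry.getValue());
            separator = '&';
        }
        return url.toString();
    }
}
```

With this split, the proxy rewrite and the token parameter are two independent callbacks in the same list, and call sites only ever see buildURL().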

What needs doing is to add the above API and then change the call
points that build a URL so that they use the new API (there are
around 40 calling points in the code base).

Makes sense?

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

To chime in with my thoughts on this one:

I like the idea a lot. How we handle proxying URLs is something I have never been that happy with. So if I understand correctly (and to sum up), what this entails is:

1) Coming up with the "mangle" method, RequestUtils.buildURL() or constructURL(). I actually prefer the name "postMangleURL()" as it seems more descriptive of what is actually occurring.

2) Coming up with an extension point (URLConstructionCallback). Again for naming purposes something like "URLMangler" seems more descriptive.

3) Every call to "proxifiedBaseURL()" gets replaced with "buildURL()" (or whatever it gets called).

4) The impl of "buildURL()" does an extension lookup and calls every url callback, and also explicitly invokes the Proxifying url callback.

As I understand it, this sounds good to me.

Open question is does this require a GSIP?

-Justin

Andrea Aime wrote:

Savolainen Sampo ha scritto:

Hi,
A short introduction is probably in order as I just subscribed to the list and to bring my proposal into context. My name is Sampo Savolainen and I work for the Information Centre for the Ministry of Agriculture and Forestry in Finland. We have decided to use WFS and WMS as the protocols to deliver spatial features and map images to our apps. Geoserver was chosen to be the WFS server and we have been working together with OpenGeo to strengthen the integration between Geoserver and Oracle.

Yep. I take the occasion to thank you for sponsoring the Oracle store
improvements as well as GeoServer changes needed to make it feasible
to work with big amounts of data in Oracle (native maxFeaure -> limit
support, being able to work without native bbox just to name a couple).

Because of the nature of our applications, we need to impose strict authentication and authorization rules on the WFS layers. We are using our own authentication and authorization backend and my job is to integrate this system to Geoserver. My first attempt was a very involved and overly complex ServletFilter which had to analyze the requests and filter responses according to the users authorization information. This approach works but is clunky, error prone and performs sub-optimally. I now started work on another approach based on DispatcherCallback and DataAccessManager mechanisms.
The basic flow is that the user authenticates out-of-band in relation to geoserver. After successful authentication, the user receives a token which he/she must present with each WFS request. To be compliant with standard WFS clients, I want this token to be presented as a GET parameter to all WFS services. My problem is that Geoserver does not know of this token when it describes its' services: the onlineResource fields in getCapabilities response messages, the schema locations in the responses to getFeature etc. etc. This can of course be handled with a ServletFilter which recognizes URLs in the responses and transforms them accordingly. However it is not an optimal solution.

The issue is that it's hard to perform streaming filtering and changing
of those back links, right? I think in a normal situation you end up
caching in memory a DOM.
As I suggested off line, I think this transformation can be done
via XSLT. It seems that Stax can be coupled with XSLT to make the
transformation streaming, thought I'm not sure if it can be used
with the xml libraries we bundle with GeoServer (e.g., specific versions
of Xalan and Xerces).

What I'm suggesting is a callback system which is called whenever Geoserver creates URLs. Like DataAccessManager is called to check whether the current user can access a certain resource, this URLConstructionCallback would be called whenever a URL is constructed. It can then decide to modify the URL if necessary.
I think this addition could prove useful not only to me, but it could be useful on a more generic level. In fact, Geoserver already has a need for such a system: "proxified urls". Currently there's a utility method to proxify URLs: org.geoserver.ows.util.RequestUtils.proxifiedBaseURL(). This method seems to be called everywhere a URL pointing back to geoserver is created. I think this mechanism would be much better suited to a callback system.

Having people remember to use the callbacks every time seems a little
unlikely, but we can make a single proxifier method that looks for one or more callbacks in the spring context. Something like:

RequestUtils.buildURL(String baseURL, String path, Map kvp, URLType type) -> String

which internally would first proxify the requests, and then call in
order whatever url contruction callback is there.

There's also a ReverseProxyFilter which I'm not that familiar with but it seems like it's doing the same thing. This callback system could remove the need for the filter altogether if I've understood its' purpose correctly.

ReverseProxyFilter is there to handle HTML transformation.
The theory behind it (and the fact it's disable by default) is that if
you have a HTML aware proxy GeoServer should not do anything and let
the proxy do its work (a sample of such thing is thelibapache2-mod-proxy-html that is shipping with recent distributions).
XML backlinks explicit handling in code is there because we're
not aware of any XML aware proxy that would change the xml contents
on the fly, so we do it "built in".

A rough sketch of the proposed API:
public interface URLConstructionCallback
{
    public enum URLType {
        EXTERNAL, // The link points outside Geoserver
        RESOURCE, // The link points to a static resource (image, ogc schema, etc.)
        SERVICE // The link points to a dynamic service provided by Geoserver (WFS, WMS, WCS, etc.)
    };
     public void constructUrl(StringBuffer url, Map kvp, URLType type);
}
The constructUrl() callback method could modify the base url and/or the key-value-pair map. The URLType parameter and Request, Service and Response objects (which are available through ThreadLocals via static methods, right?) would provide information about the function and context of the URL in construction.

Yeah, in your case if I understood it right you would add a KVP
parameter to the URL that is used to specify the auth token.
The dispatcher callback would put the param into a thread local that
would be then used by the url construction callback to modify the
URL.
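A minimal sketch of that flow, under some assumptions: the thread local, the TokenCallback class and the "token" parameter name are all hypothetical, and the interface only mirrors the shape proposed above. Skipping EXTERNAL links is also just a guess at what one would want, so that the token does not leak to other servers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names are confirmed GeoServer API.
class TokenCallbackSketch {

    enum URLType { EXTERNAL, RESOURCE, SERVICE }

    // Same shape as the interface proposed in this thread.
    interface URLConstructionCallback {
        void constructUrl(StringBuffer url, Map<String, String> kvp, URLType type);
    }

    // Holds the token of the request handled by the current thread.
    // A dispatcher callback would set it when the request starts and
    // clear it when the request finishes.
    static final ThreadLocal<String> AUTH_TOKEN = new ThreadLocal<>();

    // Adds the token as a KVP parameter; "token" is an assumed parameter name.
    static class TokenCallback implements URLConstructionCallback {
        @Override
        public void constructUrl(StringBuffer url, Map<String, String> kvp, URLType type) {
            if (type == URLType.EXTERNAL) {
                return; // assumption: never send the token to other servers
            }
            String token = AUTH_TOKEN.get();
            if (token != null) {
                kvp.put("token", token);
            }
        }
    }

    // Simulates one request: set the token, let the callback run, clean up.
    static Map<String, String> demo(String token, URLType type) {
        AUTH_TOKEN.set(token);
        try {
            Map<String, String> kvp = new LinkedHashMap<>();
            kvp.put("request", "GetCapabilities");
            StringBuffer url = new StringBuffer("http://host:8080/geoserver/wfs");
            new TokenCallback().constructUrl(url, kvp, type);
            return kvp;
        } finally {
            AUTH_TOKEN.remove();
        }
    }
}
```

The callback only touches the kvp map here and leaves the URL buffer alone, which keeps it independent of whatever proxy handling runs before or after it.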

The kvp could be optional and we could allow GET parameters to be already present in the URL string. This could make the use of such API easier.
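To make the "optional kvp, parameters possibly already in the URL" idea concrete, here is one way the appendGetParameters() helper used in the code below might look. This is only a sketch: the helper is not defined anywhere in this thread, so its behaviour (null or empty map allowed, values URL-encoded as UTF-8) is an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical helper; the name appendGetParameters comes from the sketch
// in this thread, the behaviour shown here is assumed.
class QueryStringSketch {

    static String appendGetParameters(String url, Map<String, String> kvp) {
        if (kvp == null || kvp.isEmpty()) {
            return url; // the kvp map is optional
        }
        StringBuilder sb = new StringBuilder(url);
        // The URL may already carry GET parameters.
        boolean hasQuery = url.indexOf('?') >= 0;
        boolean needsSeparator = !(url.endsWith("?") || url.endsWith("&"));
        for (Map.Entry<String, String> e : kvp.entrySet()) {
            if (needsSeparator) {
                sb.append(hasQuery ? '&' : '?');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            hasQuery = true;
            needsSeparator = true;
        }
        return sb.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always supported by the JVM
            throw new IllegalStateException(ex);
        }
    }
}
```

With this, a caller can pass either "http://host/geoserver/wms" or "http://host/geoserver/wms?service=WMS" and get a well-formed query string in both cases.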

public String constructURL(String baseUrl, Map kvp, URLType type)
{
    // No callbacks registered: just append the parameters.
    if (urlConstructionCallbacks == null || urlConstructionCallbacks.isEmpty()) {
        return appendGetParameters(baseUrl, kvp);
    }
    StringBuffer tmp = new StringBuffer(baseUrl);
    // Each callback may modify the URL buffer and/or the kvp map.
    for (URLConstructionCallback callback : urlConstructionCallbacks) {
        callback.constructUrl(tmp, kvp, type);
    }
    return appendGetParameters(tmp.toString(), kvp);
}
// helper methods could be created to shield using classes from importing the URLType enum.
public String constructExternalURL(String baseUrl, Map kvp)
{
    return constructURL(baseUrl, kvp, URLType.EXTERNAL);
}

Ok, so the callback is actually allowed to change both the
url stringbuffer and the kvp map contents?
For proxy handling I guess we want to separate the
baseUrl from the path. For example, in a typical backlink you'd have:
base url: http://host:8080/geoserver
path: /wms

Thinking about it a bit more, I'm not sure it's really necessary to
pass around the base url explicitly... now that we have a thread
local representing the request we can access the base url at any
time from any piece of code.
Even the code that is now changing the URL to handle the proxies
could just be implemented as a url callback (one that changes the
url instead of the parameters).

Sounds good enough to me... what do others think?
Cheers
Andrea

--
Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

Justin Deoliveira wrote:

To chime in with my thoughts on this one:

I like the idea a lot. How we handle proxying urls is something I have never been that happy with. So if I understand correctly (and to sum up), what this entails is:

1) coming up with the "mangle" method, RequestUtils.buildURL() or constructURL(). I actually prefer the name "postMangleURL()" as it seems more descriptive to what is actually occurring.

2) Coming up with an extension point (URLConstructionCallback). Again for naming purposes something like "URLMangler" seems more descriptive.

3) Every call to "proxifiedBaseURL()" gets replaced with "buildURL()" (or whatever it gets called).

Roger on the names

4) The impl of "buildURL()" does an extension lookup and calls every url callback, and also explicitly invokes the proxifying url callback.

I think the url proxification can be made as a plugin (just one that's
always there, part of core)

As I understand it, this sounds good to me.

Open question is does this require a GSIP?

Good question. The change is not big, but affects every single module
in the code base (~40 call points). So yeah, I'll write one.

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Sounds good to me. I don't see any drawback.

Cheers,
Gabriel
Andrea Aime wrote:

Gabriel Roldan wrote:

Any comments on the API? The biggest question is whether having a kvp
Map is overkill or not. And if not, should it still be possible to
have GET parameters appended into the URL without them being in the
kvp?

question: what is your intended use for URLType? I see you may want to append your auth token or not based on whether the url points outside the server... still, I don't think there are any programmatically created URLs pointing outside the server? (at least none that I can think of off the top of my head)

Right, at the moment we don't, but in theory we could, or new
services might have a need to. With REST extensions and the work
David is doing on integrating Javascript and the interest in other
scripting languages I think it's fair to assume we'll see other
people needing to generate links in their responses: we cannot
rule out that some of those will point outwards.

API wise, in principle the approach is correct. Now that we settled down the separation of concerns between service response urls (usually schema locations, either static or dynamically generated like DescribeFeatureType), the fact is that all URL constructions in GeoServer go through the RequestUtils.baseUrl(HttpServletRequest) method, so it seems that's where to inject the extension point?

I've cooked up the following patch, which hopefully may simplify the API a bit and still provide enough context to create your callback: <http://pastebin.com/m24ed06a0>

Please tell me if it seems appropriate or if I am missing something.

The spirit of the above is right, but the implementation as is won't
work. Sampo needs to add a kvp parameter to the request, whereas the existing
code uses the proxifier just to change the base url and then
adds an extension path and kvp params.
That needs to be changed so that the code needing to build a URL
passes the base path (http://host:port/geoserver), the extension
to the base (/wms), whatever kvp parameter (request=GetMap) so that
the utility method building the final url can:
- change the base path if needed
- change the URL params if needed
- assemble a final URL

The changes above would be done by calling the URL callbacks (the one that changes the base path would be just another callback).
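The three steps above could be sketched roughly as follows. Everything here is an assumption for illustration: the URLMangler name follows Justin's suggestion, but the mangleURL() signature, the ProxyBaseMangler class and passing the manglers in as a list (instead of looking them up in the Spring context) are made up for the example. Parameter values are assumed to be URL-safe already, so no encoding is done.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the build-URL utility; not actual GeoServer code.
class BuildUrlSketch {

    enum URLType { EXTERNAL, RESOURCE, SERVICE }

    // A mangler may change any of the three parts in place.
    interface URLMangler {
        void mangleURL(StringBuilder baseURL, StringBuilder path,
                       Map<String, String> kvp, URLType type);
    }

    // The proxy handling as "just another callback": it swaps the base URL.
    static class ProxyBaseMangler implements URLMangler {
        private final String proxyBase;

        ProxyBaseMangler(String proxyBase) {
            this.proxyBase = proxyBase;
        }

        @Override
        public void mangleURL(StringBuilder baseURL, StringBuilder path,
                              Map<String, String> kvp, URLType type) {
            if (proxyBase != null && type != URLType.EXTERNAL) {
                baseURL.setLength(0);
                baseURL.append(proxyBase);
            }
        }
    }

    static String buildURL(String baseURL, String path, Map<String, String> kvp,
                           URLType type, List<URLMangler> manglers) {
        StringBuilder base = new StringBuilder(baseURL);
        StringBuilder p = new StringBuilder(path == null ? "" : path);
        Map<String, String> params = new LinkedHashMap<>();
        if (kvp != null) {
            params.putAll(kvp); // work on a copy, the caller's map stays untouched
        }
        // Steps 1 and 2: let every mangler change base, path and params.
        for (URLMangler mangler : manglers) {
            mangler.mangleURL(base, p, params, type);
        }
        // Step 3: assemble, joining base and path with exactly one slash.
        StringBuilder out = new StringBuilder(base);
        if (p.length() > 0) {
            boolean baseSlash = out.length() > 0 && out.charAt(out.length() - 1) == '/';
            boolean pathSlash = p.charAt(0) == '/';
            if (baseSlash && pathSlash) {
                out.setLength(out.length() - 1);
            } else if (!baseSlash && !pathSlash) {
                out.append('/');
            }
            out.append(p);
        }
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            out.append(separator).append(e.getKey()).append('=').append(e.getValue());
            separator = '&';
        }
        return out.toString();
    }
}
```

The manglers run in list order, so a token-adding mangler placed after the proxy mangler sees the already proxified base URL.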

What needs doing is to add the above API and then change the call
points that build a URL so that they use the new API (there are
around 40 calling points in the code base).

Makes sense?

Cheers
Andrea

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.

replying to this thread as it's the one designated for feedback.

First, the minimalist and naive one: use StringBuilder instead of StringBuffer for a little extra speed :)

Then the one I've been thinking about:
stating that "The current "proxification" mechanism will be a URLMangler changing only the baseURL." may be an oversimplification. That will only work for URLs that are created programmatically, but the current proxification also deals with those that are not.

What ReverseProxyFilter does can basically be reduced to a two-step process, for each line of filtered content (html, css, javascript, etc):

translatedLine = translatedLine.replaceAll(serverBase, proxyBase);
translatedLine = translatedLine.replaceAll(context, proxyContext);

It is not as if it calls RequestUtils for each URL found, and I think doing so in order to call RequestUtils.buildURL _may_ be overkill. Moreover, I'm not sure it would work, since many of them lack base/context and are only <path>?<params>
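To illustrate the two-step rewrite and its limits, here is a tiny stand-alone sketch. It is not the actual ReverseProxyFilter code: the class and method names are made up, and it uses String.replace (literal matching) rather than replaceAll, which would treat serverBase and context as regular expressions.

```java
// Hypothetical sketch of the per-line rewrite described above.
class LineRewriteSketch {

    static String translateLine(String line, String serverBase, String proxyBase,
                                String context, String proxyContext) {
        // Step 1: replace protocol/server/port with the proxy's.
        String translated = line.replace(serverBase, proxyBase);
        // Step 2: replace the servlet context with the proxy's context.
        translated = translated.replace(context, proxyContext);
        return translated;
    }
}
```

An absolute link such as http://localhost:8080/geoserver/wms gets both parts rewritten, a link starting with /geoserver only gets the context rewritten, and a relative link like wms?request=GetMap passes through untouched, which is why plain string replacement cannot add parameters to it.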

What I've been trying to figure out is how much of ReverseProxyFilter's functionality could be abstracted out to this URLMangler.

The reverse proxy filter only cares about base and context. Currently it affects only two aspects of URLs: base (protocol, server, port) and context (as in the servlet context of the web app).

With these new callbacks it'll also need to take care of path and params, so it can change "../wms?request=GetCapabilities" into "../wms?request=GetCapabilities&myCustomToken=something"

Calling RequestUtils.buildURL for each and every URL found in html/js/css resources may require a smarter, more structured (and slower) parsing of the content in order to actually extract the URLs instead of a simple String.replaceAll... So I guess we'll end up calling RequestUtils.buildURL(GeoServer.getProxyBaseUrl()) as it's being done right now and then extend the servlet filter so it also appends any extra CGI parameters added by the callbacks?

Aside, the EXTERNAL, RESOURCE, SERVICE "enum" might be augmented with an UNKNOWN? (like in when "reverse-proxifying" I don't know and don't really care?)

My 2c.-

Gabriel

Gabriel Roldan wrote:

replying to this thread as it's the one designated for feedback.

First, the minimalist and naive one: use StringBuilder instead of StringBuffer for a little extra speed :)

Well, I really want to see if this is measurable at all in our requests ;)

Then the one I've been thinking about:
stating that "The current "proxification" mechanism will be a URLMangler changing only the baseURL." may be an oversimplification. That will only work for URLs that are created programmatically, but the current proxification also deals with those that are not.

The GSIP is only about those programmatically built, for the
responses. I did not envision any relationship with the proxy filter.
Not sure where you got the impression it was otherwise.

I think Sampo does not really need the UI to be handled,
he's interested in keeping standard WMS/WFS clients working
in the face of their custom authentication.

What ReverseProxyFilter does can basically be reduced to a two-step process, for each line of filtered content (html, css, javascript, etc):

translatedLine = translatedLine.replaceAll(serverBase, proxyBase);
translatedLine = translatedLine.replaceAll(context, proxyContext);

It is not as if it calls RequestUtils for each URL found, and I think doing so in order to call RequestUtils.buildURL _may_ be overkill. Moreover, I'm not sure it would work, since many of them lack base/context and are only <path>?<params>

What I've been trying to figure out is how much of ReverseProxyFilter's functionality could be abstracted out to this URLMangler.

The reverse proxy filter only cares about base and context. Currently it affects only two aspects of URLs: base (protocol, server, port) and context (as in the servlet context of the web app).

With these new callbacks it'll also need to take care of path and params, so it can change "../wms?request=GetCapabilities" into "../wms?request=GetCapabilities&myCustomToken=something"

Calling RequestUtils.buildURL for each and every URL found in html/js/css resources may require a smarter, more structured (and slower) parsing of the content in order to actually extract the URLs instead of a simple String.replaceAll... So I guess we'll end up calling RequestUtils.buildURL(GeoServer.getProxyBaseUrl()) as it's being done right now and then extend the servlet filter so it also appends any extra CGI parameters added by the callbacks?

Hmmm... actually the manglers can do what they please with all parts
of the URL (that's why there are the string buffers) and the KVP.
So if you want to fall back on them I think you have to make a full
parse of each URL.
I wonder, can't you get the base url by getting the host, port and
application base path from somewhere? I think we did something like that
in 1.7.x somewhere.
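Something along these lines, as a rough sketch only. The class and method are hypothetical and not the 1.7.x code; in a servlet the four inputs would typically come from HttpServletRequest (getScheme(), getServerName(), getServerPort(), getContextPath()), but plain values are used here to keep the example self-contained.

```java
// Hypothetical sketch: assemble a base URL from its parts.
class BaseUrlSketch {

    static String baseUrl(String scheme, String host, int port, String contextPath) {
        StringBuilder sb = new StringBuilder();
        sb.append(scheme).append("://").append(host);
        // Leave out the port when it is the default one for the scheme.
        boolean defaultPort = ("http".equals(scheme) && port == 80)
                || ("https".equals(scheme) && port == 443);
        if (port > 0 && !defaultPort) {
            sb.append(':').append(port);
        }
        // The context path normally starts with a slash, e.g. "/geoserver".
        if (contextPath != null && !contextPath.isEmpty()) {
            if (!contextPath.startsWith("/")) {
                sb.append('/');
            }
            sb.append(contextPath);
        }
        return sb.toString();
    }
}
```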

Anyways, I'm not fully convinced the two systems should share any code.

Cheers
Andrea

--
Andrea Aime
OpenGeo - http://opengeo.org
Expert service straight from the developers.

I got the impression that it matters from the "the current proxification..." statement. Given that I was wrong and I don't want to slow you down, can you just add a jira for me to "make sure reverse proxy filter uses RequestUtils.buildURL properly" or the like?

Cheers,
Gabriel

--
Gabriel Roldan
OpenGeo - http://opengeo.org
Expert service straight from the developers.