So I’ve been hearing a few different use cases around processing, and wanted to get a discussion going on possible GeoServer improvements that could strengthen our story.
One is to be able to more fully define layers that are produced by processes on the fly. Right now we have the rendering transformations, which are totally awesome, and you can use them to render a layer on the fly. But if you want to query the resulting layer you have to recreate the exact WPS request that’s embedded in the SLD. And if you want to change the visualization you also have to get comfortable with the full SLD. It seems like it’d be better if you could just define a layer that’s the result of a WPS process, coming from one or more layers. It’d be a layer that lives in the catalog, that can be queried with WMS or WFS/WCS, could have further WPS processes run on it, and could have a variety of SLDs associated with it. It’d be listed in the capabilities document. But it’d always be constructed on the fly, running the backend WPS process on the layers that make it up. The advantage would be that it always stays up to date, and also doesn’t take up lots of room in a database. It’d be ideal for WPS processes that can be applied quickly, and/or that work on relatively small datasets. It’d be similar to a SQL View layer, but instead of being defined by SQL it’d be defined by a process and its inputs (be they layers or set variables, or perhaps even parametric, like SQL views can be).
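To make the on-the-fly idea a bit more concrete, here is a rough sketch in Python. It is not GeoServer code: `OnTheFlyLayer`, the dict-based `catalog`, and `filter_by_min_population` are all made-up names for illustration, and a plain function stands in for a WPS process. The point is only that the layer stores a process plus its inputs, and runs the process each time it is queried.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for the catalog: layer name -> list of features.
Catalog = Dict[str, List[dict]]


@dataclass
class OnTheFlyLayer:
    """A layer defined by a process and its inputs, never stored."""
    name: str
    process: Callable[..., List[dict]]   # stand-in for a WPS process
    input_layers: List[str]              # names of the source layers
    params: Dict[str, Any] = field(default_factory=dict)  # set variables

    def features(self, catalog: Catalog, **overrides: Any) -> List[dict]:
        # Run the process against the current state of the inputs on every
        # query; request-time overrides mimic a parametric SQL view.
        inputs = [catalog[n] for n in self.input_layers]
        return self.process(*inputs, **{**self.params, **overrides})


def filter_by_min_population(features: List[dict], min_pop: int = 0) -> List[dict]:
    # Toy process: keep features at or above a population threshold.
    return [f for f in features if f["pop"] >= min_pop]


catalog: Catalog = {
    "cities": [{"name": "A", "pop": 500}, {"name": "B", "pop": 5000}],
}
big_cities = OnTheFlyLayer(
    "big_cities", filter_by_min_population, ["cities"], {"min_pop": 1000}
)
```

Because nothing is stored, a query against `big_cities` reflects whatever is in `cities` at that moment, which is the "always stays up to date" property described above.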
The other that jumps to mind is a related construct for processes that may not work so well on the fly. Think of routing, where you need to load the whole graph in memory. Or even heatmaps could benefit, by computing a global heatmap on the whole layer, which would tile better. This would be sort of like a ‘cache’ of the process. I think it’d work similarly to how the WPS process that outputs to the GeoServer catalog works. It’d write the process output to a new layer, stored in the default database or raster format. But I think a key difference/improvement would be for it to be ‘live’, meaning it has knowledge of changes to the base layer. So just like we kill the tile cache when an edit comes in over WFS-T or we’re notified via GeoRSS, the derived WPS layer would rerun its whole process if there’s a change.
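Here is a similarly rough Python sketch of the ‘live’ cached variant, again with made-up names (`CachedDerivedLayer`, `on_layer_changed`, `count_features`) and a dict standing in for the catalog. It assumes some notification hook exists that calls `on_layer_changed` when a base layer is edited; how that hook would be wired to WFS-T or GeoRSS is left open.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the catalog: layer name -> list of features.
Catalog = Dict[str, List[dict]]


class CachedDerivedLayer:
    """Stores the process output as a real layer and reruns it on change."""

    def __init__(self, name: str, process: Callable[..., List[dict]],
                 input_layers: List[str], catalog: Catalog) -> None:
        self.name = name
        self.process = process
        self.input_layers = input_layers
        self.catalog = catalog
        self.run_count = 0
        self.refresh()  # initial run populates the stored layer

    def refresh(self) -> None:
        # Rerun the whole process and overwrite the stored result.
        inputs = [self.catalog[n] for n in self.input_layers]
        self.catalog[self.name] = self.process(*inputs)
        self.run_count += 1

    def on_layer_changed(self, layer_name: str) -> None:
        # Assumed notification hook: only changes to our inputs matter.
        if layer_name in self.input_layers:
            self.refresh()


def count_features(features: List[dict]) -> List[dict]:
    # Toy process standing in for something expensive like a heatmap.
    return [{"count": len(features)}]


catalog: Catalog = {"roads": [{"id": 1}, {"id": 2}], "rivers": [{"id": 9}]}
road_count = CachedDerivedLayer("road_count", count_features, ["roads"], catalog)
```

In this sketch the stored result goes stale between an edit and the notification, and a change to an unrelated layer such as `rivers` does not trigger a rerun.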
A related type of layer would just run the process on a schedule. So if the notification stuff is not easy to set up, or the user knows they’ll want it run at certain times each day, they could define a derived WPS layer that re-runs the process at those set times.
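For the scheduled variant, the only new piece is deciding when the next run is due. A small Python sketch, assuming the layer is configured with a list of times of day (`next_run` and `daily_times` are hypothetical names, and time zones are ignored):

```python
from datetime import datetime, time, timedelta
from typing import List


def next_run(now: datetime, daily_times: List[time]) -> datetime:
    """Return the first configured time of day strictly after `now`.

    Raises ValueError if `daily_times` is empty.
    """
    candidates = []
    for t in daily_times:
        candidate = datetime.combine(now.date(), t)
        if candidate <= now:
            # Already passed today, so the next occurrence is tomorrow.
            candidate += timedelta(days=1)
        candidates.append(candidate)
    return min(candidates)
```

A scheduler would sleep until `next_run(...)`, rerun the process into the stored layer, and then repeat.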
Do others have ideas for other types of layers derived from WPS that could be useful? I think it’d be great to have a GUI that lets our users easily make all three types of layers. I haven’t reached any great insight on how that GUI would work. They could all be different types of a Derived Layer, with different config options. Or they could each be their own type of layer, which might make it clearer? Or maybe more confusing. I guess they maybe wouldn’t depend on a datastore? Or we could make a ‘virtual datastore’ for convenience. But each layer would just be defined by one or more other layers plus one or more processes.
I do think having this could make the WPS builder tool a lot more useful. Right now you make your query, but it then just returns you a big XML document or maybe a GeoTIFF of what you made. It’d be great if instead you could build your query and then look at the results in OpenLayers. We could make that pretty easily now with a shortcut to the gs catalog import process. But with the above options you could have more control. The on-the-fly one in particular could be nice, as it could ease the creation of complex rendering transformations. Users could just define the transform, and then use a standard tool like uDig or GeoExplorer to define the styling. With these as options GeoServer becomes a place to explore spatial processing, instead of just a place to build WPS requests. It could also make it easier to build pure JavaScript clients that let people apply processes to full layers, instead of just to smaller datasets with the results returned in the browser.
Curious for people’s feedback, as I imagine others have been thinking about this too. I’d also welcome ideas more generally on what other improvements we could make to the WPS, since it’s been out there for a bit, and I’ve heard of some people doing some pretty cool stuff with it.
Chris