Hi All,
As per previous correspondence from Julian Atkinson, we are in the
process of implementing our own WPS processes using GeoServer as
described in
http://docs.geoserver.org/latest/en/developer/programming-guide/wps-services/implementing.html.
In particular, we are implementing processes that will allow us to
generate collections of netcdf files for non-gridded data (one netcdf
file per CF feature instance at the moment) and to aggregate and subset
large collections of gridded netcdf files.
Depending on the amount of data selected, generation of output files may
be quite time consuming, so we have been using the asynchronous
processing option to run these processes (noting that netcdf is a
non-streaming format).
In addition to publishing these processes for collaborators to use to
access our data in netcdf format, we also use them within our data
portal to power downloads of data in netcdf format.
We'd like to be able to do so in as user-friendly a way as possible
within the current constraints of the WPS protocol (1.0 at the moment).
We've raised one usability issue that we see with the current GeoServer
WPS asynchronous support. In particular, the execution time limit
applied to asynchronous jobs currently includes the time spent in the
queue. This means that a few large jobs can prevent smaller jobs from
ever being run (they time out while still queued), which doesn't seem
fair to the users who submitted them. It would be good if we could work out a
fairer way of allocating processing resources. We've put up a pull
request with one option for addressing this at
https://github.com/geoserver/geoserver/pull/1188. In this pull request,
we have modified the execution time limit for jobs to exclude queuing
time and added a new total time limit (queuing plus execution). It
seemed more intuitive to us that the execution time limit would not
include queuing time, which is why we chose to modify the behaviour of
this limit. We haven't received any feedback on that PR, so I'm
wondering whether others think this doesn't make sense and whether we
should modify our approach here.
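To illustrate the behaviour we have in mind, here is a minimal sketch
(in Python purely for illustration; it is not the code in the pull
request). The Job class, the should_cancel function and the parameter
names are all hypothetical. The execution limit is measured from the
moment a job starts running, while the total limit is measured from
submission:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    # Hypothetical job record; times are in seconds on any common clock.
    submitted_at: float            # when the request was accepted
    started_at: Optional[float]    # None while the job is still queued


def should_cancel(job: Job, now: float,
                  max_execution_s: float, max_total_s: float) -> bool:
    """Return True if the job has exceeded either limit.

    max_total_s counts queuing plus execution time.
    max_execution_s counts only the time since the job started running.
    """
    if now - job.submitted_at > max_total_s:
        return True
    if job.started_at is None:
        # Still queued: queue time does not count against the execution limit.
        return False
    return now - job.started_at > max_execution_s
```

With a 300 second execution limit and a 1000 second total limit, a job
that has been queued for 500 seconds would not be cancelled, but one
that has been running for more than 300 seconds would be.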
Another area where we'd like to improve user friendliness is in giving
better feedback on how a job is progressing. At the moment we can do
that while the job is executing using GeoServer's percentCompleted
support; however, we can't do that while the job is queued (which we've
established can be as long as, or if we have our way longer than, the
time a job can actually be executing). Looking at the WPS 1.0
specification, the only way to do this, as far as I can tell, is to use
the ProcessAccepted element. From the WPS 1.0 specification:
"... The contents of this human-readable text string is left open to
definition by each server, but is expected to include any messages the
server wishes to let the clients know. Such information could include
how long the queue is, or any warning conditions that may have been
encountered. The client may display this text to a human user."
Would it be worthwhile for us to look at making use of the
ProcessAccepted element to return queue position information? If so, we
could have a look at how we might go about doing this and put up a more
detailed proposal or pull request for consideration. Any other options
or suggestions on how to go about this are welcome.
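To make the ProcessAccepted idea a little more concrete, here is a rough
sketch (again in Python purely for illustration, not GeoServer code). It
builds a WPS 1.0 Status element whose ProcessAccepted text carries a
queue position, and shows how a client could read it back. The function
names, the message wording and the assumption that the server knows the
job's position and the queue length are ours; only the Status and
ProcessAccepted elements and the WPS 1.0.0 namespace come from the
specification.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Namespace of the WPS 1.0.0 schema.
WPS_NS = "http://www.opengis.net/wps/1.0.0"


def accepted_status(position: int, queue_length: int,
                    creation_time: str) -> ET.Element:
    """Build a wps:Status element with a human-readable queue message.

    Hypothetical helper: the message format is not defined by the spec.
    """
    status = ET.Element(f"{{{WPS_NS}}}Status", {"creationTime": creation_time})
    accepted = ET.SubElement(status, f"{{{WPS_NS}}}ProcessAccepted")
    accepted.text = (
        f"Process accepted; queued at position {position} of {queue_length}."
    )
    return status


def read_accepted_message(status: ET.Element) -> Optional[str]:
    """Return the ProcessAccepted text, or None if the job is past that state."""
    accepted = status.find(f"{{{WPS_NS}}}ProcessAccepted")
    return None if accepted is None else accepted.text


# Example: serialise on the server side, parse on the client side.
xml_bytes = ET.tostring(accepted_status(3, 7, "2000-01-01T00:00:00Z"))
message = read_accepted_message(ET.fromstring(xml_bytes))
```

Since the text is free-form, a client such as our portal would simply
display the message to the user rather than try to parse it.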
Thanks,
Craig Jones
Integrated Marine Observing System