[Geoserver-devel] WPS process selection ready for commit, feedback welcomed

Hi,
I’ve been working on the process selection and I believe it’s ready for commit.
It’s a relatively large patch (97KB) with new API, so even though it’s destined for trunk
only right now, I believe it would be good to get some feedback before
moving on.

If you just cannot resist having a look at the diffs, here you go (more description follows):
https://github.com/aaime/geoserver/compare/73f804450c859b2ab1e946baea22fd5153007c65…aaime:wps-filter

So the whole work is based on a very simple interface, here:
https://github.com/aaime/geoserver/blob/wps-filter/src/extension/wps/wps-core/src/main/java/org/geoserver/wps/process/ProcessFilter.java

It is simpler and more generic than the one I presented before, and it also allows
wrapping process factories, which means one can not only control which processes are available, but
also lie about process metadata, or add controls for inputs and outputs (in their Java object form,
more on this later).
If one really just wants to hide certain processes, there is a convenience base class available:
https://github.com/aaime/geoserver/blob/wps-filter/src/extension/wps/wps-core/src/main/java/org/geoserver/wps/process/ProcessSelector.java

The ProcessFilter is pluggable: just register it in the app context (there are tests for this)
and GeoServerProcessors, which replaces Processors in GeoServer WPS, will use it:
https://github.com/aaime/geoserver/blob/wps-filter/src/extension/wps/wps-core/src/main/java/org/geoserver/wps/process/GeoServerProcessors.java

Configuration-wise, WPSInfo has been extended with the notion of a “process group”
(which is a factory, but I did not want to put the term “factory” in front of GUI and REST users):
https://github.com/aaime/geoserver/blob/wps-filter/src/extension/wps/wps-core/src/main/java/org/geoserver/wps/WPSInfo.java

The process group is really just a reference to a factory, a flag that enables/disables it, and a list
of disabled processes:
https://github.com/aaime/geoserver/blob/wps-filter/src/extension/wps/wps-core/src/main/java/org/geoserver/wps/ProcessGroupInfo.java

The approach of blacklisting processes (as opposed to whitelisting them) has been chosen so that
upgraded servers keep on working, and to give people adding processes via scripting
the ability to use a process right away without having to also enable it.
An authorization subsystem can decide to implement the ProcessFilter interface with a whitelist
approach if it wants to.
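
As a rough illustration of why a blacklist keeps newly added processes usable, here is a minimal sketch in plain Java. The class `ProcessGroupSketch`, its methods and the process names are hypothetical; this is not the actual `ProcessGroupInfo` or `ProcessFilter` API from the patch. It only models a factory reference (as a string), an enabled flag, and a list of disabled process names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a process group: NOT the real ProcessGroupInfo API.
class ProcessGroupSketch {
    private final String factoryId;                            // stands in for the factory reference
    private boolean enabled = true;                            // group-level on/off flag
    private final List<String> disabled = new ArrayList<>();   // blacklisted process names

    ProcessGroupSketch(String factoryId) {
        this.factoryId = factoryId;
    }

    String getFactoryId() {
        return factoryId;
    }

    void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }

    void disable(String processName) {
        disabled.add(processName);
    }

    // Blacklist semantics: anything not explicitly disabled is available,
    // as long as the whole group is enabled.
    boolean isProcessAllowed(String processName) {
        return enabled && !disabled.contains(processName);
    }

    public static void main(String[] args) {
        ProcessGroupSketch group = new ProcessGroupSketch("hypothetical.factory");
        group.disable("gs:Foo");
        // Explicitly disabled process
        System.out.println(group.isProcessAllowed("gs:Foo"));
        // Process the configuration has never heard of (e.g. added by a script)
        System.out.println(group.isProcessAllowed("gs:Bar"));
    }
}
```

With a whitelist, the second lookup would fail until an administrator enabled the new process, which is the upgrade and scripting problem described above.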

Here are some screenshots from the GUI. The main entry point in the WPS admin page:

Inline image 1

and the page that opens to cherry-pick processes from a certain group:

Inline image 2

One note about the GUI: I had to deep clone the ProcessGroupInfo objects used in the GUI because
they don’t get wrapped with a proxy, so modifying them was altering the server state even
if you did not push the save button in the WPS admin page.

As far as I can see ModificationProxy simply clones collections, which takes care of adding/removing
items, but not of modifying them.
I see there are lists that do wrap their contents too, used for the methods that return lists of info objects
from the catalog, but they are not used inside ModificationProxy “get” property paths.
So far the cloning works well, but it made me wrestle with the code quite a bit before discovering
where the issue was (I was blaming Wicket, but for once it wasn’t at fault :-p )
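
The difference between cloning a collection and cloning its contents can be sketched in plain Java as follows. `CloneDemo` and `Item` are hypothetical stand-ins and do not use ModificationProxy or any GeoServer class; the sketch only shows that a shallow copy isolates add/remove while element changes still leak through.

```java
import java.util.ArrayList;
import java.util.List;

class CloneDemo {
    // Hypothetical mutable item, standing in for an info object held in a list.
    static class Item {
        boolean enabled = true;

        Item copy() {
            Item c = new Item();
            c.enabled = enabled;
            return c;
        }
    }

    // Copies the list only: the elements are shared with the source.
    static List<Item> shallowClone(List<Item> source) {
        return new ArrayList<>(source);
    }

    // Copies the list and every element: nothing is shared.
    static List<Item> deepClone(List<Item> source) {
        List<Item> result = new ArrayList<>();
        for (Item item : source) {
            result.add(item.copy());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Item> server = new ArrayList<>();
        server.add(new Item());

        List<Item> shallow = shallowClone(server);
        shallow.add(new Item());            // does not touch the server list
        shallow.get(0).enabled = false;     // but this change leaks into the server state
        System.out.println(server.size() + " " + server.get(0).enabled);

        server.get(0).enabled = true;       // reset
        List<Item> deep = deepClone(server);
        deep.get(0).enabled = false;        // isolated from the server state
        System.out.println(server.get(0).enabled);
    }
}
```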

Cheers
Andrea

==
Our support, Your Success! Visit http://opensdi.geo-solutions.it for more information.

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via Poggio alle Viti 1187
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 962313
mob: +39 339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


+1 Andrea


Live Security Virtual Conference
Exclusive live event will cover all the ways today’s security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/


Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel


Francesco Izzi
CNR - IMAA
geoSDI
Direzione Tecnologie e Sviluppo

C.da S. Loja
85050 Tito Scalo - POTENZA (PZ)
Italia

phone: +39 0971427305
fax: +39 0971 427271
mob: +39 3203126609
mail: francesco.izzi@anonymised.com
skype: neofx8080

web: http://www.geosdi.org

One question - how does an admin know in which group to find a process? In other words, is the factory for a process visible anywhere, and does it have a usable identifier (or just the description)?

Was the namespace meant to act as the visible grouping of processes?

Martin Davis
OpenGeo - http://opengeo.org
Expert service straight from the developers.

Nice work Andrea. Some comments inline.

On Fri, Aug 10, 2012 at 9:25 AM, Andrea Aime <andrea.aime@anonymised.com> wrote:

The process group is really just a reference to a factory, a flag that enables/disables it, and a list
of disabled processes:
https://github.com/aaime/geoserver/blob/wps-filter/src/extension/wps/wps-core/src/main/java/org/geoserver/wps/ProcessGroupInfo.java

Also, a minor nit-picky thing: I see that the public modifier is used sometimes on interfaces and sometimes not… it would be nice to be consistent. I tend to leave it off since it’s redundant.
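
To illustrate the redundancy: in Java, a method declared in an interface without an access modifier is implicitly public, so both declarations below behave the same. The interface and class names in this small check are made up for the example.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class InterfaceModifierDemo {
    // Explicit modifier
    interface WithModifier {
        public void run();
    }

    // Modifier omitted: the method is still public
    interface WithoutModifier {
        void run();
    }

    // Reports whether the "run" method declared on the given interface is public.
    static boolean isRunPublic(Class<?> iface) throws NoSuchMethodException {
        Method m = iface.getDeclaredMethod("run");
        return Modifier.isPublic(m.getModifiers());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isRunPublic(WithModifier.class));
        System.out.println(isRunPublic(WithoutModifier.class));
    }
}
```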

Here are some screenshots from the GUI. The main entry point in the WPS admin page:

Inline image 1

Perhaps a title for this section like “Process groups” might mean more to users than “Process filtering directives”… or maybe just “Process filtering”. Also, where do the group descriptions come from? Are they configurable or somehow derived from the factory? It would be nice if the descriptions made consistent use of case.

and the page that opens to cherry-pick processes from a certain group:

Inline image 2

One note about the GUI: I had to deep clone the ProcessGroupInfo objects used in the GUI because
they don’t get wrapped with a proxy, so modifying them was altering the server state even
if you did not push the save button in the WPS admin page.

As far as I can see ModificationProxy simply clones collections, which takes care of adding/removing
items, but not of modifying them.
I see there are lists that do wrap their contents too, used for the methods that return lists of info objects
from the catalog, but they are not used inside ModificationProxy “get” property paths.
So far the cloning works well, but it made me wrestle with the code quite a bit before discovering
where the issue was (I was blaming Wicket, but for once it wasn’t at fault :-p )

Hmmm… indeed this seems like a hole, and it happens in cases where an object needs to return a mutable list. The ProxyList is only suitable for immutable lists… it would be nice to handle this in ModificationProxy and have it return not just a cloned collection but a cloned collection that also proxies the objects in the collection, rather than have a mix of cloning and proxying going on.
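
The idea of a cloned collection whose elements are also proxied could look roughly like the sketch below. It is built on `java.lang.reflect.Proxy` only; `GroupInfo`, `GroupInfoImpl`, `BufferingHandler` and `cloneAndProxy` are hypothetical names, and this is not how ModificationProxy is actually implemented. Setter calls are recorded instead of being forwarded, so the wrapped objects stay untouched; applying the recorded changes on save is left out.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: these types are hypothetical and are not GeoServer classes.
class ProxiedListSketch {

    // Stand-in for an info interface such as a process group.
    public interface GroupInfo {
        boolean isEnabled();
        void setEnabled(boolean enabled);
    }

    static class GroupInfoImpl implements GroupInfo {
        private boolean enabled = true;
        public boolean isEnabled() { return enabled; }
        public void setEnabled(boolean enabled) { this.enabled = enabled; }
    }

    // Records setter calls instead of forwarding them; getters see the recorded value.
    static class BufferingHandler implements InvocationHandler {
        private final Object target;
        private final Map<String, Object> pending = new HashMap<>();

        BufferingHandler(Object target) {
            this.target = target;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            String name = method.getName();
            if (name.startsWith("set") && args != null && args.length == 1) {
                pending.put(name.substring(3), args[0]);   // buffer the change
                return null;
            }
            String property = name.startsWith("is") ? name.substring(2)
                    : name.startsWith("get") ? name.substring(3) : null;
            boolean noArgs = args == null || args.length == 0;
            if (property != null && noArgs && pending.containsKey(property)) {
                return pending.get(property);              // read back the buffered value
            }
            return method.invoke(target, args);            // everything else goes to the real object
        }
    }

    // Clones the list and wraps each element in a buffering proxy.
    static <T> List<T> cloneAndProxy(List<T> source, Class<T> iface) {
        List<T> copy = new ArrayList<>();
        for (T item : source) {
            Object proxy = Proxy.newProxyInstance(
                    iface.getClassLoader(), new Class<?>[] {iface}, new BufferingHandler(item));
            copy.add(iface.cast(proxy));
        }
        return copy;
    }

    public static void main(String[] args) {
        List<GroupInfo> server = new ArrayList<>();
        server.add(new GroupInfoImpl());
        List<GroupInfo> editable = cloneAndProxy(server, GroupInfo.class);
        editable.get(0).setEnabled(false);
        System.out.println(editable.get(0).isEnabled() + " " + server.get(0).isEnabled());
    }
}
```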


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

On Fri, Aug 10, 2012 at 6:34 PM, Martin Davis <mdavis@anonymised.com> wrote:

One question - how does an admin know in which group to find a process? In other words, is the factory for a process visible anywhere, and does it have a usable identifier (or just the description)?

Was the namespace meant to act as the visible grouping of processes?

Yes and no. A factory is free to create processes in different namespaces, although in
practice one prefix per factory is chosen.
I guess I could add a column “namespace” in the main table that shows the namespace(s)
prefix(es) for that factory.

The reason for grouping by factory is partly convenience, since it’s a quick way to
disable a lot of processes, and partly because there are already 80+ processes
and the list is growing, and we never managed to make tables that are both
pageable and selectable in Wicket, so having a way to separate them made sense in this
respect.

Cheers
Andrea


On Fri, Aug 10, 2012 at 6:43 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

The process group is really just a reference to a factory, a flag that enables/disables it, and a list
of disabled processes:
https://github.com/aaime/geoserver/blob/wps-filter/src/extension/wps/wps-core/src/main/java/org/geoserver/wps/ProcessGroupInfo.java

Also, a minor nit-picky thing: I see that the public modifier is used sometimes on interfaces and sometimes not… it would be nice to be consistent. I tend to leave it off since it’s redundant.

Sure, I can fix that.

Here are some screenshots from the GUI. The main entry point in the WPS admin page:

Perhaps a title for this section like “Process groups” might mean more to users than “Process filtering directives”… or maybe just “Process filtering”. Also, where do the group descriptions come from? Are they configurable or somehow derived from the factory? It would be nice if the descriptions made consistent use of case.

They are coming directly from the factory, so the code needs to be modified in order to get what you want.

As far as I can see ModificationProxy simply clones collections, which takes care of adding/removing
items, but not of modifying them.
I see there are lists that do wrap their contents too, used for the methods that return lists of info objects
from the catalog, but they are not used inside ModificationProxy “get” property paths.
So far the cloning works well, but it made me wrestle with the code quite a bit before discovering
where the issue was (I was blaming Wicket, but for once it wasn’t at fault :-p )

Hmmm… indeed this seems like a hole, and it happens in cases where an object needs to return a mutable list. The ProxyList is only suitable for immutable lists… it would be nice to handle this in ModificationProxy and have it return not just a cloned collection but a cloned collection that also proxies the objects in the collection, rather than have a mix of cloning and proxying going on.

Indeed. Being a bit short on time I went the cloning route: changing the modification proxy can lead to unexpected issues
in other areas, and for the time being I managed to get things going with a handful of lines of code, but I agree the
ModificationProxy needs to be fixed.
http://jira.codehaus.org/browse/GEOS-5264

Btw, the issue is not whether the list is mutable or not (addition/removal is handled properly) but whether the
contents of the list are mutable or not.
I am wondering if the modification proxy should be informed with an annotation, or should just try to wrap whatever
list contents are found.

Cheers
Andrea


On Fri, Aug 10, 2012 at 9:43 AM, Andrea Aime <andrea.aime@anonymised.com> wrote:

On Fri, Aug 10, 2012 at 6:34 PM, Martin Davis <mdavis@anonymised.com> wrote:

One question - how does an admin know in which group to find a process? In other words, is the factory for a process visible anywhere, and does it have a usable identifier (or just the description)?

Was the namespace meant to act as the visible grouping of processes?

Yes and no. A factory is free to create processes in different namespaces, although in
practice one prefix per factory is chosen.
I guess I could add a column “namespace” in the main table that shows the namespace(s)
prefix(es) for that factory.

That would be helpful, I think.

One related issue around process namespace usage is that there is no way to specify a namespace for an Annotation-driven process. AFAIS these always get assigned to the “gs” namespace? It would be nice to be able to specify a different namespace when creating custom processes.

The reason why there are the factories is partly convenience, since it’s a quick way to
disable a lot of processes, and partly because there are already 80+ processes
and the list is growing, and we never managed to make tables that are both
pageable and selectable in Wicket, so having a way to separate them made sense in this
respect.

Agreed, there needs to be some way of grouping processes to make them more manageable in various contexts. The fact that factories provide description metadata is nice.


Martin Davis
OpenGeo - http://opengeo.org
Expert service straight from the developers.

On Fri, Aug 10, 2012 at 8:50 PM, Martin Davis <mdavis@anonymised.com> wrote:

That would be helpful, I think.

Cool, I can do that

One related issue around process namespace usage is that there is no way to specify a namespace for an Annotation-driven process. AFAIS these always get assigned to the “gs” namespace? It would be nice to be able to specify a different namespace when creating custom processes.

I’m afraid this would be confusing. Why not roll your own factory if you are building custom processes instead?
You can specify title and prefix in the constructor.
The “gs” factory was really meant for GeoServer specific processes (it expanded beyond that
mostly due to the convenience of registering yet another process there).

If we start having processes in the same factory with different prefixes, the issue you were talking
about above, it being difficult to guess which “group” a process falls into, will only get worse.

Cheers
Andrea


On Fri, Aug 10, 2012 at 12:11 PM, Andrea Aime <andrea.aime@anonymised.com> wrote:

On Fri, Aug 10, 2012 at 8:50 PM, Martin Davis <mdavis@anonymised.com> wrote:

That would be helpful, I think.

Cool, I can do that

One related issue around process namespace usage is that there is no way to specify a namespace for an Annotation-driven process. AFAIS these always get assigned to the “gs” namespace? It would be nice to be able to specify a different namespace when creating custom processes.

I’m afraid this would be confusing, why not roll your factory if you are building custom processes instead?

Because it’s more code to write, and more complexity to explain to developers. Why can’t the namespace be specified by annotations as well? Or is there a simple way of creating a factory containing a set of processes? Perhaps using Spring beans?

You can specify title and prefix in the constructor.
The “gs” factory was really meant for GeoServer specific processes (it expanded beyond that
mostly due to the convenience of registering yet another process there).

Er - yes. That sort of proves my point. 8^)

If we start having processes in the same factory with different prefixes the issue you were talking
about above, being difficult to guess in which “group” a process falls into, will only get worse.

+1 on forcing factories to only use a single namespace. 8^)

Couldn’t the Annotation-driven framework create a different factory for each namespace it found?

Longer-term, it’s not clear to me that factory is the only or best method of grouping processes. It ties grouping too closely to an implementation detail, I think.
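
As a rough sketch of the "one factory per namespace" suggestion, the grouping step itself is simple: split each qualified process name on its prefix and collect the names per prefix. The class `PrefixGrouping` and the process names are hypothetical; the real annotation-driven framework is not used here, and each resulting entry would merely be the input for one factory.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class PrefixGrouping {
    // Returns the part before ':' or an empty string when the name has no prefix.
    static String prefixOf(String name) {
        int idx = name.indexOf(':');
        return idx < 0 ? "" : name.substring(0, idx);
    }

    // Groups qualified process names ("prefix:local") by their prefix, sorted by prefix.
    static Map<String, List<String>> groupByPrefix(List<String> processNames) {
        return processNames.stream().collect(
                Collectors.groupingBy(PrefixGrouping::prefixOf, TreeMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("gs:Foo", "custom:Bar", "gs:Baz");
        // One entry per prefix; each entry could back one factory
        System.out.println(groupByPrefix(names));
    }
}
```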


Martin Davis
OpenGeo - http://opengeo.org
Expert service straight from the developers.

On Fri, Aug 10, 2012 at 1:11 PM, Andrea Aime <andrea.aime@anonymised.com> wrote:

On Fri, Aug 10, 2012 at 8:50 PM, Martin Davis <mdavis@anonymised.com> wrote:

That would be helpful, I think.

Cool, I can do that

One related issue around process namespace usage is that there is no way to specify a namespace for an Annotation-driven process. AFAIS these always get assigned to the “gs” namespace? It would be nice to be able to specify a different namespace when creating custom processes.

I’m afraid this would be confusing, why not roll your factory if you are building custom processes instead?
You can specify title and prefix in the constructor.
The “gs” factory was really meant for GeoServer specific processes (it expanded beyond that
mostly due to the convenience of registering yet another process there).

If we start having processes in the same factory with different prefixes the issue you were talking
about above, being difficult to guess in which “group” a process falls into, will only get worse.

I actually think the namespace and the group should be decoupled. I look at a process namespace as publishing metadata rather than a grouping mechanism, although I can see why it’s used for that.

It seems like unnecessary overhead to have to write a factory just to set up a custom namespace. And it goes against making things as easy as possible to write a custom process, like with the annotation-based process stuff which hides the factory from you. Also thinking of the scripting extension here, where it’s one process factory that loads scripts. I would be +1 for being able to define it at the process level.

Others may disagree but I also don’t feel like the current grouping is all that intuitive… it actually seems rather arbitrary, which I think is a strong argument for allowing “group” information to be specified on the process itself, since it knows best which groups it falls into… Not by any means saying that needs to be done now, but I think at some point it makes sense to rethink the grouping strategy, perhaps coming up with a well-defined “taxonomy”, if you will, for processes, which could potentially be much finer grained than what we have now.

$0.02


Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

On Fri, Aug 10, 2012 at 10:40 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

If we start having processes in the same factory with different prefixes the issue you were talking
about above, being difficult to guess in which “group” a process falls into, will only get worse.

I actually think the namespace and the group should be decoupled. I look at a process namespace as publishing metadata rather than a grouping mechanism, although I can see why it’s used for that.

It seems like unnecessary overhead to have to write a factory just to set up a custom namespace. And it goes against making things as easy as possible to write a custom process, like with the annotation-based process stuff which hides the factory from you. Also thinking of the scripting extension here, where it’s one process factory that loads scripts. I would be +1 for being able to define it at the process level.

Others may disagree but I also don’t feel like the current grouping is all that intuitive… it actually seems rather arbitrary, which I think is a strong argument for allowing “group” information to be specified on the process itself, since it knows best which groups it falls into… Not by any means saying that needs to be done now, but I think at some point it makes sense to rethink the grouping strategy, perhaps coming up with a well-defined “taxonomy”, if you will, for processes, which could potentially be much finer grained than what we have now.

This line of thinking will require changes to the GeoTools process API to be implemented and as a consequence to the work I’ve just completed,
and if pushed to the end will break backwards compatibility for existing users (that is, moving processes around in different namespaces,
changing their unique name).

The current situation is as follows. The factory is the only grouping element that can be found, and the only one that has a Title
that can describe itself (that is used in the main process selection panel).

We can have factories that publish in different “namespaces” (actually, they are just prefixes so far), but unless there is a change
in the API you won’t be able to give a title to those prefixes: there is no way to describe them, so you actually get no group
you can speak about with users.

Let’s say you add a way for factories to report a title per prefix, and annotations that allow you to specify
that on a per-process basis. How do you reach consistency that way?
Before you know it, you’ll get the same prefix associated with 4 different titles due to typos or cut and paste errors.
Wouldn’t it be better to make it easier to create your own factory instead, so that you specify the group metadata
just once?
Imho it’s not hard to do so today; there are two ways:

  • create a five-line class and register it in the app context:

    public class FooFactory extends SpringBeanProcessFactory {
        public FooFactory() {
            // title, namespace prefix, marker class of the processes to publish
            super("foo is cool", "foo", FooProcess.class);
        }
    }

  • for scripting languages, you could use the ability to register factories at runtime with Processors.addProcessFactory(…)

Changing topic and moving to the existing process grouping: it was originally based on source instead of function,
GeoServer processes, JTS processes, GRASS processes, Sextante processes and so on (which is not an uncommon
way, look at 52N and QGis for example), while moving to GeoTools Jody tried to rewire it so that it’s based on function
instead, and we ended up with the current mixed bag.
Imho the “per function” approach is not so easy to maintain, it works for our own processes, but hopefully
one day someone will write a factory that binds to GRASS, will that person have to pick the hundreds of
processes over there and assign each of them to the proper group?
GRASS and Sextante both organize processes per data type or functional area, but neither integrates
processes from outside, while systems that integrate processes from outside normally do so by organizing
them by source (QGis, 52N).
GeoServer seems to be in a position to have both, maybe what’s needed is a double level classification,
both source and functional area, “gs.v.Bounds”, “gs.r2v.Contour”, and so on?

However changing the process names now will break all existing applications that use WPS,
given that WPS is in supported land this is bad.
How do we get a new classification without breaking existing users?

Cheers
Andrea


Right… good points. Some additional thoughts.

Naturally I wouldn’t propose anything that would change process identifiers and break backwards compatibility. Right now we have basically the following metadata:

  • id/name
  • title
  • description

A way to deal with backwards compatibility might be to add a list of aliases for the processes. So say we decided on a different grouping that would change a process namespace: we could maintain the old names as aliases. As long as aliases and names are unique it shouldn’t be an issue.
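
The alias idea could be sketched in plain Java as a lookup table that maps both canonical names and old names to the canonical one, rejecting duplicates. `AliasRegistry` and the process names are hypothetical; nothing like this exists in the GeoTools process API as far as this thread describes it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of name/alias resolution; not part of any existing API.
class AliasRegistry {
    // Maps every known name (canonical or alias) to the canonical name.
    private final Map<String, String> canonicalByName = new HashMap<>();

    // Registers a process under its canonical name.
    void register(String name) {
        putUnique(name, name);
    }

    // Registers an alias (for example the old name) for an already registered process.
    void alias(String alias, String canonicalName) {
        if (!canonicalName.equals(canonicalByName.get(canonicalName))) {
            throw new IllegalArgumentException("Unknown process: " + canonicalName);
        }
        putUnique(alias, canonicalName);
    }

    // Resolves a canonical name or an alias; returns null when the name is unknown.
    String resolve(String nameOrAlias) {
        return canonicalByName.get(nameOrAlias);
    }

    // Names and aliases share one namespace, so each key may be used only once.
    private void putUnique(String key, String canonical) {
        if (canonicalByName.containsKey(key)) {
            throw new IllegalArgumentException("Name already in use: " + key);
        }
        canonicalByName.put(key, canonical);
    }

    public static void main(String[] args) {
        AliasRegistry registry = new AliasRegistry();
        registry.register("vec:Foo");          // new name after a regrouping
        registry.alias("gs:Foo", "vec:Foo");   // old name kept for existing clients
        System.out.println(registry.resolve("gs:Foo"));
    }
}
```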

But that said, depending on who you ask you would probably get a different set of groupings, so it’s probably not possible to come up with a single categorization.

For instance, looking at GRASS and ArcToolbox I see different breakdowns.

http://grass.osgeo.org/gdp/html_grass63/vectorintro.html
http://webhelp.esri.com/arcgisdesktop/9.2/index.cfm?TopicName=Geoprocessing_tools

So maybe grouping by source, or keeping them like they are now, makes most sense, while also allowing for additional metadata to define the groups/categories. Basically a list of groups a process could be considered a part of. Groups could be very coarse grained like “raster” or “vector” or more fine grained like “network analysis”, etc…

In terms of API changes it would be an added method to the ProcessFactory interface:

ProcessFactory {
    List getGroups(Name);
}

And then an API to look up processes by group. Basically it would be a “tagging” scheme. Changes would be mostly additive and I don’t think there would be any backward compatibility issues.
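
A rough sketch of what such a tagging lookup could look like. GroupedProcessFactory is a hypothetical stand-in for the proposed getGroups(Name) addition, with plain strings used in place of the GeoTools Name type; GroupIndex is likewise made up for illustration.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical stand-in for a factory that reports the groups (tags) of each process. */
interface GroupedProcessFactory {
    Set<String> getNames();

    List<String> getGroups(String processName);
}

/** Builds a group -> processes index across all factories; a process may appear in many groups. */
final class GroupIndex {
    private final Map<String, SortedSet<String>> processesByGroup = new TreeMap<>();

    GroupIndex(Collection<? extends GroupedProcessFactory> factories) {
        for (GroupedProcessFactory factory : factories) {
            for (String name : factory.getNames()) {
                for (String group : factory.getGroups(name)) {
                    processesByGroup.computeIfAbsent(group, g -> new TreeSet<>()).add(name);
                }
            }
        }
    }

    /** Returns the processes tagged with the given group, sorted by name; empty if none. */
    SortedSet<String> processesIn(String group) {
        SortedSet<String> found = processesByGroup.get(group);
        if (found == null) {
            return Collections.<String>emptySortedSet();
        }
        return Collections.unmodifiableSortedSet(found);
    }

    /** Returns all known group names. */
    Set<String> groups() {
        return Collections.unmodifiableSet(processesByGroup.keySet());
    }
}
```

Because the index is built on top of the factories, the existing identifiers stay untouched, which is what makes the change additive.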

I am thinking of this from the perspective of something trying to build a UI out of the current process listing. Things may be usable with the grouping the way it is now, but as the number of processes grows I think it will become more of a problem.

As for the scripting module, I think multiple factories would make things quite a bit more complex… compiled scripts are cached for obvious performance reasons, and the cache is currently stored at the factory level, although I guess it could be factored out. But then I would have to manage the dropping of factories as scripts change over time, etc… possibly determining if a script writer changes the “namespace” for a script.

$0.02

On Sat, Aug 11, 2012 at 12:48 AM, Andrea Aime <andrea.aime@anonymised.com> wrote:

On Fri, Aug 10, 2012 at 10:40 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

If we start having processes in the same factory with different prefixes, the issue you were talking
about above, it being difficult to guess which “group” a process falls into, will only get worse.

I actually think the namespace and the group should be decoupled. I look at a process namespace as publishing metadata rather than a grouping mechanism, although I can see why it’s used for that.

It seems like unnecessary overhead to have to write a factory just to set up a custom namespace. And it goes against making things as easy as possible to write a custom process, like with the annotation based process stuff which hides the factory from you. Also thinking of the scripting extension here where it’s one process factory that loads scripts. I would be +1 for being able to define it at the process level.

Others may disagree but I also don’t feel like the current grouping is all that intuitive… it actually seems rather arbitrary, which I think is a strong argument for allowing “group” information to be specified on the process itself, since it knows best which groups it will fall into… Not by any means saying that needs to be done now, but I think at some point it makes sense to rethink the grouping strategy, perhaps coming up with a well defined “taxonomy”, if you will, for processes, which could potentially be much finer grained than what we have now.

This line of thinking will require changes to the GeoTools process API to be implemented and as a consequence to the work I’ve just completed,
and if pushed to the end will break backwards compatibility for existing users (that is, moving processes around in different namespaces,
changing their unique name).

The current situation is as follows. The factory is the only grouping element that can be found, and the only one that has a Title
that can describe itself (that is used in the main process selection panel).

We can have factories that publish in different “namespaces” (actually, they are just prefixes so far), but unless there is a change
in the API you won’t be able to give a title to those prefixes. There is no way to describe them, so you actually get no group
you can speak about with users.

Let’s say you add a way for factories to report a title per prefix, and annotations that allow you to specify
that on a per process basis. How do you reach consistency that way?
Before you know it you’ll get the same prefix associated with 4 different titles due to typos or cut and paste errors.
Wouldn’t it be better to make it easier to create your own factory instead, so that you specify the group metadata
just once?
Imho it’s not hard to do so today, there are two ways:

  • create a 5-line class and register it in the app context:

public class FooFactory extends SpringBeanProcessFactory {
    public FooFactory() {
        // arguments: group title, process prefix, marker class
        super("foo is cool", "foo", FooProcess.class);
    }
}

  • for scripting languages, you could use the ability to register factories at runtime with Processors.addProcessFactory(…)



Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

On Sat, Aug 11, 2012 at 5:03 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

A way to deal with backwards compatibility might be to add a list of aliases for the processes. So say we decided on a different grouping that would change a process namespace: we could maintain the old names as aliases. As long as aliases and names are unique it shouldn’t be an issue.

Yes. At the GeoServer level this could be done in a ProcessFilter. At the GeoTools level that won’t work, but maybe we can just
change the names there, given the process modules are still in unsupported land (will need to double check).

But that said, depending on who you ask you would probably get a different set of groupings, so it’s probably not possible to come up with a single categorization.

Right.

So maybe grouping by source, or keeping them like they are now, makes most sense, while also allowing for additional metadata to define the groups/categories. Basically a list of groups a process could be considered a part of. Groups could be very coarse grained like “raster” or “vector” or more fine grained like “network analysis”, etc…

In terms of API changes it would be an added method to the ProcessFactory interface:

ProcessFactory {
    List getGroups(Name);
}

And then an API to look up processes by group. Basically it would be a “tagging” scheme. Changes would be mostly additive and I don’t think there would be any backward compatibility issues.

Yes, I see where you’re going… but… read below

I am thinking of this from the perspective of something trying to build a UI out of the current process listing. Things may be usable with the grouping the way it is now, but as the number of processes grows I think it will become more of a problem.

The trouble with a process being applicable to multiple groups is that it would show up multiple
times in the GUI, and it would not be possible to “disable a group” anymore: what would you
do with a process that is in both a disabled group and an enabled one?

My suggestion: if people are so uncomfortable with the current grouping let’s make a new one,
but have it be just one, and possibly quickly, and let’s use ProcessFilter to stand up aliases
so that old clients can keep on working.

I really want to commit this process filtering thing and this discussion is putting a wrench into it.
I also wanted to backport it to 2.2.x sooner rather than later, to get closer to a WPS that can be
exposed on the internet, but at least the latter cannot be done until we settle this discussion.
Once people feel comfortable about the grouping I’ll push for the process API to be
graduated to supported status in GeoTools, and at that point the grouping will be cast
in stone for the foreseeable future.

Cheers
Andrea


On Mon, Aug 13, 2012 at 4:08 AM, Andrea Aime <andrea.aime@anonymised.com> wrote:

The trouble with a process being applicable to multiple groups is that it would show up multiple
times in the GUI, and it would not be possible to “disable a group” anymore: what would you
do with a process that is in both a disabled group and an enabled one?

I actually don’t see this as such a big issue. It is up to the person writing the GUI to choose the categories that make sense. There would be two sets of groupings in this “tagged” system: grouping by source/factory, in which the individual groups would be mutually exclusive, and grouping by “tag”, which would not be. As you say, it probably makes sense only to allow disabling based on source, rather than tag.
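
A small sketch of that split, where enabling and disabling works on the (mutually exclusive) source only and tags are used purely for browsing. TaggedProcessCatalog and its methods are hypothetical, and the tag assignments in the usage note are illustrative.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical catalog: one source per process, any number of tags per process. */
final class TaggedProcessCatalog {
    private final Map<String, String> sourceByProcess = new HashMap<>();
    private final Map<String, Set<String>> tagsByProcess = new HashMap<>();
    private final Set<String> disabledSources = new HashSet<>();

    /** Registers a process with its single source and its (possibly overlapping) tags. */
    void add(String process, String source, String... tags) {
        sourceByProcess.put(process, source);
        tagsByProcess.put(process, new HashSet<>(Arrays.asList(tags)));
    }

    /** Disabling is only possible per source, so the answer is never ambiguous. */
    void disableSource(String source) {
        disabledSources.add(source);
    }

    boolean isEnabled(String process) {
        String source = sourceByProcess.get(process);
        return source != null && !disabledSources.contains(source);
    }

    /** Lists enabled processes carrying the given tag, sorted by name. */
    SortedSet<String> enabledWithTag(String tag) {
        SortedSet<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : tagsByProcess.entrySet()) {
            if (entry.getValue().contains(tag) && isEnabled(entry.getKey())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

If “gs:Contour” is tagged both “raster” and “vector” it shows up under both tags, but whether it is enabled depends only on the “gs” source.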

My suggestion: if people are so uncomfortable with the current grouping let’s make a new one,
but have it be just one, and possibly quickly, and let’s use ProcessFilter to stand up aliases
so that old clients can keep on working.

While I think a better source grouping is probably possible, I don’t think it solves the problem, and as mentioned before I think coming up with the grouping will be hard. And trying to come up with a grouping that will scale to future processes will be even harder. But perhaps we could come up with a very coarse grouping for now. Like even just “geometry”, “raster” and “vector”.

I really want to commit this process filtering thing and this discussion is putting a wrench into it.
I also wanted to backport it to 2.2.x sooner rather than later, to get closer to a WPS that can be
exposed on the internet, but at least the latter cannot be done until we settle this discussion.
Once people feel comfortable about the grouping I’ll push for the process API to be
graduated to supported status in GeoTools, and at that point the grouping will be cast
in stone for the foreseeable future.

I can sympathize, but you asked for feedback on this topic and this is something that we have a vested interest in. Perhaps if more free rein is needed in the module it should be moved back into community space.

That said I don’t think any of this discussion blocks your current work. In my mind the process filtering is orthogonal to categorization with additional metadata. I am fine with starting a separate topic for that and holding off the discussion until a later time in which we have a more formal mandate / proposal.



Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

On Mon, Aug 13, 2012 at 3:34 PM, Justin Deoliveira <jdeolive@anonymised.com> wrote:

I can sympathize, but you asked for feedback on this topic and this is something that we have a vested interest in. Perhaps if more free rein is needed in the module it should be moved back into community space.

Justin, during the last years I’ve almost single-handedly developed WPS (besides your precious help setting up PPIO
and XML parsing, which I appreciated a lot), asking the community for feedback at basically every single step of the way.
That’s why I’m so surprised to be getting negative feedback now for something that was laid out in the open years ago.

That said I don’t think any of this discussion blocks your current work. In my mind the process filtering is orthogonal to categorization with additional metadata. I am fine with starting a separate topic for that and holding off the discussion until a later time in which we have a more formal mandate / proposal.

That would help… but I’m getting confused about the current work and future changes.

Say we commit the code as is, and the current process factory structure is exposed in the configuration
(just as it’s already exposed with the current name set).

At some point someone comes in with a proposal for tags… where would these tags show up?
It seems to me you are not talking about GeoServer anymore, but about some external front-end to
the processes?

You say we could have the processes follow a coarse classification; I would be fine with that.
How about geo (geometry), ft (features), rs (rasters), tx (raster to vector and vice-versa)?
Would that be clean enough to be preserved in the future?
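
As a sketch only, the four proposed prefixes could be captured in an enum so that the category of a process is derived from its identifier. ProcessCategory, the constant names and the “prefix:name” layout are assumptions made for this example, and the process names in the usage note are invented.

```java
import java.util.Optional;

/** Hypothetical enum for the coarse classification: geo, ft, rs, tx. */
enum ProcessCategory {
    GEOMETRY("geo"),
    FEATURES("ft"),
    RASTERS("rs"),
    RASTER_VECTOR_CONVERSION("tx");

    final String prefix;

    ProcessCategory(String prefix) {
        this.prefix = prefix;
    }

    /** Derives the category from a "prefix:name" identifier; empty if the prefix is not one of the four. */
    static Optional<ProcessCategory> ofProcess(String processName) {
        int colon = processName.indexOf(':');
        if (colon <= 0) {
            return Optional.empty();
        }
        String candidate = processName.substring(0, colon);
        for (ProcessCategory category : values()) {
            if (category.prefix.equals(candidate)) {
                return Optional.of(category);
            }
        }
        return Optional.empty();
    }
}
```

Here ofProcess("geo:buffer") would map to GEOMETRY, while an identifier with any other prefix, such as “gs:Bounds”, maps to no category.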

While I’m at it I would also remove the GeoServer prefix/namespace, which caused so much
discussion and discontent so far.

Cheers
Andrea


On Mon, Aug 13, 2012 at 3:52 PM, Andrea Aime <andrea.aime@anonymised.com> wrote:

I can sympathize, but you asked for feedback on this topic and this is something that we have a vested interest in. Perhaps if more free rein is needed in the module it should be moved back into community space.

Justin, during the last years I’ve almost single-handedly developed WPS (besides your precious help setting up PPIO
and XML parsing, which I appreciated a lot), asking the community for feedback at basically every single step of the way.
That’s why I’m so surprised to be getting negative feedback now for something that was laid out in the open years ago.

Furthermore, the layout of the factories, as it is now, was already present in the WPS module the day I asked the PSC to
graduate the module to supported status, and I’m not the one asking for changes in it. I’m not really the one that is
asking for “more free rein” here, am I?

Cheers
Andrea


On Mon, Aug 13, 2012 at 7:52 AM, Andrea Aime <andrea.aime@anonymised.com> wrote:

Justin, during the last years I’ve almost single-handedly developed WPS (besides your precious help setting up PPIO
and XML parsing, which I appreciated a lot), asking the community for feedback at basically every single step of the way.
That’s why I’m so surprised to be getting negative feedback now for something that was laid out in the open years ago.

I am sorry you are interpreting my feedback as negative; it is not meant to be. You asked for feedback and I am trying to present some, and I have already stated that this is an idea and just a discussion, not a negative vote against your original proposal. And certainly I recognize your superhuman efforts over the past few years with WPS (and many other areas)… without them WPS/processing would still just be an idea.

That said I don’t think any of this discussion blocks your current work. In my mind the process filtering is orthogonal to categorization with additional metadata. I am fine with starting a separate topic for that and holding off the discussion until a later time in which we have a more formal mandate / proposal.

That would help… but I’m getting confused about the current work and future changes.

Say we commit the code as is, and the current process factory structure is exposed in the configuration
(just as it’s already exposed with the current name set).

At some point someone comes in with a proposal for tags… where would these tags show up?
It seems to me you are not talking about GeoServer anymore, but about some external front-end to
the processes?

Indeed I am more targeting GeoTools with this idea and thinking of apps on top of it other than just GeoServer, such as uDig, GeoScript, front-end JavaScript apps, etc…

For instance I think such a system could be used to implement an equivalent to ArcToolbox in uDig, etc…

You say we could have the processes follow a coarse classification, I would be fine with that.
How about geo (geometry), ft (features), rs (rasters), tx (raster to vector and vice-versa)?
Would that be clean enough to be preserved in the future?

Yup, I like that grouping.



Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.

On Mon, Aug 13, 2012 at 7:55 AM, Andrea Aime <andrea.aime@anonymised.com> wrote:

Justin, during the last years I’ve almost single-handedly developed WPS (besides your precious help setting up PPIO
and XML parsing, which I appreciated a lot), asking the community for feedback at basically every single step of the way.
That’s why I’m so surprised to be getting negative feedback now for something that was laid out in the open years ago.

Furthermore, the layout of the factories, as it is now, was already present in the WPS module the day I asked the PSC to
graduate the module to supported status, and I’m not the one asking for changes in it. I’m not really the one that is
asking for “more free rein” here, am I?

I guess I was under the mistaken impression that things could change over time. Looking over the proposal I don’t see anything explicit about the grouping of processes, but you are correct, it would have been better if I had thought of it at that time and presented feedback then.

Anyways, given the argumentative tone of this discussion I will “withdraw” my original idea and drop the topic.



Justin Deoliveira
OpenGeo - http://opengeo.org
Enterprise support for open source geospatial.