[Geoserver-devel] (jdbc) resource store: platform additions

Hi Everyone,

We have a pull request for the jdbc resource store (as well as some additional resources port commits )
See https://github.com/geoserver/geoserver/pull/1361

It has been fully reviewed by Ben, but there is some new resources related stuff added to the platform module that I wanted to check with the community as well.

1) There is kind of some new API, but it is completely separate from the existing API (and does not alter any existing API in any way): the ResourceWatcher interface with SimpleResourceWatcher implementation.
See this commit: https://github.com/NielsCharlier/geoserver/commit/3d36223546bea462d972e931152dc07a2b5039e7

This is an interface/class that, completely separately from the file system (hence it is not used by the default resource store implementations) store resource listeners, receives resource notifications and sends these to the listeners. The interface allows us to implement a clustered version in the hzcluster module that works perfectly together with the jdbcresourcestore without either of those two modules depending on each other.

It is not being used for anything else as of yet but it can be used by other resource implementations and other cluster implementations as well.

2) Following up on the resources port, I have also added some new utility methods to the Resources class.
https://github.com/NielsCharlier/geoserver/commit/daa934dcacf3064cf14ff5d32f0d16537e7f3598

Previously, we had Files.url(String)->File, taking a url as a String and returning a File. And we had GeoServerDataDirectory.urlToResource(URL)->Resource which supports a custom "resource:" scheme (which was unsupported by the url(String) method).

Instead, I have created instead Resources.fromURL(URL)->Resource and Resources.fromURL(String)->Resource which both support the "resource:" scheme (in analogy with the already earlier added fromPath(String)).
The Files.url(String)->File is now calling Resources.fromURL() and has been deprecated.

Kind Regards
Niels

Thanks Niels for taking this to the development list, I would like to ask you to bring these things up as they occur to you (rather than wait for the pull request review ) this will make your pull requests easier to review and more efficient for everyone here.

Or you can work like Justin and bring things to the dev list after you have a POC in hand.

Moving on to the two changes …

  1. ResourceWatcher

I need a chance to look at ResourceWatcher - as I thought we had this covered already with ResourceListener. I figured each ResourceStore should be responsible for its own events … since they would need to watch the file system or pay attention to service notifications etc…

Are you doing this for a technical reason (code reuse) or an architectural reason (change to design)?

  • technical reason - please just reuse an implementation class as a delegate
  • architecture reason: I was trying to have ResourceStore be the single source of truth for resources (including change notification). Is there a reason why this should change?

I had a technical reason for wanting a new FileSystem watcher created - that uses the Java 7 watcher API to monitor the file system (our current implementation just has a thread and checks timestamps).

Sounds like you have an architecture reason - wanting to inject an alternate implementation for event dispatch when working in a cluster.

Two questions:

  • If you are really focused on event dispatch can we rename the interface to match?
  • Should we have an event dispatch mechanism that works in a similar fashion for catalog and resource change events?
  1. New utility methods

Resources.toURL( resource )

  • makes a lot of sense and reads better in the code

I am not too worried about additional utility methods, instead I am concerned about changes to objects we may have to implement (such as GeoServerResourceLoader).

GeoServerResourceLoader.fromURL(String url)
GeoServerResourceLoader.fromURL(URL url)

  • yeah I know you did not write this - just added one that takes a URL (do you really need both?)
  • these are the worst methods in the mix - since they have been asked to do so much by so many bits of code. Like you can read the javadocs and still not understand what will happen

GeoServerResourceLoader.fromPath(String path) {

  • I like that it is consistent with the from URL methods…
  • These are a bit awkward as they add API to GeoServerResourceLoader (trying to encourage code to just use ResourceStore and utility classes)
  • Actually fromPath just delegate to Resources utility class …

public Resource fromPath(String path) {
return Resources.fromPath(path, resources.get(Paths.BASE));
}

So no need to add helper method here use:

Resource r = Resources.fromPath( path, loader.get(Paths.BASE));

Only thing I can think of is that you are trying to hide the fact we have a DATA_DIRECTORY used as starting point?

···

On 16 December 2015 at 02:11, Niels Charlier <niels@anonymised.com> wrote:

Hi Everyone,

We have a pull request for the jdbc resource store (as well as some additional resources port commits )
See https://github.com/geoserver/geoserver/pull/1361

It has been fully reviewed by Ben, but there is some new resources related stuff added to the platform module that I wanted to check with the community as well.

  1. There is kind of some new API, but it is completely separate from the existing API (and does not alter any existing API in any way): the ResourceWatcher interface with SimpleResourceWatcher implementation.
    See this commit: https://github.com/NielsCharlier/geoserver/commit/3d36223546bea462d972e931152dc07a2b5039e7

This is an interface/class that, completely separately from the file system (hence it is not used by the default resource store implementations) store resource listeners, receives resource notifications and sends these to the listeners. The interface allows us to implement a clustered version in the hzcluster module that works perfectly together with the jdbcresourcestore without either of those two modules depending on each other.

It is not being used for anything else as of yet but it can be used by other resource implementations and other cluster implementations as well.

  1. Following up on the resources port, I have also added some new utility methods to the Resources class.
    https://github.com/NielsCharlier/geoserver/commit/daa934dcacf3064cf14ff5d32f0d16537e7f3598

Previously, we had Files.url(String)->File, taking a url as a String and returning a File. And we had GeoServerDataDirectory.urlToResource(URL)->Resource which supports a custom “resource:” scheme (which was unsupported by the url(String) method).

Instead, I have created instead Resources.fromURL(URL)->Resource and Resources.fromURL(String)->Resource which both support the “resource:” scheme (in analogy with the already earlier added fromPath(String)).
The Files.url(String)->File is now calling Resources.fromURL() and has been deprecated.

Kind Regards
Niels


Jody Garnett

Hi Jody,

Thanks for your review.

On 17-12-15 00:36, Jody Garnett wrote:

- architecture reason: I was trying to have ResourceStore be the single source of truth for resources (including change notification). Is there a reason why this should change?

Yes and no.

There is a very clear requirement behind this: Resource notifications need to be sent in a clustered environment. (the main purpose of the jdbcstore at this moment is clustered environments. )
So we are left with ONLY these three possible options:
(1) Enforce every one who uses the jdbcstore to use hazelcast (BAD)
(2) Enforce every one who uses hazelcast to use the jdbcstore (BAD)
(3) Add an additional abstraction for the distribution of resource notifications, kept separate from the ResourceStore (which I called ResourceWatcher).
There is no alternative, if you agree that (1) and (2) are bad options, you simply must agree with point (3).

BUT, the default store, the FileSystemBasedStore is not intended to use with clustering (that is why we made a file independent store in the first place). As such, the FileSystemBasedStore has no use for this additional abstraction. But that is no problem: because this isn't being enforced. The ResourceStore already provides the API needed for adding handlers to resources and this remains unaltered, the programmer who uses Resources need not worry about the existence of the ResourceWatcher as they never need to use it. Alternative implementations of ResourceStore may choose to use it, which would be recommended if they wish to support clustering, but that is all.

Two questions:
- If you are really focused on event dispatch can we rename the interface to match?

Sure, ResourceNotificationDispatcher then.

- Should we have an event dispatch mechanism that works in a similar fashion for catalog and resource change events?

Bit confused, resource change events is what we are talking about. Catalog: good point. I don't know if there is a mechanism in place for distributing these in a clustered environment, but if not we do need such a mechanism. It is possible this can be solved with a different mechanism.

2) New utility methods

Resources.toURL( resource )
- makes a lot of sense and reads better in the code

I am not too worried about additional utility methods, instead I am concerned about changes to objects we may have to implement (such as GeoServerResourceLoader).

GeoServerResourceLoader.fromURL(String url)
GeoServerResourceLoader.fromURL(URL url)
- yeah I know you did not write this - just added one that takes a URL (do you really need both?)

Yes, you do. The fromURL(String) method accepts things that would be rejected by a normal URL parser, abolishing it would be a serious regression causing many configurations to fail.

GeoServerResourceLoader.fromPath(String path) {
- I like that it is consistent with the from URL methods...
- These are a bit awkward as they add API to GeoServerResourceLoader (trying to encourage code to just use ResourceStore and utility classes)
- Actually fromPath just delegate to Resources utility class ...

    public Resource fromPath(String path) {
       return Resources.fromPath(path, resources.get(Paths.BASE));
    }

  So no need to add helper method here use:
    Resource r = Resources.fromPath( path, loader.get(Paths.BASE));

  Only thing I can think of is that you are trying to hide the fact we have a DATA_DIRECTORY used as starting point?

These are pure convenience methods. I made them in analogy to the previously already existing GeoServerResourceLoader.url convenience method, which works on the exact same principle and was already there. If you insist they can be removed, but I personally don't see the harm and I find them neater to use than having to call Paths.BASE etc... But it's up to you, say the word.

Regards
Niels

On 18-12-15 16:04, Niels Charlier wrote:

- yeah I know you did not write this - just added one that takes a URL (do you really need both?)

Yes, you do. The fromURL(String) method accepts things that would be rejected by a normal URL parser, abolishing it would be a serious regression causing many configurations to fail.

Oh, I forgot to mention. I did NOT add one that takes a URL. This method was merely moved from another spot (GeoServerDataDirectory). An illogical spot, the only reason for it being there is because it is being used there. None of this URL stuff is new, this is just refactoring, moving things together that belong together and make them consistent.

Regards
Niels

Comments inline on event distribution … and then we should be good to go. If it helps I think we can seperate event notification (generating events/listening for events) from event distribution (in a cluster, recorded in an audit log).

···

On 18 December 2015 at 07:04, Niels Charlier <niels@anonymised.com> wrote:

no problem: because this isn’t being enforced. The ResourceStore already provides the API needed for adding handlers to resources and this remains unaltered, the programmer who uses Resources need not worry about the existence of the ResourceWatcher as they never need to use it. Alternative implementations of ResourceStore may choose to use it, which would be recommended if they wish to support clustering, but that is all.

So the missing bit for me is the hazelcast and jms plugins - they both implement an interface for event distribution. I expected JDBCResoruceStore work to integrate smoothly with that … but had not looked into the details.

Two questions:

  • If you are really focused on event dispatch can we rename the interface to match?

Sure, ResourceNotificationDispatcher then.

Thanks that helps keep my head from exploding.

  • Should we have an event dispatch mechanism that works in a similar fashion for catalog and resource change events?

Bit confused, resource change events is what we are talking about. Catalog: good point. I don’t know if there is a mechanism in place for distributing these in a clustered environment, but if not we do need such a mechanism. It is possible this can be solved with a different mechanism.

There is a mechanism, and two implementations. I would like to combine forces rather than duplicate.

Implementations:

Can you take a moment to look at those and tell me what you think?

And utility methods …

···
  1. New utility methods

Resources.toURL( resource )

  • makes a lot of sense and reads better in the code

I am not too worried about additional utility methods, instead I am concerned about changes to objects we may have to implement (such as GeoServerResourceLoader).

GeoServerResourceLoader.fromURL(String url)
GeoServerResourceLoader.fromURL(URL url)

  • yeah I know you did not write this - just added one that takes a URL (do you really need both?)

Yes, you do. The fromURL(String) method accepts things that would be rejected by a normal URL parser, abolishing it would be a serious regression causing many configurations to fail.

Oh, I forgot to mention. I did NOT add one that takes a URL. This method was merely moved from another spot (GeoServerDataDirectory). An illogical spot, the only reason for it being there is because it is being used there. None of this URL stuff is new, this is just refactoring, moving things together that belong together and make them consistent.

Okay now we are making sense :slight_smile:

Here is the annoying difference in class responsibility (you can also check the javadocs):

  • GeoServerResourceLoader is supposed to obtain a resource from the configuration system (even if that is a JDBC blob).
  • GeoServerDataDirectory is supposed to be the GEOSERVER_DATA_DIRECTORY.

So this is why they were seperate - I like us gradually gathering up all the useful methods in GeoServerResourceLoader. This “fromURL” is always going to be strange because it is used resolve data references

public Resource fromPath(String path) {
return Resources.fromPath(path, resources.get(Paths.BASE));
}

So no need to add helper method here use:
Resource r = Resources.fromPath( path, loader.get(Paths.BASE));

Only thing I can think of is that you are trying to hide the fact we have a DATA_DIRECTORY used as starting point?

These are pure convenience methods. I made them in analogy to the previously already existing GeoServerResourceLoader.url convenience method, which works on the exact same principle and was already there. If you insist they can be removed, but I personally don’t see the harm and I find them neater to use than having to call Paths.BASE etc… But it’s up to you, say the word.

Keep them, but be very clear about what they do :slight_smile:

But really we should chat next month about removing deprecated methods so this API is less complex over time.

I'd suggest merging ResourceLoader and GeoServerDataDirectory. Both of them provide mostly a lot of convenience methods and a little bit of extra logic on top of the ResourceStore. We can remove a lot of the depecrated stuff now.

On 19-12-15 01:43, Jody Garnett wrote:

And utility methods ...

        2) New utility methods

        Resources.toURL( resource )
         - makes a lot of sense and reads better in the code

        I am not too worried about additional utility methods, instead
        I am concerned about changes to objects we may have to
        implement (such as GeoServerResourceLoader).

        GeoServerResourceLoader.fromURL(String url)
        GeoServerResourceLoader.fromURL(URL url)
        - yeah I know you did not write this - just added one that
        takes a URL (do you really need both?)

    Yes, you do. The fromURL(String) method accepts things that would
    be rejected by a normal URL parser, abolishing it would be a
    serious regression causing many configurations to fail.

    Oh, I forgot to mention. I did NOT add one that takes a URL. This
    method was merely moved from another spot
    (GeoServerDataDirectory). An illogical spot, the only reason for
    it being there is because it is being used there. None of this URL
    stuff is new, this is just refactoring, moving things together
    that belong together and make them consistent.

Okay now we are making sense :slight_smile:

Here is the annoying difference in class responsibility (you can also check the javadocs):
* GeoServerResourceLoader is supposed to obtain a resource from the configuration system (even if that is a JDBC blob).
* GeoServerDataDirectory is supposed to *be* the GEOSERVER_DATA_DIRECTORY.

So this is why they were seperate - I like us gradually gathering up all the useful methods in GeoServerResourceLoader. This "fromURL" is always going to be strange because it is used resolve data references

          public Resource fromPath(String path) {
               return Resources.fromPath(path, resources.get(Paths.BASE));
            }

          So no need to add helper method here use:
            Resource r = Resources.fromPath( path,
        loader.get(Paths.BASE));

          Only thing I can think of is that you are trying to hide the
        fact we have a DATA_DIRECTORY used as starting point?

    These are pure convenience methods. I made them in analogy to the
    previously already existing GeoServerResourceLoader.url
    convenience method, which works on the exact same principle and
    was already there. If you insist they can be removed, but I
    personally don't see the harm and I find them neater to use than
    having to call Paths.BASE etc... But it's up to you, say the word.

Keep them, but be very clear about what they do :slight_smile:

But really we should chat next month about removing deprecated methods so this API is less complex over time.

Alright. What the clustering modules do is simply registering itself as a CatalogListener and then dispatch events to other geoserver instances. This is a fine strategy for the catalog, I don't see any reason to change it.

The problem is that with resources, listeners are always placed on a particular resource. I guess there is one way to do it using a similar strategy which I hadn't thought of before (and, admittedly, a fourth option to the list I previously claimed was exhaustive) and that to let the hazelcast module register as a listener to the Root resource (which should receive a notification for every change), rather than implementing a ResourceNotificationDispatcher. However, that would be first of all a much uglier, more round-about and more bug-prone way of doing things than what I did. Furthermore, what I did places code that is likely reusable by other implementations of stores/clusters in a shared space.

Regards
Niels

On 19-12-15 01:03, Jody Garnett wrote:

Comments inline on event distribution ... and then we should be good to go. If it helps I think we can seperate event notification (generating events/listening for events) from event distribution (in a cluster, recorded in an audit log).

On 18 December 2015 at 07:04, Niels Charlier <niels@anonymised.com <mailto:niels@anonymised.com>> wrote:

    no problem: because this isn't being enforced. The ResourceStore
    already provides the API needed for adding handlers to resources
    and this remains unaltered, the programmer who uses Resources need
    not worry about the existence of the ResourceWatcher as they never
    need to use it. Alternative implementations of ResourceStore may
    choose to use it, which would be recommended if they wish to
    support clustering, but that is all.

So the missing bit for me is the hazelcast and jms plugins - they both implement an interface for event distribution. I expected JDBCResoruceStore work to integrate smoothly with that ... but had not looked into the details.

        Two questions:
        - If you are really focused on event dispatch can we rename
        the interface to match?

    Sure, ResourceNotificationDispatcher then.

Thanks that helps keep my head from exploding.

        - Should we have an event dispatch mechanism that works in a
        similar fashion for catalog and resource change events?

    Bit confused, resource change events is what we are talking about.
    Catalog: good point. I don't know if there is a mechanism in place
    for distributing these in a clustered environment, but if not we
    do need such a mechanism. It is possible this can be solved with a
    different mechanism.

There is a mechanism, and two implementations. I would like to combine forces rather than duplicate.

Implementations:

- https://github.com/geoserver/geoserver/tree/master/src/community/hz-cluster
- https://github.com/geoserver/geoserver/tree/master/src/community/jms-cluster

Can you take a moment to look at those and tell me what you think?

Thanks Neils, that approach matches my idea - I had in mind placing a listener on each style directory - and had hoped that we had relatively few other folders to track. I particular I was designing for a clustered environment and had hoped cluster workload could be sharded a bit by workspace (so each node may only end up actively watching a couple style directories…)

I thought I had a single listener list and path matching to handle event notification. The add/remove listener methods on a single resource were an illusion (as resource if often a flyweight object).

public void addListener(ResourceListener listener) {
FileSystemResourceStore.this.addListener( file, path, listener );
}

Which called through to:

public synchronized void addListener(File file, String path, ResourceListener listener) {
if( watcher == null ){
watcher = new FileSystemWatcher();
}
watcher.addListener( file, path, listener );
}

I cannot really see the difference between this and how the other clustering solutions get away with just registering a catalog listener?

I double checked and was a bit disappointed not to find anything more then a catalog listener in the clustering solutions - but I guess it makes sense. Can I turn this question around? Is there any advantage to having a single “cluster event dispatch” that could be used by JDBCConfig, JDBCResourceStore and the clustered WPS implementation?

···

On 21 December 2015 at 02:38, Niels Charlier <niels@anonymised.com> wrote:

Alright. What the clustering modules do is simply registering itself as a CatalogListener and then dispatch events to other geoserver instances. This is a fine strategy for the catalog, I don’t see any reason to change it.

The problem is that with resources, listeners are always placed on a particular resource. I guess there is one way to do it using a similar strategy which I hadn’t thought of before (and, admittedly, a fourth option to the list I previously claimed was exhaustive) and that to let the hazelcast module register as a listener to the Root resource (which should receive a notification for every change), rather than implementing a ResourceNotificationDispatcher. However, that would be first of all a much uglier, more round-about and more bug-prone way of doing things than what I did. Furthermore, what I did places code that is likely reusable by other implementations of stores/clusters in a shared space.

Regards
Niels

On 19-12-15 01:03, Jody Garnett wrote:

Comments inline on event distribution … and then we should be good to go. If it helps I think we can seperate event notification (generating events/listening for events) from event distribution (in a cluster, recorded in an audit log).


Jody Garnett

On 18 December 2015 at 07:04, Niels Charlier <niels@anonymised.com> wrote:

no problem: because this isn’t being enforced. The ResourceStore already provides the API needed for adding handlers to resources and this remains unaltered, the programmer who uses Resources need not worry about the existence of the ResourceWatcher as they never need to use it. Alternative implementations of ResourceStore may choose to use it, which would be recommended if they wish to support clustering, but that is all.

So the missing bit for me is the hazelcast and jms plugins - they both implement an interface for event distribution. I expected JDBCResoruceStore work to integrate smoothly with that … but had not looked into the details.

Two questions:

  • If you are really focused on event dispatch can we rename the interface to match?

Sure, ResourceNotificationDispatcher then.

Thanks that helps keep my head from exploding.

  • Should we have an event dispatch mechanism that works in a similar fashion for catalog and resource change events?

Bit confused, resource change events is what we are talking about. Catalog: good point. I don’t know if there is a mechanism in place for distributing these in a clustered environment, but if not we do need such a mechanism. It is possible this can be solved with a different mechanism.

There is a mechanism, and two implementations. I would like to combine forces rather than duplicate.

Implementations:

Can you take a moment to look at those and tell me what you think?

Sounds excellent, thanks for your hard work.

···

On 21 December 2015 at 02:12, Niels Charlier <niels@anonymised.com> wrote:

I’d suggest merging ResourceLoader and GeoServerDataDirectory. Both of them provide mostly a lot of convenience methods and a little bit of extra logic on top of the ResourceStore. We can remove a lot of the depecrated stuff now.

On 19-12-15 01:43, Jody Garnett wrote:

And utility methods …


Jody Garnett

  1. New utility methods

Resources.toURL( resource )

  • makes a lot of sense and reads better in the code

I am not too worried about additional utility methods, instead I am concerned about changes to objects we may have to implement (such as GeoServerResourceLoader).

GeoServerResourceLoader.fromURL(String url)
GeoServerResourceLoader.fromURL(URL url)

  • yeah I know you did not write this - just added one that takes a URL (do you really need both?)

Yes, you do. The fromURL(String) method accepts things that would be rejected by a normal URL parser, abolishing it would be a serious regression causing many configurations to fail.

Oh, I forgot to mention. I did NOT add one that takes a URL. This method was merely moved from another spot (GeoServerDataDirectory). An illogical spot, the only reason for it being there is because it is being used there. None of this URL stuff is new, this is just refactoring, moving things together that belong together and make them consistent.

Okay now we are making sense :slight_smile:

Here is the annoying difference in class responsibility (you can also check the javadocs):

  • GeoServerResourceLoader is supposed to obtain a resource from the configuration system (even if that is a JDBC blob).
  • GeoServerDataDirectory is supposed to be the GEOSERVER_DATA_DIRECTORY.

So this is why they were seperate - I like us gradually gathering up all the useful methods in GeoServerResourceLoader. This “fromURL” is always going to be strange because it is used resolve data references

public Resource fromPath(String path) {
return Resources.fromPath(path, resources.get(Paths.BASE));
}

So no need to add helper method here use:
Resource r = Resources.fromPath( path, loader.get(Paths.BASE));

Only thing I can think of is that you are trying to hide the fact we have a DATA_DIRECTORY used as starting point?

These are pure convenience methods. I made them in analogy to the previously already existing GeoServerResourceLoader.url convenience method, which works on the exact same principle and was already there. If you insist they can be removed, but I personally don’t see the harm and I find them neater to use than having to call Paths.BASE etc… But it’s up to you, say the word.

Keep them, but be very clear about what they do :slight_smile:

But really we should chat next month about removing deprecated methods so this API is less complex over time.

Jody,

have your questions been addressed to your satisfaction? Are we ready to merge PR #1361 as it is now?
https://github.com/geoserver/geoserver/pull/1361

Kind regards,
Ben.

On 22/12/15 15:06, Jody Garnett wrote:

Sounds excellent, thanks for your hard work.

--
Ben Caradoc-Davies <ben@anonymised.com>
Director
Transient Software Limited <http://transient.nz/&gt;
New Zealand

Looks like the methods have been addressed, still circling a bit on the event notification.

What is your impression Niels?

···

On 21 December 2015 at 18:26, Ben Caradoc-Davies <ben@anonymised.com> wrote:

Jody,

have your questions been addressed to your satisfaction? Are we ready to merge PR #1361 as it is now?
https://github.com/geoserver/geoserver/pull/1361

Kind regards,
Ben.

On 22/12/15 15:06, Jody Garnett wrote:

Sounds excellent, thanks for your hard work.


Ben Caradoc-Davies <ben@anonymised.com>
Director
Transient Software Limited <http://transient.nz/>
New Zealand


Jody Garnett

Hi Jody,

The requirement is that if a resource listener is placed on a resource, any change to that resource should be notified, also if that change occurred in another instance of Geoserver in a clustered environment. Just like the FileBasedResources notify all changes to a file. I would not expect the clustering modules to only synchronise style changes and ignore the rest of resources. The idea is that if someone changes any configuration file, that not all instances must be restarted.

Even if listeners being attached to resources is an illusion (that is the implementation for FileBasedResources), they are not API-wise. The clustering modules must use the available ResourceStore API, irrespective of implementation, which attaches listeners to resources.

As I said, we could be listenening to the root folder of the resource store in order to get notified of all changes. However that does not solve the whole problem. The clustering modules would then be responsible for notifying all the appropriate resources of events they received from other instances. This is not possible with the current API. So we would still need additional API to do this. I feel that would result in something very messy.

I am still convinced that what I did is the cleanest and most elegant solution to the problem.

Can I turn this question around? Is there any advantage to having a single "cluster event dispatch" that could be used by JDBCConfig, JDBCResourceStore and the clustered WPS implementation?

I guess so, but I'd need to see better illustrated what you mean by that, and what you wish to accomplish, to answer that.

Kind Regards
Niels

The requirement is that if a resource listener is placed on a resource,
any change to that resource should be notified, also if that change
occurred in another instance of Geoserver in a clustered environment. Just
like the FileBasedResources notify all changes to a file. I would not
expect the clustering modules to only synchronise style changes and ignore
the rest of resources. The idea is that if someone changes any
configuration file, that not all instances must be restarted.

Think you are missing the idea - in most cases these resources are not
unpacked on disk and are retrieved by JDBCResourceStore. I do not know what
that will do to performance - but that is why JDBCResourceStore is a
community module. If we need to cache the bytes for small resources then
those nodes would indeed need to listen for events to keep their cache in
sync.

So what does have to listen for events? Only a few things (templates,
security files, ...) if you would like to search the code we can have a
better conversation.

Even if listeners being attached to resources is an illusion (that is the

implementation for FileBasedResources), they are not API-wise.

I had started with a single place to register for events, and refactored as
you see based on making the code examples easier to use. But I guess you
are correct - this approach is my own recommendation - as an implementor
you can choose to implement heavyweight resource objects (but we do not
have any dispose() methods since that was not the plan).

The clustering modules must use the available ResourceStore API,

irrespective of implementation, which attaches listeners to resources.

I do not see how that matters, you are defining new API which could be
injected into the ResourceStore implementation if needed.

As I said, we could be listenening to the root folder of the resource
store in order to get notified of all changes. However that does not solve
the whole problem. The clustering modules would then be responsible for
notifying all the appropriate resources of events they received from other
instances. This is not possible with the current API. So we would still
need additional API to do this. I feel that would result in something very
messy.

I am still convinced that what I did is the cleanest and most elegant
solution to the problem.

I am not unconvinced Niels, just trying to make sure we plan for the larger
picture. See the next question ...

Can I turn this question around? Is there any advantage to having a single

"cluster event dispatch" that could be used by JDBCConfig,
JDBCResourceStore and the clustered WPS implementation?

I guess so, but I'd need to see better illustrated what you mean by that,
and what you wish to accomplish, to answer that.

I am trying to avoid duplicating event notification across a cluster three
times.

TLDR: I am okay with your approach, if we have a larger design to work
towards, but not if we are only making clustering harder and harder to pull
off in the future.

On 22-12-15 23:00, Jody Garnett wrote:

Think you are missing the idea - in most cases these resources are not unpacked on disk and are retrieved by JDBCResourceStore. I do not know what that will do to performance - but that is why JDBCResourceStore is a community module. If we need to cache the bytes for small resources then those nodes would indeed need to listen for events to keep their cache in sync.

So what does have to listen for events? Only a few things (templates, security files, ...) if you would like to search the code we can have a better conversation.

I think we need to be more on the same page on that which we want to accomplish. What Gabriel and I wanted to accomplish in this project, is to make Resource.addListener() work for jdbcstore as expected, under all circumstances. Not to keep anything particular in sync: that is the job of the module developers.
The developers must be sure that if they call Resource.addListener, they will be notified of any changes in the resource: code that works for the default store MUST also work for the jdbc store, that is the aim. When they use the file system store, they know they will be notified if someone edits the file on disk. When using the jdbcstore, they must also be notified of any change. We decided to assume nobody will edit the files in the db directly, so that we only have to deal with the issue of multiple instances.

I had started with a single place to register for events, and refactored as you see based on making the code examples easier to use. But I guess you are correct - this approach is my own recommendation - as an implementor you can choose to implement heavyweight resource objects (but we do not have any dispose() methods since that was not the plan).

Heavyweight resource objects would be nearly impossible with the existing API, but my point was only that the single place to register events is encapsulated by the ResourceStore API and thus inaccessible to other modules. The clustering modules do not have access to it, if limited to using the ResourceStore API. So this fact remains irrelevant as far as the implementation of these modules is concerned.

I do not see how that matters, you are defining new API which could be injected into the ResourceStore implementation if needed.

Indeed, and I was explaining why because you questioned the fact I am doing that :slight_smile:

I am trying to avoid duplicating event notification across a cluster three times.

That is absolutely impossible in my implementation. And in fact, that would only have been a risk if we went for a listening-and-redispatching strategy. That is one reason I called it error-prone.

Regards
Niels

G’Day Niels, sorry it has taken me until the new year to return to this conversation.

We discussed this pull request in the last geoserver meeting:

  • Andrea is creating a draft pull request review guide page to try and prevent us tripping up this way in the future.
  • And I have created a proposal GSIP-136 for the api change you outline in your pull request.

I also had a chance to talk to Gabriel about an event notification strategy. He echoed your words that doing this consistently was an known unsolved problem - so I have creating a technical debt page with some feedback from Kevin on GWC.

···

On 23 December 2015 at 05:12, Niels Charlier <niels@anonymised.com> wrote:

On 22-12-15 23:00, Jody Garnett wrote:

Think you are missing the idea - in most cases these resources are not unpacked on disk and are retrieved by JDBCResourceStore. I do not know what that will do to performance - but that is why JDBCResourceStore is a community module. If we need to cache the bytes for small resources then those nodes would indeed need to listen for events to keep their cache in sync.

So what does have to listen for events? Only a few things (templates, security files, …) if you would like to search the code we can have a better conversation.

I think we need to be more on the same page on that which we want to accomplish. What Gabriel and I wanted to accomplish in this project, is to make Resource.addListener() work for jdbcstore as expected, under all circumstances. Not to keep anything particular in sync: that is the job of the module developers.
The developers must be sure that if they call Resource.addListener, they will be notified of any changes in the resource: code that works for the default store MUST also work for the jdbc store, that is the aim. When they use the file system store, they know they will be notified if someone edits the file on disk. When using the jdbcstore, they must also be notified of any change. We decided to assume nobody will edit the files in the db directly, so that we only have to deal with the issue of multiple instances.

I had started with a single place to register for events, and refactored as you see based on making the code examples easier to use. But I guess you are correct - this approach is my own recommendation - as an implementor you can choose to implement heavyweight resource objects (but we do not have any dispose() methods since that was not the plan).

Heavyweight resource objects would be nearly impossible with the existing API, but my point was only that the single place to register events is encapsulated by the ResourceStore API and thus inaccessible to other modules. The clustering modules do not have access to it, if limited to using the ResourceStore API. So this fact remains irrelevant as far as the implementation of these modules is concerned.

I do not see how that matters, you are defining new API which could be injected into the ResourceStore implementation if needed.

Indeed, and I was explaining why because you questioned the fact I am doing that :slight_smile:

I am trying to avoid duplicating event notification across a cluster three times.

That is absolutely impossible in my implementation. And in fact, that would only have been a risk if we went for a listening-and-redispatching strategy. That is one reason I called it error-prone.

Regards
Niels


Jody Garnett