[Geoserver-devel] gdal_translate config file and resource locks

Quick question Niels since we were discussing Resource and LockProvider the other day.

I am reviewing the goal_translate pull request, which uses a watcher to monitor a Resource for changes. Inside the code there is a read/write lock to protect access during a resource change event.

The watcher is just a short cut for looking out for resource changed events. I was wondering if the resource lock method should be used instead - the logic being that the LockProvider is configured under admin control and may be correctly configured for resource change events.

I expect the pull request is fine as written, we will need to carefully consider what order to apply event notification, resource.lock(), and code monitors or read/write locks to prevent deadlock in a clustered environment.

It would be good to highlight the use of a watcher like this, and see if it should migrate to direct ResourceListener use and if a Resource lock can be used. My rough idea is:

  1. resource lock when writing the change
  2. resource lock release when the event change is complete, followed by event notification to nodes in the cluster
  3. resource listeners (including any file watchers) are notified and read the contents. If they have an internal read/write lock on the resulting data structure it should not introduce deadlock

Key to the above being successful is ensuring that code such as dal_translate does not both hold a resource lock and their data structure read/write lock at the same time.

Let me know if the above makes sense and we can highlight it in the developers guide.

···


Jody Garnett

Hi Jody,

Makes sense. It looks to me like the PR does use the ResourceListener rather than FileWatcher.

Ideally, the resource.lock() should also be used to have a uniform API for locking.

Regards
Niels

On 05-08-15 19:53, Jody Garnett wrote:

Quick question Niels since we were discussing Resource and LockProvider the other day.

I am reviewing the goal_translate pull request, which uses a watcher to monitor a Resource for changes. Inside the code there is a read/write lock to protect access during a resource change event.

The watcher is just a short cut for looking out for resource changed events. I was wondering if the resource lock method should be used instead - the logic being that the LockProvider is configured under admin control and may be correctly configured for resource change events.

I expect the pull request is fine as written, we will need to carefully consider what order to apply event notification, resource.lock(), and code monitors or read/write locks to prevent deadlock in a clustered environment.

It would be good to highlight the use of a watcher like this, and see if it should migrate to direct ResourceListener use and if a Resource lock can be used. My rough idea is:

1. resource lock when writing the change
2. resource lock release when the event change is complete, followed by event notification to nodes in the cluster
3. resource listeners (including any file watchers) are notified and read the contents. If they have an internal read/write lock on the resulting data structure it should not introduce deadlock

Key to the above being successful is ensuring that code such as dal_translate does not both hold a resource lock and their data structure read/write lock at the same time.

Let me know if the above makes sense and we can highlight it in the developers guide.

--
Jody Garnett

Yeah. I considered making input stream wrappers that automatically used the read lock, and then released it on close(). Not sure if I did not successfully or not - you are welcome to review.

Point being that locks are involved in resource access; so me may need to look out for deadlock.

···

On 6 August 2015 at 04:24, Niels Charlier <niels@anonymised.com> wrote:

Hi Jody,

Makes sense. It looks to me like the PR does use the ResourceListener rather than FileWatcher.

Ideally, the resource.lock() should also be used to have a uniform API for locking.

Regards
Niels

On 05-08-15 19:53, Jody Garnett wrote:

Quick question Niels since we were discussing Resource and LockProvider the other day.

I am reviewing the goal_translate pull request, which uses a watcher to monitor a Resource for changes. Inside the code there is a read/write lock to protect access during a resource change event.

The watcher is just a short cut for looking out for resource changed events. I was wondering if the resource lock method should be used instead - the logic being that the LockProvider is configured under admin control and may be correctly configured for resource change events.

I expect the pull request is fine as written, we will need to carefully consider what order to apply event notification, resource.lock(), and code monitors or read/write locks to prevent deadlock in a clustered environment.

It would be good to highlight the use of a watcher like this, and see if it should migrate to direct ResourceListener use and if a Resource lock can be used. My rough idea is:

  1. resource lock when writing the change
  2. resource lock release when the event change is complete, followed by event notification to nodes in the cluster
  3. resource listeners (including any file watchers) are notified and read the contents. If they have an internal read/write lock on the resulting data structure it should not introduce deadlock

Key to the above being successful is ensuring that code such as dal_translate does not both hold a resource lock and their data structure read/write lock at the same time.

Let me know if the above makes sense and we can highlight it in the developers guide.


Jody Garnett


Jody Garnett

It should always be avoided to lock more than one resource at a time. Locking should only happen for a short period of time.

I agree with your proposal.

On 06-08-15 18:04, Jody Garnett wrote:

Yeah. I considered making input stream wrappers that automatically used the read lock, and then released it on close(). Not sure if I did not successfully or not - you are welcome to review.

Point being that locks are involved in resource access; so me may need to look out for deadlock.

--
Jody Garnett

On 6 August 2015 at 04:24, Niels Charlier <niels@anonymised.com <mailto:niels@anonymised.com>> wrote:

    Hi Jody,

    Makes sense. It looks to me like the PR does use the
    ResourceListener rather than FileWatcher.

    Ideally, the resource.lock() should also be used to have a uniform
    API for locking.

    Regards
    Niels

    On 05-08-15 19:53, Jody Garnett wrote:

        Quick question Niels since we were discussing Resource and
        LockProvider the other day.

        I am reviewing the goal_translate pull request, which uses a
        watcher to monitor a Resource for changes. Inside the code
        there is a read/write lock to protect access during a resource
        change event.

        The watcher is just a short cut for looking out for resource
        changed events. I was wondering if the resource lock method
        should be used instead - the logic being that the LockProvider
        is configured under admin control and may be correctly
        configured for resource change events.

        I expect the pull request is fine as written, we will need to
        carefully consider what order to apply event notification,
        resource.lock(), and code monitors or read/write locks to
        prevent deadlock in a clustered environment.

        It would be good to highlight the use of a watcher like this,
        and see if it should migrate to direct ResourceListener use
        and if a Resource lock can be used. My rough idea is:

        1. resource lock when writing the change
        2. resource lock release when the event change is complete,
        followed by event notification to nodes in the cluster
        3. resource listeners (including any file watchers) are
        notified and read the contents. If they have an internal
        read/write lock on the resulting data structure it should not
        introduce deadlock

        Key to the above being successful is ensuring that code such
        as dal_translate does not both hold a resource lock and their
        data structure read/write lock at the same time.

        Let me know if the above makes sense and we can highlight it
        in the developers guide.

        --
        Jody Garnett

Not a proposal - you can review the code for in() here:

https://github.com/geoserver/geoserver/blob/master/src/platform/src/main/java/org/geoserver/platform/resource/FileSystemResourceStore.java#L185

It is also worth checking the out() - it uses locks and writes to a temporary file, the close() method copies the file into position once it is completely written out.

This should mirror your workflow for a JDBCConfig ResourceStore implementation.

···

On 6 August 2015 at 09:16, Niels Charlier <niels@anonymised.com> wrote:

It should always be avoided to lock more than one resource at a time. Locking should only happen for a short period of time.

I agree with your proposal.

On 06-08-15 18:04, Jody Garnett wrote:

Yeah. I considered making input stream wrappers that automatically used the read lock, and then released it on close(). Not sure if I did not successfully or not - you are welcome to review.

Point being that locks are involved in resource access; so me may need to look out for deadlock.


Jody Garnett


Jody Garnett

On 6 August 2015 at 04:24, Niels Charlier <niels@anonymised.com> wrote:

Hi Jody,

Makes sense. It looks to me like the PR does use the ResourceListener rather than FileWatcher.

Ideally, the resource.lock() should also be used to have a uniform API for locking.

Regards
Niels

On 05-08-15 19:53, Jody Garnett wrote:

Quick question Niels since we were discussing Resource and LockProvider the other day.

I am reviewing the goal_translate pull request, which uses a watcher to monitor a Resource for changes. Inside the code there is a read/write lock to protect access during a resource change event.

The watcher is just a short cut for looking out for resource changed events. I was wondering if the resource lock method should be used instead - the logic being that the LockProvider is configured under admin control and may be correctly configured for resource change events.

I expect the pull request is fine as written, we will need to carefully consider what order to apply event notification, resource.lock(), and code monitors or read/write locks to prevent deadlock in a clustered environment.

It would be good to highlight the use of a watcher like this, and see if it should migrate to direct ResourceListener use and if a Resource lock can be used. My rough idea is:

  1. resource lock when writing the change
  2. resource lock release when the event change is complete, followed by event notification to nodes in the cluster
  3. resource listeners (including any file watchers) are notified and read the contents. If they have an internal read/write lock on the resulting data structure it should not introduce deadlock

Key to the above being successful is ensuring that code such as dal_translate does not both hold a resource lock and their data structure read/write lock at the same time.

Let me know if the above makes sense and we can highlight it in the developers guide.


Jody Garnett