On Thu, May 28, 2020 at 5:10 PM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> Hi Olga,
>
> On Thu, 2020-05-28 at 16:42 -0400, Olga Kornievskaia wrote:
> > Hi folks,
> >
> > Looking for a recommendation on what the client is supposed to do
> > in the following situation. The client opens a file and has a
> > byte-range lock which returned a lock stateid. The client then
> > acquires another byte-range lock, using the returned lock stateid
> > for the 2nd lock. The server returns ADMIN_REVOKED.
> >
> > Currently the client goes into an infinite loop of just resending
> > the same LOCK operation with the same lock stateid.
> >
> > Is this a recoverable situation? Given that the lock state was
> > revoked, should this be an automatic EIO since the previous lock
> > is lost (so why bother going forward)? Or should the client retry
> > the lock but send it with the open stateid?
> >
> > Thank you.
>
> I think the right behaviour should be to just call
> nfs_inode_find_state_and_recover(). In principle that will end up
> either recovering the lock (if the user set the nfs.recover_lost_locks
> kernel parameter to 'true') or marking it as a lost lock, using
> NFS_LOCK_LOST.

Why should acquiring the 2nd lock depend on recovering the lock whose
stateid it was trying to use? I think the 1st stateid is lost and
unrecoverable, isn't it?

Right now what happens is that the code initiates recovery and an open
is sent. But the retry of the 2nd lock still has NFS_LOCK_INITIALIZED
set, so it keeps using the bad lock stateid (how about instead letting
it use the recovered open stateid?). How about doing the following
instead:

diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 2b7f6dc..7723bd7 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -147,6 +147,7 @@ struct nfs4_lock_state {
 	struct nfs4_state *	ls_state;	/* Pointer to open state */
 #define NFS_LOCK_INITIALIZED 0
 #define NFS_LOCK_LOST 1
+#define NFS_LOCK_REVOKED 2
 	unsigned long		ls_flags;
 	struct nfs_seqid_counter	ls_seqid;
 	nfs4_stateid		ls_stateid;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9056f3d..6769370 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -6792,7 +6792,8 @@ static void nfs4_lock_prepare(struct rpc_task *task, void
 	if (nfs_wait_on_sequence(data->arg.lock_seqid, task) != 0)
 		goto out_wait;
 	/* Do we need to do an open_to_lock_owner? */
-	if (!test_bit(NFS_LOCK_INITIALIZED, &data->lsp->ls_flags)) {
+	if (!test_bit(NFS_LOCK_INITIALIZED, &data->lsp->ls_flags) ||
+	    test_and_clear_bit(NFS_LOCK_REVOKED, &data->lsp->ls_flags)) {
 		if (nfs_wait_on_sequence(data->arg.open_seqid, task) != 0) {
 			goto out_release_lock_seqid;
 		}
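(Note for the diff above: something would still have to set the new
NFS_LOCK_REVOKED bit when the server revokes the stateid. Just as an
untested sketch, one possible place is the revoked/bad-stateid cases
in nfs4_handle_setlk_error(), next to the existing recovery call:)

	case -NFS4ERR_ADMIN_REVOKED:
	case -NFS4ERR_EXPIRED:
	case -NFS4ERR_BAD_STATEID:
		lsp->ls_seqid.flags &= ~NFS_SEQID_CONFIRMED;
		/* Sketch: flag the revoked lock stateid so that the LOCK
		 * retry in nfs4_lock_prepare() takes the open_to_lock_owner
		 * path and uses the recovered open stateid instead of the
		 * bad lock stateid. */
		set_bit(NFS_LOCK_REVOKED, &lsp->ls_flags);
		if (new_lock_owner != 0 ||
		    test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags) != 0)
			nfs4_schedule_stateid_recovery(server, lsp->ls_state);
		break;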
Alternatively, I have also thought about clearing NFS_LOCK_INITIALIZED
in nfs4_handle_setlk_error(). That leads to the 1st lock being marked
lost (with a message in the log), and then the 2nd lock can proceed:

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9056f3d..2c78464 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -6908,8 +6908,10 @@ static void nfs4_handle_setlk_error(struct nfs_server *server, struct nfs4_lock_
 	case -NFS4ERR_BAD_STATEID:
 		lsp->ls_seqid.flags &= ~NFS_SEQID_CONFIRMED;
 		if (new_lock_owner != 0 ||
-		    test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags) != 0)
+		    test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags) != 0) {
 			nfs4_schedule_stateid_recovery(server, lsp->ls_state);
+			clear_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags);
+		}
 		break;
 	case -NFS4ERR_STALE_STATEID:
 		lsp->ls_seqid.flags &= ~NFS_SEQID_CONFIRMED;

> Cheers
> Trond
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@xxxxxxxxxxxxxxx
>