On Thu, May 28, 2020 at 5:10 PM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> Hi Olga,
>
> On Thu, 2020-05-28 at 16:42 -0400, Olga Kornievskaia wrote:
> > Hi folks,
> >
> > Looking for a recommendation on what the client is supposed to do
> > in the following situation. The client opens a file and has a
> > byte-range lock which returned a lock stateid. The client then
> > acquires another byte-range lock, using the returned lock stateid
> > for the 2nd lock. The server returns ADMIN_REVOKED.
> >
> > Currently the client goes into an infinite loop of just resending
> > the same LOCK operation with the same lock stateid.
> >
> > Is this a recoverable situation? Given that the lock state was
> > revoked, should this be an automatic EIO since the previous lock
> > is lost (so why bother going forward)? Or should the client retry
> > the lock but send it with the open stateid?
> >
> > Thank you.
>
> I think the right behaviour should be to just call
> nfs_inode_find_state_and_recover(). In principle that will end up
> either recovering the lock (if the user set the nfs.recover_lost_locks
> kernel parameter to 'true') or marking it as a lost lock, using
> NFS_LOCK_LOST.

Why should acquiring the 2nd lock depend on recovering the lock whose
stateid it was trying to use? I think the 1st stateid is lost and
unrecoverable, isn't it?

Right now what happens is that the code initiates recovery and an open
is sent. But the retry of the 2nd lock still has NFS_LOCK_INITIALIZED
set, so it keeps using the bad lock stateid (how about instead letting
it use the recovered open stateid?). How about doing the following
instead:

diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 2b7f6dc..7723bd7 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -147,6 +147,7 @@ struct nfs4_lock_state {
 	struct nfs4_state *	ls_state;	/* Pointer to open state */
 #define NFS_LOCK_INITIALIZED 0
 #define NFS_LOCK_LOST 1
+#define NFS_LOCK_REVOKED 2
 	unsigned long		ls_flags;
 	struct nfs_seqid_counter	ls_seqid;
 	nfs4_stateid		ls_stateid;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9056f3d..6769370 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -6792,7 +6792,8 @@ static void nfs4_lock_prepare(struct rpc_task *task, void
 	if (nfs_wait_on_sequence(data->arg.lock_seqid, task) != 0)
 		goto out_wait;
 	/* Do we need to do an open_to_lock_owner? */
-	if (!test_bit(NFS_LOCK_INITIALIZED, &data->lsp->ls_flags)) {
+	if (!test_bit(NFS_LOCK_INITIALIZED, &data->lsp->ls_flags) ||
+	    test_and_clear_bit(NFS_LOCK_REVOKED, &data->lsp->ls_flags)) {
 		if (nfs_wait_on_sequence(data->arg.open_seqid, task) != 0) {
 			goto out_release_lock_seqid;
 		}
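(Note for the diff above: something would still have to set the new
NFS_LOCK_REVOKED bit when the server revokes the stateid. Just as an
untested sketch, one possible place is the revoked/bad-stateid cases
in nfs4_handle_setlk_error(), next to the existing recovery call:)

	case -NFS4ERR_ADMIN_REVOKED:
	case -NFS4ERR_EXPIRED:
	case -NFS4ERR_BAD_STATEID:
		lsp->ls_seqid.flags &= ~NFS_SEQID_CONFIRMED;
		/* Sketch: flag the revoked lock stateid so that the LOCK
		 * retry in nfs4_lock_prepare() takes the open_to_lock_owner
		 * path and uses the recovered open stateid instead of the
		 * bad lock stateid. */
		set_bit(NFS_LOCK_REVOKED, &lsp->ls_flags);
		if (new_lock_owner != 0 ||
		    test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags) != 0)
			nfs4_schedule_stateid_recovery(server, lsp->ls_state);
		break;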
Alternatively, I have also thought about clearing NFS_LOCK_INITIALIZED
in nfs4_handle_setlk_error(). That leads to the 1st lock being marked
lost (with a message in the log), and then the 2nd lock can proceed:

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9056f3d..2c78464 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -6908,8 +6908,10 @@ static void nfs4_handle_setlk_error(struct nfs_server *server, struct nfs4_lock_
 	case -NFS4ERR_BAD_STATEID:
 		lsp->ls_seqid.flags &= ~NFS_SEQID_CONFIRMED;
 		if (new_lock_owner != 0 ||
-		    test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags) != 0)
+		    test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags) != 0) {
 			nfs4_schedule_stateid_recovery(server, lsp->ls_state);
+			clear_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags);
+		}
 		break;
 	case -NFS4ERR_STALE_STATEID:
 		lsp->ls_seqid.flags &= ~NFS_SEQID_CONFIRMED;

> Cheers
> Trond
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@xxxxxxxxxxxxxxx
>