Re: Patch "nfsd: fix RELEASE_LOCKOWNER" has been added to the 6.1-stable tree

Chuck Lever <chuck.lever@xxxxxxxxxx> · Sun, 28 Jan 2024 19:38:47 -0500

On Sun, Jan 28, 2024 at 07:21:26PM -0500, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
> 
>     nfsd: fix RELEASE_LOCKOWNER
> 
> to the 6.1-stable tree which can be found at:
>     http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
> 
> The filename of the patch is:
>      nfsd-fix-release_lockowner.patch
> and it can be found in the queue-6.1 subdirectory.
> 
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable@xxxxxxxxxxxxxxx> know about it.

Hello Sasha -

I already asked Greg not to apply this patch to kernels
before v6.2.

See below:

>  Cc: stable@xxxxxxxxxxxxxxx # v6.2+

Was intended to block backports of edcf9725 to kernels before
v6.2 so that we could thoroughly test the work first.

I've seen this kind of annotation used before, but apparently
it is not effective. What is the correct way to label a patch
so that it will be backported only to certain kernels?

> commit 2073110fe8693c6bc6a3cfc5f198fe444b5c2397
> Author: NeilBrown <neilb@xxxxxxx>
> Date:   Mon Jan 22 14:58:16 2024 +1100
> 
>     nfsd: fix RELEASE_LOCKOWNER
>     
>     [ Upstream commit edcf9725150e42beeca42d085149f4c88fa97afd ]
>     
>     The test on so_count in nfsd4_release_lockowner() is nonsense and
>     harmful.  Revert to using check_for_locks(), changing that to not sleep.
>     
>     First: harmful.
>     As is documented in the kdoc comment for nfsd4_release_lockowner(), the
>     test on so_count can transiently return a false positive resulting in a
>     return of NFS4ERR_LOCKS_HELD when in fact no locks are held.  This is
>     clearly a protocol violation and with the Linux NFS client it can cause
>     incorrect behaviour.
>     
>     If RELEASE_LOCKOWNER is sent while some other thread is still
>     processing a LOCK request which failed because, at the time that request
>     was received, the given owner held a conflicting lock, then the nfsd
>     thread processing that LOCK request can hold a reference (conflock) to
>     the lock owner that causes nfsd4_release_lockowner() to return an
>     incorrect error.
>     
>     The Linux NFS client ignores that NFS4ERR_LOCKS_HELD error because it
>     never sends NFS4_RELEASE_LOCKOWNER without first releasing any locks, so
>     it knows that the error is impossible.  It assumes the lock owner was in
>     fact released so it feels free to use the same lock owner identifier in
>     some later locking request.
>     
>     When it does reuse a lock owner identifier for which a previous RELEASE
>     failed, it will naturally use a lock_seqid of zero.  However the server,
>     which didn't release the lock owner, will expect a larger lock_seqid and
>     so will respond with NFS4ERR_BAD_SEQID.
>     
>     So clearly it is harmful to allow a false positive, which testing
>     so_count allows.
>     
>     The test is nonsense because ... well... it doesn't mean anything.
>     
>     so_count is the sum of three different counts.
>     1/ the set of states listed on so_stateids
>     2/ the set of active vfs locks owned by any of those states
>     3/ various transient counts such as for conflicting locks.
>     
>     When it is tested against '2' it is clear that one of these is the
>     transient reference obtained by find_lockowner_str_locked().  It is not
>     clear what the other one is expected to be.
>     
>     In practice, the count is often 2 because there is precisely one state
>     on so_stateids.  If there were more, this would fail.
>     
>     In my testing I see two circumstances when RELEASE_LOCKOWNER is called.
>     In one case, CLOSE is called before RELEASE_LOCKOWNER.  That results in
>     all the lock states being removed, and so the lockowner being discarded
>     (it is removed when there are no more references which usually happens
>     when the lock state is discarded).  When nfsd4_release_lockowner() finds
>     that the lock owner doesn't exist, it returns success.
>     
>     The other case shows an so_count of '2' and precisely one state listed
>     in so_stateid.  It appears that the Linux client uses a separate lock
>     owner for each file resulting in one lock state per lock owner, so this
>     test on '2' is safe.  For another client it might not be safe.
>     
>     So this patch changes check_for_locks() to use the (newish)
>     find_any_file_locked() so that it doesn't take a reference on the
>     nfs4_file and so never calls nfsd_file_put(), and so never sleeps.  With
>     this check is it safe to restore the use of check_for_locks() rather
>     than testing so_count against the mysterious '2'.
>     
>     Fixes: ce3c4ad7f4ce ("NFSD: Fix possible sleep during nfsd4_release_lockowner()")
>     Signed-off-by: NeilBrown <neilb@xxxxxxx>
>     Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
>     Cc: stable@xxxxxxxxxxxxxxx # v6.2+
>     Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
>     Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index faecdbfa01a2..0443fe4e29e1 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -7736,14 +7736,16 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
>  {
>  	struct file_lock *fl;
>  	int status = false;
> -	struct nfsd_file *nf = find_any_file(fp);
> +	struct nfsd_file *nf;
>  	struct inode *inode;
>  	struct file_lock_context *flctx;
>  
> +	spin_lock(&fp->fi_lock);
> +	nf = find_any_file_locked(fp);
>  	if (!nf) {
>  		/* Any valid lock stateid should have some sort of access */
>  		WARN_ON_ONCE(1);
> -		return status;
> +		goto out;
>  	}
>  
>  	inode = locks_inode(nf->nf_file);
> @@ -7759,7 +7761,8 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
>  		}
>  		spin_unlock(&flctx->flc_lock);
>  	}
> -	nfsd_file_put(nf);
> +out:
> +	spin_unlock(&fp->fi_lock);
>  	return status;
>  }
>  
> @@ -7769,10 +7772,8 @@ check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
>   * @cstate: NFSv4 COMPOUND state
>   * @u: RELEASE_LOCKOWNER arguments
>   *
> - * The lockowner's so_count is bumped when a lock record is added
> - * or when copying a conflicting lock. The latter case is brief,
> - * but can lead to fleeting false positives when looking for
> - * locks-in-use.
> + * Check if theree are any locks still held and if not - free the lockowner
> + * and any lock state that is owned.
>   *
>   * Return values:
>   *   %nfs_ok: lockowner released or not found
> @@ -7808,10 +7809,13 @@ nfsd4_release_lockowner(struct svc_rqst *rqstp,
>  		spin_unlock(&clp->cl_lock);
>  		return nfs_ok;
>  	}
> -	if (atomic_read(&lo->lo_owner.so_count) != 2) {
> -		spin_unlock(&clp->cl_lock);
> -		nfs4_put_stateowner(&lo->lo_owner);
> -		return nfserr_locks_held;
> +
> +	list_for_each_entry(stp, &lo->lo_owner.so_stateids, st_perstateowner) {
> +		if (check_for_locks(stp->st_stid.sc_file, lo)) {
> +			spin_unlock(&clp->cl_lock);
> +			nfs4_put_stateowner(&lo->lo_owner);
> +			return nfserr_locks_held;
> +		}
>  	}
>  	unhash_lockowner_locked(lo);
>  	while (!list_empty(&lo->lo_owner.so_stateids)) {

-- 
Chuck Lever