On Tue, 23 Jan 2024, Chuck Lever wrote:
> On Mon, Jan 22, 2024 at 02:58:16PM +1100, NeilBrown wrote:
> >
> > The test on so_count in nfsd4_release_lockowner() is nonsense and
> > harmful.  Revert to using check_for_locks(), changing that to not sleep.
> >
> > First: harmful.
> > As is documented in the kdoc comment for nfsd4_release_lockowner(), the
> > test on so_count can transiently return a false positive resulting in a
> > return of NFS4ERR_LOCKS_HELD when in fact no locks are held.  This is
> > clearly a protocol violation and with the Linux NFS client it can cause
> > incorrect behaviour.
> >
> > If NFS4_RELEASE_LOCKOWNER is sent while some other thread is still
> > processing a LOCK request which failed because, at the time that request
> > was received, the given owner held a conflicting lock, then the nfsd
> > thread processing that LOCK request can hold a reference (conflock) to
> > the lock owner that causes nfsd4_release_lockowner() to return an
> > incorrect error.
> >
> > The Linux NFS client ignores that NFS4ERR_LOCKS_HELD error because it
> > never sends NFS4_RELEASE_LOCKOWNER without first releasing any locks, so
> > it knows that the error is impossible.  It assumes the lock owner was in
> > fact released so it feels free to use the same lock owner identifier in
> > some later locking request.
> >
> > When it does reuse a lock owner identifier for which a previous RELEASE
> > failed, it will naturally use a lock_seqid of zero.  However the server,
> > which didn't release the lock owner, will expect a larger lock_seqid and
> > so will respond with NFS4ERR_BAD_SEQID.
> >
> > So clearly it is harmful to allow a false positive, which testing
> > so_count allows.
> >
> > The test is nonsense because ... well... it doesn't mean anything.
> >
> > so_count is the sum of three different counts.
> > 1/ the set of states listed on so_stateids
> > 2/ the set of active vfs locks owned by any of those states
> > 3/ various transient counts such as for conflicting locks.
> >
> > When it is tested against '2' it is clear that one of these is the
> > transient reference obtained by find_lockowner_str_locked().  It is not
> > clear what the other one is expected to be.
> >
> > In practice, the count is often 2 because there is precisely one state
> > on so_stateids.  If there were more, this would fail.
> >
> > In my testing I see two circumstances when RELEASE_LOCKOWNER is called.
> > In one case, CLOSE is called before RELEASE_LOCKOWNER.  That results in
> > all the lock states being removed, and so the lockowner being discarded
> > (it is removed when there are no more references, which usually happens
> > when the lock state is discarded).  When nfsd4_release_lockowner() finds
> > that the lock owner doesn't exist, it returns success.
> >
> > The other case shows an so_count of '2' and precisely one state listed
> > in so_stateids.  It appears that the Linux client uses a separate lock
> > owner for each file, resulting in one lock state per lock owner, so this
> > test on '2' is safe.  For another client it might not be safe.
> >
> > So this patch changes check_for_locks() to use the (newish)
> > find_any_file_locked() so that it doesn't take a reference on the
> > nfs4_file and so never calls nfsd_file_put(), and so never sleeps.

> More to the point, find_any_file_locked() was added by commit
> e0aa651068bf ("nfsd: don't call nfsd_file_put from client states
> seqfile display"), which was merged several months /after/ commit
> ce3c4ad7f4ce ("NFSD: Fix possible sleep during
> nfsd4_release_lockowner()").

Yes.  To flesh out the history:
nfsd_file_put() was added in v5.4.  In earlier kernels check_for_locks()
would never sleep.  However the problem patch was backported to 4.9,
4.14, and 4.19, and should be reverted there.
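For anyone following along without the patch in front of them, the shape of
the reworked check_for_locks() is roughly as below.  This is a sketch
reconstructed from the description above, not the literal patch text, and it
assumes the v6.2+ helpers (find_any_file_locked(), locks_inode_context());
field and helper names may differ slightly in any given stable tree:

```c
static bool
check_for_locks(struct nfs4_file *fp, struct nfs4_lockowner *lowner)
{
	struct file_lock *fl;
	struct nfsd_file *nf;
	struct file_lock_context *flctx;
	bool status = false;

	/* Hold fi_lock across the whole walk.  find_any_file_locked()
	 * does not take a reference on the nfsd_file, so there is no
	 * nfsd_file_put() on the way out and nothing here can sleep.
	 */
	spin_lock(&fp->fi_lock);
	nf = find_any_file_locked(fp);
	if (nf) {
		flctx = locks_inode_context(file_inode(nf->nf_file));
		if (flctx && !list_empty_careful(&flctx->flc_posix)) {
			/* flc_lock nests inside fi_lock (and cl_lock);
			 * no nfsd lock is ever taken under flc_lock,
			 * so this ordering cannot deadlock. */
			spin_lock(&flctx->flc_lock);
			list_for_each_entry(fl, &flctx->flc_posix, fl_list) {
				if (fl->fl_owner == (fl_owner_t)lowner) {
					status = true;
					break;
				}
			}
			spin_unlock(&flctx->flc_lock);
		}
	}
	spin_unlock(&fp->fi_lock);
	return status;
}
```

The point of the sketch is the structure: because no reference is taken and
no put is needed, the whole check runs under spinlocks and RELEASE_LOCKOWNER
can test the actual vfs lock list instead of guessing from so_count.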
find_any_file_locked() was added in v6.2, so when this patch is
backported to 5.4, 5.10, 5.15, 5.17 - 6.1 it needs to include
find_any_file_locked().
The patch should apply unchanged to stable kernels 6.2 and later.

> Not having to deal with nfsd_file_put() in check_for_locks is a Good
> Thing. :-)

Makes me wonder if there is anywhere else where we don't want
nfsd_file_put() ... but I cannot find any obvious candidates for change.

> Am I correct in observing that the new check_for_locks() is the only
> place where flc_lock and fi_lock are held concurrently?

That is what I see - yes.  fi_lock is taken inside cl_lock elsewhere, and
we preserve that ordering in this patch.  I cannot see that any nfsd locks
are taken when flc_lock is held, so it is safe to take it while fi_lock
and cl_lock are held.

Thanks,
NeilBrown