Re: [PATCH 0/3] Fix RELEASE_LOCKOWNER

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Feb 2, 2024, at 7:26 PM, NeilBrown <neilb@xxxxxxx> wrote:
> 
> On Sat, 03 Feb 2024, Chuck Lever III wrote:
>> 
>> 
>>> On Feb 1, 2024, at 5:24 PM, NeilBrown <neilb@xxxxxxx> wrote:
>>> 
>>> On Fri, 02 Feb 2024, Chuck Lever wrote:
>>>> Passes pynfs, fstests, and the git regression suite. Please apply
>>>> these to origin/linux-5.4.y.
>>> 
>>> I should have mentioned this a day or two ago but I hadn't quite made
>>> all the connection yet...
>>> 
>>> The RELEASE_LOCKOWNER bug was masking a double-free bug that was fixed
>>> by
>>> Commit 47446d74f170 ("nfsd4: add refcount for nfsd4_blocked_lock")
>>> which landed in v5.17 and wasn't marked as a bugfix, and so has not gone to
>>> stable kernels.
>> 
>> Then, instructions to stable@xxxxxxxxxxxxxxx:
>> 
>> Do not apply the patches I just sent for 5.15, 5.10, and 5.4. I will
>> apply 47446d74f170, run the tests again, and resend.
>> 
>> 
>>> Any kernel earlier than v5.17 that receives the RELEASE_LOCKOWNER fix
>>> also needs the nfsd4_blocked_lock fix.  There is a minor follow-up fix
>>> for that nfsd4_blocked_lock fix which Chuck queued yesterday.
>>> 
>>> The problem scenario is that an nfsd4_lock() call finds a conflicting
>>> lock and so has a reference to a particular nfsd4_blocked_lock.  A concurrent
>>> nfsd4_read_lockowner call frees all the nfsd4_blocked_locks including
>>> the one held in nfsd4_lock().  nfsd4_lock then tries to free the
>>> blocked_lock it has, and results in a double-free or a use-after-free.
>>> 
>>> Before either patch is applied, the extra reference on the lock-owner
>>> than nfsd4_lock holds causes nfsd4_realease_lockowner() to incorrectly
>>> return an error and NOT free the blocks_lock.
>>> With only the RELEASE_LOCKOWNER fix applied, the double-free happens.
>> 
>> Our test suite currently does not exercise this use case, apparently.
>> I didn't see a problem like this during testing.
>> 
> 
> Our OpenQA testing found it (as did our customer :-).
> Quoting from a bugzilla that unfortunately is not public (though might
> not be accessible to anyone with an account)
> 
> https://bugzilla.suse.com/show_bug.cgi?id=1219349
> 
>     LTP test nfslock01.sh randomly fails on the latest SLE-15SP4 and
>     SLE-15SP5 KOTD.  The failures appear only when testing NFS protocol
>     v4.0, other versions do not seem to be affected.  The test either
>     gets stuck or sometimes triggers kernel oops.  The contents of the
>     kernel backtrace vary.  All archs appear to be affected. 
> 
> Does your test suite cover v4.0?

pynfs covers v4.0 and v4.1.
I ran fstests and the git regression suite with only NFSv4.0.


> Does it include LTP ?

kdevops doesn't include an LTP workflow at this time.


> Thanks,
> NeilBrown
> 
> 
>> 
>>> With both patches applied the refcount on the nfsd4_blocked_lock prevents
>>> the double-free.
>>> 
>>> Kernels before 4.9 are (probably) not affected as they didn't have
>>> find_or_allocate_block() which takes the second reference to a shared
>>> object.  But that is ancient history - those kernels are well past EOL.
>>> 
>>> Thanks,
>>> NeilBrown
>>> 
>>> 
>>>> 
>>>> ---
>>>> 
>>>> Chuck Lever (2):
>>>>     NFSD: Modernize nfsd4_release_lockowner()
>>>>     NFSD: Add documenting comment for nfsd4_release_lockowner()
>>>> 
>>>> NeilBrown (1):
>>>>     nfsd: fix RELEASE_LOCKOWNER
>>>> 
>>>> 
>>>> fs/nfsd/nfs4state.c | 65 +++++++++++++++++++++++++--------------------
>>>> 1 file changed, 36 insertions(+), 29 deletions(-)
>>>> 
>>>> --
>>>> Chuck Lever
>>>> 
>>>> 
>>>> 
>>> 
>> 
>> --
>> Chuck Lever


--
Chuck Lever






[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux