On 12/17/24 10:30 AM, Li Lingfeng wrote:
Hi,
after analysis, we think that this issue is not introduced by commit
2d4a532d385f ("nfsd: ensure that clp->cl_revoked list is protected by
clp->cl_lock") but by commit 83e733161fde ("nfsd: avoid race after
unhash_delegation_locked()").
Therefore, kernel versions earlier than 6.9 are not affected by this issue.
A more practical question is: has anyone reproduced the reported crash
on a pre-v6.9 kernel?
I recall (dimly) that we knew that 8dd91e8d31fe ("nfsd: fix race between
laundromat and free_stateid") could not be cleanly applied before v6.9.
It was less clear at the time whether a more extensive LTS backport
would be required.
// normal case 1 -- free deleg by delegreturn
1) OP_DELEGRETURN
nfsd4_delegreturn
nfsd4_lookup_stateid
destroy_delegation
destroy_unhashed_deleg
nfs4_unlock_deleg_lease
vfs_setlease // unlock
nfs4_put_stid // put last refcount
idr_remove // remove from cl_stateids
s->sc_free // free deleg
2) OP_FREE_STATEID
nfsd4_free_stateid
find_stateid_locked // can not find the deleg in cl_stateids
// normal case 2 -- free deleg by laundromat
nfs4_laundromat
state_expired
unhash_delegation_locked // set NFS4_REVOKED_DELEG_STID
list_add // add the deleg to reaplist
list_first_entry // get the deleg from reaplist
revoke_delegation
destroy_unhashed_deleg
nfs4_unlock_deleg_lease
nfs4_put_stid
// abnormal case
nfs4_laundromat
state_expired
unhash_delegation_locked // set NFS4_REVOKED_DELEG_STID
list_add // add the deleg to reaplist
1) OP_DELEGRETURN
nfsd4_delegreturn
nfsd4_lookup_stateid
nfsd4_stid_check_stateid_generation
nfsd4_verify_open_stid // check NFS4_REVOKED_DELEG_STID and return nfserr_deleg_revoked
// skip destroy_delegation
2) OP_FREE_STATEID
nfsd4_free_stateid // check NFS4_REVOKED_DELEG_STID
list_del_init // remove deleg from reaplist
nfs4_put_stid // free deleg
list_first_entry // back in nfs4_laundromat: can not get the deleg from reaplist
Before commit 83e733161fde ("nfsd: avoid race after
unhash_delegation_locked()"), nfs4_laundromat --> unhash_delegation_locked
would not set NFS4_REVOKED_DELEG_STID for the deleg.
So, on those kernels, the description "it marks the delegation stid
revoked" in the CVE fix patch does not hold, and the OP_FREE_STATEID
operation will not release the deleg.
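
To make the interleaving above easier to check, here is a minimal userspace
sketch of it. It is only a model under stated assumptions: the struct, its
fields, and every model_* function are hypothetical names invented for
illustration, not nfsd code, and the sets_revoked parameter stands for
whether unhash_delegation_locked() sets NFS4_REVOKED_DELEG_STID, as
described above for kernels with and without 83e733161fde.

```c
#include <stdbool.h>

/*
 * Userspace model only. The struct, fields, and function names below are
 * hypothetical and are not nfsd identifiers.
 */
struct model_deleg {
	bool revoked;     /* stands in for NFS4_REVOKED_DELEG_STID */
	bool on_reaplist; /* queued by the laundromat for revocation */
	bool lease_held;  /* file lease backing the delegation */
	bool freed;       /* stid memory has been released */
};

/*
 * Laundromat, step 1 (state lock held). sets_revoked selects the behaviour
 * described above: true with 83e733161fde applied, false without it.
 */
static void model_laundromat_mark(struct model_deleg *d, bool sets_revoked)
{
	if (sets_revoked)
		d->revoked = true;
	d->on_reaplist = true;
}

/* FREE_STATEID: only acts on a stid that is already marked revoked. */
static bool model_free_stateid(struct model_deleg *d)
{
	if (!d->revoked)
		return false;
	d->on_reaplist = false; /* list_del_init() */
	d->freed = true;        /* nfs4_put_stid() drops the last reference */
	return true;
}

/* Laundromat, step 2 (lock dropped): revokes only what is still queued. */
static void model_laundromat_reap(struct model_deleg *d)
{
	if (!d->on_reaplist)
		return;             /* entry vanished, lease is never unlocked */
	d->on_reaplist = false;
	d->lease_held = false;  /* nfs4_unlock_deleg_lease() */
}

/* A lease that outlives its stid is what a later open would trip over. */
static bool model_dangling_lease(const struct model_deleg *d)
{
	return d->lease_held && d->freed;
}
```

In this model, calling mark (sets_revoked = true), then free_stateid, then
reap leaves model_dangling_lease() true. With sets_revoked = false,
free_stateid declines to act and reap releases the lease as usual.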
Thanks.
On 2024/11/6 1:10, Greg Kroah-Hartman wrote:
Description
===========
In the Linux kernel, the following vulnerability has been resolved:
nfsd: fix race between laundromat and free_stateid
There is a race between laundromat handling of revoked delegations
and a client sending free_stateid operation. Laundromat thread
finds that delegation has expired and needs to be revoked so it
marks the delegation stid revoked and it puts it on a reaper list
but then it unlocks the state lock and the actual delegation revocation
happens without the lock. Once the stid is marked revoked a racing
free_stateid processing thread does the following (1) it calls
list_del_init() which removes it from the reaper list and (2) frees
the delegation stid structure. The laundromat thread ends up not
calling the revoke_delegation() function for this particular delegation
but that means it will not release the lock lease that exists on
the file.
Now, a new open for this file comes in and ends up finding that
lease list isn't empty and calls nfsd_breaker_owns_lease() which ends
up trying to dereference a freed delegation stateid. Leading to the
following use-after-free KASAN warning:
kernel: ==================================================================
kernel: BUG: KASAN: slab-use-after-free in nfsd_breaker_owns_lease+0x140/0x160 [nfsd]
kernel: Read of size 8 at addr ffff0000e73cd0c8 by task nfsd/6205
kernel:
kernel: CPU: 2 UID: 0 PID: 6205 Comm: nfsd Kdump: loaded Not tainted 6.11.0-rc7+ #9
kernel: Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 2069.0.0.0.0 08/03/2024
kernel: Call trace:
kernel: dump_backtrace+0x98/0x120
kernel: show_stack+0x1c/0x30
kernel: dump_stack_lvl+0x80/0xe8
kernel: print_address_description.constprop.0+0x84/0x390
kernel: print_report+0xa4/0x268
kernel: kasan_report+0xb4/0xf8
kernel: __asan_report_load8_noabort+0x1c/0x28
kernel: nfsd_breaker_owns_lease+0x140/0x160 [nfsd]
kernel: nfsd_file_do_acquire+0xb3c/0x11d0 [nfsd]
kernel: nfsd_file_acquire_opened+0x84/0x110 [nfsd]
kernel: nfs4_get_vfs_file+0x634/0x958 [nfsd]
kernel: nfsd4_process_open2+0xa40/0x1a40 [nfsd]
kernel: nfsd4_open+0xa08/0xe80 [nfsd]
kernel: nfsd4_proc_compound+0xb8c/0x2130 [nfsd]
kernel: nfsd_dispatch+0x22c/0x718 [nfsd]
kernel: svc_process_common+0x8e8/0x1960 [sunrpc]
kernel: svc_process+0x3d4/0x7e0 [sunrpc]
kernel: svc_handle_xprt+0x828/0xe10 [sunrpc]
kernel: svc_recv+0x2cc/0x6a8 [sunrpc]
kernel: nfsd+0x270/0x400 [nfsd]
kernel: kthread+0x288/0x310
kernel: ret_from_fork+0x10/0x20
This patch proposes a fix that's based on adding 2 new stid
sc_status values that help coordinate between the laundromat
and other operations (nfsd4_free_stateid() and nfsd4_delegreturn()).
First, to make sure that once the stid is marked revoked it is not
removed by nfsd4_free_stateid(), the laundromat takes a reference
on the stateid. Then, to coordinate whether the stid has been put
on the cl_revoked list or we are processing FREE_STATEID and need to
make sure to remove it from the list, each side checks that state and
acts accordingly. If laundromat has added it to the cl_revoked list before
the arrival of FREE_STATEID, then nfsd4_free_stateid() knows to remove
it from the list. If nfsd4_free_stateid() finds that the operation arrived
before laundromat has placed it on the cl_revoked list, it marks the state
freed and then laundromat will no longer add it to the list.
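
The coordination described in the paragraph above can be sketched in
userspace as follows. This is a rough model and not the patch itself: the
ST_* flags, struct model_stid, and the fix_* functions are hypothetical
names chosen for illustration, the refcount starts at 1 by assumption, and
all locking is omitted.

```c
#include <stdbool.h>

/* Hypothetical flag names; they only mirror the roles described above. */
enum {
	ST_REVOKED  = 1u << 0, /* laundromat has marked the stid revoked */
	ST_FREEABLE = 1u << 1, /* stid has been placed on the revoked list */
	ST_FREED    = 1u << 2, /* FREE_STATEID has already processed the stid */
};

struct model_stid {
	unsigned int status;
	int refcount;       /* assumed to start at 1 */
	bool on_cl_revoked; /* membership of the per-client revoked list */
	bool lease_held;
};

/* Laundromat, step 1: mark revoked and pin the stid with an extra reference. */
static void fix_laundromat_mark(struct model_stid *s)
{
	s->status |= ST_REVOKED;
	s->refcount++;
}

/* Laundromat, step 2: always release the lease, list only if not yet freed. */
static void fix_laundromat_revoke(struct model_stid *s)
{
	s->lease_held = false;
	if (!(s->status & ST_FREED)) {
		s->on_cl_revoked = true;
		s->status |= ST_FREEABLE;
	}
	s->refcount--; /* drop the pin taken in step 1 */
}

/* FREE_STATEID: unlink only if already listed, then record that it ran. */
static bool fix_free_stateid(struct model_stid *s)
{
	if (!(s->status & ST_REVOKED))
		return false;
	if (s->status & ST_FREEABLE)
		s->on_cl_revoked = false;
	s->status |= ST_FREED;
	s->refcount--;
	return true;
}
```

Whichever of fix_free_stateid() and fix_laundromat_revoke() runs first, the
model ends with the lease released, the stid off the revoked list, and the
refcount at zero only after both have run.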
Also, for nfsd4_delegreturn() when looking for the specified stid,
we need to access stids that are marked revoked or freeable. Such a mark
means the laundromat has started processing the stid but hasn't finished,
and this delegreturn needs to return nfserr_deleg_revoked and not
nfserr_bad_stateid. The latter will not trigger a FREE_STATEID and the
lack of it will leave this stid on the cl_revoked list indefinitely.
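
The choice of return status can be illustrated with a small sketch. It
assumes a lookup that takes a mask of tolerated status bits; the enum
values and model_delegreturn_lookup() are hypothetical and do not reproduce
the signature of nfsd4_lookup_stateid().

```c
/* Hypothetical result codes standing in for the nfserr_* values above. */
enum model_status { MODEL_OK, MODEL_BAD_STATEID, MODEL_DELEG_REVOKED };

/* Hypothetical status bits, named after the states mentioned above. */
enum { DR_REVOKED = 1u << 0, DR_FREEABLE = 1u << 1 };

/*
 * ok_mask lists the status bits the caller is willing to see. A stid that
 * carries any other bit is treated as not found (bad stateid). A tolerated
 * revoked stid is reported as revoked so the client follows up with
 * FREE_STATEID.
 */
static enum model_status
model_delegreturn_lookup(unsigned int stid_status, unsigned int ok_mask)
{
	if (stid_status & ~ok_mask)
		return MODEL_BAD_STATEID;
	if (stid_status & DR_REVOKED)
		return MODEL_DELEG_REVOKED;
	return MODEL_OK;
}
```

With an empty mask, a stid the laundromat is still processing yields
MODEL_BAD_STATEID. Passing DR_REVOKED | DR_FREEABLE yields
MODEL_DELEG_REVOKED instead.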
The Linux kernel CVE team has assigned CVE-2024-50106 to this issue.
Affected and fixed versions
===========================
Issue introduced in 3.17 with commit 2d4a532d385f and fixed in
6.11.6 with commit 967faa26f313
Issue introduced in 3.17 with commit 2d4a532d385f and fixed in
6.12-rc5 with commit 8dd91e8d31fe
Please see https://www.kernel.org for a full list of currently supported
kernel versions by the kernel community.
Unaffected versions might change over time as fixes are backported to
older supported kernel versions. The official CVE entry at
https://cve.org/CVERecord/?id=CVE-2024-50106
will be updated if fixes are backported, please check that for the most
up to date information about this issue.
Affected files
==============
The file(s) affected by this issue are:
fs/nfsd/nfs4state.c
fs/nfsd/state.h
Mitigation
==========
The Linux kernel CVE team recommends that you update to the latest
stable kernel version for this, and many other bugfixes. Individual
changes are never tested alone, but rather are part of a larger kernel
release. Cherry-picking individual commits is not recommended or
supported by the Linux kernel community at all. If however, updating to
the latest release is impossible, the individual changes to resolve this
issue can be found at these commits:
https://git.kernel.org/stable/c/967faa26f313a62e7bebc55d5b8122eaee43b929
https://git.kernel.org/stable/c/8dd91e8d31febf4d9cca3ae1bb4771d33ae7ee5a
--
Chuck Lever