Re: [PATCH 1/1] nfsd: fix race between laundromat and free_stateid

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 11, 2024 at 12:17 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> On Fri, 2024-10-11 at 10:39 -0400, Chuck Lever wrote:
> > On Thu, Oct 10, 2024 at 06:18:01PM -0400, Olga Kornievskaia wrote:
> > > There is a race between laundromat handling of revoked delegations
> > > and a client sending free_stateid operation. Laundromat thread
> > > finds that delegation has expired and needs to be revoked so it
> > > marks the delegation stid revoked and it puts it on a reaper list
> > > but then it unlock the state lock and the actual delegation revocation
> > > happens without the lock. Once the stid is marked revoked a racing
> > > free_stateid processing thread does the following (1) it calls
> > > list_del_init() which removes it from the reaper list and (2) frees
> > > the delegation stid structure. The laundromat thread ends up not
> > > calling the revoke_delegation() function for this particular delegation
> > > but that means it will no release the lock lease that exists on
> > > the file.
> > >
> > > Now, a new open for this file comes in and ends up finding that
> > > lease list isn't empty and calls nfsd_breaker_owns_lease() which ends
> > > up trying to derefence a freed delegation stateid. Leading to the
> > > followint use-after-free KASAN warning:
> > >
> > > kernel: ==================================================================
> > > kernel: BUG: KASAN: slab-use-after-free in nfsd_breaker_owns_lease+0x140/0x160 [nfsd]
> > > kernel: Read of size 8 at addr ffff0000e73cd0c8 by task nfsd/6205
> > > kernel:
> > > kernel: CPU: 2 UID: 0 PID: 6205 Comm: nfsd Kdump: loaded Not tainted 6.11.0-rc7+ #9
> > > kernel: Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 2069.0.0.0.0 08/03/2024
> > > kernel: Call trace:
> > > kernel: dump_backtrace+0x98/0x120
> > > kernel: show_stack+0x1c/0x30
> > > kernel: dump_stack_lvl+0x80/0xe8
> > > kernel: print_address_description.constprop.0+0x84/0x390
> > > kernel: print_report+0xa4/0x268
> > > kernel: kasan_report+0xb4/0xf8
> > > kernel: __asan_report_load8_noabort+0x1c/0x28
> > > kernel: nfsd_breaker_owns_lease+0x140/0x160 [nfsd]
> > > kernel: leases_conflict+0x68/0x370
> > > kernel: __break_lease+0x204/0xc38
> > > kernel: nfsd_open_break_lease+0x8c/0xf0 [nfsd]
> > > kernel: nfsd_file_do_acquire+0xb3c/0x11d0 [nfsd]
> > > kernel: nfsd_file_acquire_opened+0x84/0x110 [nfsd]
> > > kernel: nfs4_get_vfs_file+0x634/0x958 [nfsd]
> > > kernel: nfsd4_process_open2+0xa40/0x1a40 [nfsd]
> > > kernel: nfsd4_open+0xa08/0xe80 [nfsd]
> > > kernel: nfsd4_proc_compound+0xb8c/0x2130 [nfsd]
> > > kernel: nfsd_dispatch+0x22c/0x718 [nfsd]
> > > kernel: svc_process_common+0x8e8/0x1960 [sunrpc]
> > > kernel: svc_process+0x3d4/0x7e0 [sunrpc]
> > > kernel: svc_handle_xprt+0x828/0xe10 [sunrpc]
> > > kernel: svc_recv+0x2cc/0x6a8 [sunrpc]
> > > kernel: nfsd+0x270/0x400 [nfsd]
> > > kernel: kthread+0x288/0x310
> > > kernel: ret_from_fork+0x10/0x20
> > >
> > > Proposing to have laundromat thread hold the state_lock over both
> > > marking thru revoking the delegation as well as making free_stateid
> > > acquire state_lock before accessing the list. Making sure that
> > > revoke_delegation() (ie kernel_setlease(unlock)) is called for
> > > every delegation that was revoked and added to the reaper list.
> > >
> > > CC: stable@xxxxxxxxxxxxxxx
> > > Signed-off-by: Olga Kornievskaia <okorniev@xxxxxxxxxx>
> > >
> > > --- I can't figure out the Fixes: tag. Laundromat's behaviour has
> > > been like that forever. But the free_stateid bits wont apply before
> > > the 1e3577a4521e ("SUNRPC: discard sv_refcnt, and svc_get/svc_put").
> > > But we used that fixes tag already with a previous fix for a different
> > > problem.
> > > ---
> > >  fs/nfsd/nfs4state.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 9c2b1d251ab3..c97907d7fb38 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -6605,13 +6605,13 @@ nfs4_laundromat(struct nfsd_net *nn)
> > >             unhash_delegation_locked(dp, SC_STATUS_REVOKED);
> > >             list_add(&dp->dl_recall_lru, &reaplist);
> > >     }
> > > -   spin_unlock(&state_lock);
> > >     while (!list_empty(&reaplist)) {
> > >             dp = list_first_entry(&reaplist, struct nfs4_delegation,
> > >                                     dl_recall_lru);
> > >             list_del_init(&dp->dl_recall_lru);
> > >             revoke_delegation(dp);
> > >     }
> > > +   spin_unlock(&state_lock);
> >
> > Code review suggests revoke_delegation() (and in particular,
> > destroy_unhashed_deleg(), must not be called while holding
> > state_lock().
> >
>
> We'd be calling nfs4_unlock_deleg_lease with a spinlock held, which is
> sort of gross.
>
> That said, I don't love this fix either.
>
> >
> > >     spin_lock(&nn->client_lock);
> > >     while (!list_empty(&nn->close_lru)) {
> > > @@ -7213,7 +7213,9 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > >             if (s->sc_status & SC_STATUS_REVOKED) {
> > >                     spin_unlock(&s->sc_lock);
> > >                     dp = delegstateid(s);
> > > +                   spin_lock(&state_lock);
> > >                     list_del_init(&dp->dl_recall_lru);
> > > +                   spin_unlock(&state_lock);
> >
> > Existing code is inconsistent about how manipulation of
> > dl_recall_lru is protected. Most instances do use state_lock for
> > this purpose, but a few, including this one, use cl->cl_lock. Does
> > the other instance using cl_lock need review and correction as well?
> >
> > I'd prefer to see this fix make the protection of dl_recall_lru
> > consistent everywhere.
> >
>
> Agreed. The locking around the delegation handling has been a mess for
> a long time. I'd really like to have a way to fix this that didn't
> require having to rework all of this code however.
>
> How about something like this patch instead? Olga, thoughts?

I think this patch will prevent the UAF but it doesn't work for
another reason (tested it too). As the free_stateid operation can come
in before the freeable flag is set (and so the nfsd4_free_stateid
function would not do anything). But it needs to remove this
delegation from cl_revoked which the laundromat puts it on as a part
of revoked_delegation() otherwise the server never clears
recallable_state_revoked. And I think this put_stid() that
free_stateid does is also needed. So what happens is free_stateid
comes and goes and the sequence flag is set and is never cleared.

Laundromat threat when it starts revocation process it either needs to
'finish it' but it needs to make sure that if free_stateid arrives in
the meanwhile it has to wait but still run. Or I was thinking that
perhaps, we can make free_stateid act like delegreturn. but I wasn't
sure if free_stateid is allowed to act like delegreturn. but this also
fixes the problem if the free_stateid arrives and takes the work away
from the laundromat thread then free_stateid finishes the return just
like a delegreturn (which unlocks the lease). But I'm unclear if there
isn't any races between revoke_delegation and destroy_delegation.

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 56b261608af4..1ef6933b1ccb 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -7159,6 +7159,7 @@ nfsd4_free_stateid(struct svc_rqst *rqstp,
struct nfsd4_compound_state *cstate,
                        dp = delegstateid(s);
                        list_del_init(&dp->dl_recall_lru);
                        spin_unlock(&cl->cl_lock);
+                       destroy_delegation(dp);
                        nfs4_put_stid(s);
                        ret = nfs_ok;
                        goto out;



> [PATCH] nfsd: add new SC_STATUS_FREEABLE to prevent race with  FREE_STATEID
>
> Olga identified a race between the laundromat and FREE_STATEID handling.
> The crux of the problem is that free_stateid can proceed while the
> laundromat is still processing the revocation.
>
> Add a new SC_STATUS_FREEABLE flag that is set once the revocation is
> complete. Have nfsd4_free_stateid only consider delegations that have
> this flag set.
>
> Reported-by: Olga Kornievskaia <okorniev@xxxxxxxxxx>
> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> ---
>  fs/nfsd/nfs4state.c | 3 ++-
>  fs/nfsd/state.h     | 1 +
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 73c4b983c048..b71a2cc7f2dd 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1371,6 +1371,7 @@ static void revoke_delegation(struct nfs4_delegation *dp)
>                 spin_lock(&clp->cl_lock);
>                 refcount_inc(&dp->dl_stid.sc_count);
>                 list_add(&dp->dl_recall_lru, &clp->cl_revoked);
> +               dp->dl_stid.sc_status |= SC_STATUS_FREEABLE;
>                 spin_unlock(&clp->cl_lock);
>         }
>         destroy_unhashed_deleg(dp);
> @@ -7207,7 +7208,7 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>         spin_lock(&s->sc_lock);
>         switch (s->sc_type) {
>         case SC_TYPE_DELEG:
> -               if (s->sc_status & SC_STATUS_REVOKED) {
> +               if (s->sc_status & SC_STATUS_FREEABLE) {
>                         spin_unlock(&s->sc_lock);
>                         dp = delegstateid(s);
>                         list_del_init(&dp->dl_recall_lru);
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index 874fcab2b183..4f3b941b09d3 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -114,6 +114,7 @@ struct nfs4_stid {
>  /* For a deleg stateid kept around only to process free_stateid's: */
>  #define SC_STATUS_REVOKED      BIT(1)
>  #define SC_STATUS_ADMIN_REVOKED        BIT(2)
> +#define SC_STATUS_FREEABLE     BIT(3)
>         unsigned short          sc_status;
>
>         struct list_head        sc_cp_list;
> --
> 2.47.0
>






[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux