Re: [PATCH] nfsd: validate the nfsd_serv pointer before calling svc_wake_up

On Mon, 27 Jan 2025, Jeff Layton wrote:
> On Mon, 2025-01-27 at 08:53 +1100, NeilBrown wrote:
> > On Sun, 26 Jan 2025, Jeff Layton wrote:
> > > On Sun, 2025-01-26 at 13:39 +1100, NeilBrown wrote:
> > > > On Sun, 26 Jan 2025, Jeff Layton wrote:
> > > > > nfsd_file_dispose_list_delayed can be called from the filecache
> > > > > laundrette, which is shut down after the nfsd threads are shut down and
> > > > > the nfsd_serv pointer is cleared. If nn->nfsd_serv is NULL then there
> > > > > are no threads to wake.
> > > > > 
> > > > > Ensure that the nn->nfsd_serv pointer is non-NULL before calling
> > > > > svc_wake_up in nfsd_file_dispose_list_delayed. This is safe since the
> > > > > svc_serv is not freed until after the filecache laundrette is cancelled.
> > > > > 
> > > > > Fixes: ffb402596147 ("nfsd: Don't leave work of closing files to a work queue")
> > > > > Reported-by: Salvatore Bonaccorso <carnil@xxxxxxxxxx>
> > > > > Closes: https://lore.kernel.org/linux-nfs/7d9f2a8aede4f7ca9935a47e1d405643220d7946.camel@xxxxxxxxxx/
> > > > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > > > > ---
> > > > > This is only lightly tested, but I think it will fix the bug that
> > > > > Salvatore reported.
> > > > > ---
> > > > >  fs/nfsd/filecache.c | 11 ++++++++++-
> > > > >  1 file changed, 10 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> > > > > index e91c164b5ea21507659904690533a19ca43b1b64..fb2a4469b7a3c077de2dd750f43239b4af6d37b0 100644
> > > > > --- a/fs/nfsd/filecache.c
> > > > > +++ b/fs/nfsd/filecache.c
> > > > > @@ -445,11 +445,20 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
> > > > >  						struct nfsd_file, nf_gc);
> > > > >  		struct nfsd_net *nn = net_generic(nf->nf_net, nfsd_net_id);
> > > > >  		struct nfsd_fcache_disposal *l = nn->fcache_disposal;
> > > > > +		struct svc_serv *serv;
> > > > >  
> > > > >  		spin_lock(&l->lock);
> > > > >  		list_move_tail(&nf->nf_gc, &l->freeme);
> > > > >  		spin_unlock(&l->lock);
> > > > > -		svc_wake_up(nn->nfsd_serv);
> > > > > +
> > > > > +		/*
> > > > > +		 * The filecache laundrette is shut down after the
> > > > > +		 * nn->nfsd_serv pointer is cleared, but before the
> > > > > +		 * svc_serv is freed.
> > > > > +		 */
> > > > > +		serv = nn->nfsd_serv;
> > > > 
> > > > I wonder if this should be READ_ONCE() to tell the compiler that we
> > > > could race with clearing nn->nfsd_serv.  Would the comment still be
> > > > needed?
> > > > 
> > > 
> > > I think we need a comment at least. The linkage between the laundrette
> > > and the nfsd_serv being set to NULL is very subtle. A READ_ONCE()
> > > doesn't convey that well, and is unnecessary here.
> > 
> > Why do you say "is unnecessary here" ?
> > If the code were
> >    if (nn->nfsd_serv)
> >             svc_wake_up(nn->nfsd_serv);
> > that would be wrong as nn->nfsd_serv could be set to NULL between the
> > two.
> > And the C compiler is allowed to load the value twice because the C memory
> > model declares that would have the same effect.
> > While I doubt it would actually change how the code is compiled, I think
> > we should have READ_ONCE() here (and I've been wrong before about what
> > the compiler will actually do).
> > 
> > 
> 
> It's unnecessary because the outcome of either case is acceptable.
> 
> When racing with shutdown, either it's NULL and the laundrette won't
> call svc_wake_up(), or it's non-NULL and it will. In the non-NULL case,
> the call to svc_wake_up() will be a no-op because the threads are shut
> down.
> 
> The vastly common case in this code is that this pointer will be non-
> NULL, because the server is running (i.e. not racing with shutdown). I
> don't see the need in making all of those accesses volatile.

One of us is confused.  I hope it isn't me.

The hypothetical problem I see is that the C compiler could generate
code to load the value "nn->nfsd_serv" twice.  The first time it is not
NULL, the second time it is NULL.
The first is used for the test, the second is passed to svc_wake_up().
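
To make that concrete, here is a small userspace model of the two-load
code the compiler would be permitted to emit. It is only a sketch: the
names (serv_model, net_model, racy_wake, clear_serv) are made up for
illustration and are not the nfsd code, and the "between" callback is
a stand-in for a concurrent shutdown clearing the pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for svc_serv and nfsd_net; not the kernel types. */
struct serv_model { int wakeups; };
struct net_model { struct serv_model *serv; };

/*
 * Models what a compiler may legally emit for
 *     if (nn->serv) wake(nn->serv);
 * when the access is unmarked: two independent loads of nn->serv.
 * "between" simulates shutdown running between the two loads.
 * Returns 1 if a wakeup happened, 0 if the test saw NULL, and -1 if
 * the test passed but the second load would have handed NULL to the
 * wakeup function.
 */
static int racy_wake(struct net_model *nn, void (*between)(struct net_model *))
{
	struct serv_model *first = nn->serv;	/* load 1: used for the test */
	struct serv_model *second;

	if (!first)
		return 0;
	if (between)
		between(nn);			/* shutdown clears the pointer */
	second = nn->serv;			/* load 2: passed to the wakeup */
	if (!second)
		return -1;			/* the wakeup would receive NULL */
	second->wakeups++;
	return 1;
}

/* Simulated shutdown step. */
static void clear_serv(struct net_model *nn)
{
	nn->serv = NULL;
}
```

Calling racy_wake(&nn, clear_serv) with a non-NULL nn.serv returns -1:
the test passes on the first load, yet the value that reaches the
wakeup is NULL.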

Unlikely though this is, it is possible and READ_ONCE() is designed
precisely to prevent this.
To quote from include/asm-generic/rwonce.h it will
 "Prevent the compiler from merging or refetching reads"

A "volatile" access does not add any cost (in this case).  What it does
is break any aliasing that the compiler might have deduced.
Even if the compiler thinks it has "nn->nfsd_serv" in a register, it
won't think it has the result of READ_ONCE(nn->nfsd_serv) in that register.
And if it needs the result of a previous READ_ONCE(nn->nfsd_serv) it
won't decide that it can just read nn->nfsd_serv again.  It MUST keep
the result of READ_ONCE(nn->nfsd_serv) somewhere until it is not needed
any more.
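
As an illustration, here is a minimal userspace sketch of the pattern
I am suggesting.  READ_ONCE_SKETCH is a simplified stand-in for the
kernel macro (a volatile cast, relying on the GNU __typeof__ extension
in GCC and Clang), and serv_sketch, net_sketch, wake_sketch and
wake_if_running are hypothetical names, not the nfsd code.

```c
#include <stddef.h>

/*
 * Simplified stand-in for the kernel's READ_ONCE(): force exactly one
 * load of x through a volatile-qualified lvalue.
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-ins for svc_serv and nfsd_net. */
struct serv_sketch { int wakeups; };
struct net_sketch { struct serv_sketch *serv; };

/* Stand-in for svc_wake_up(); must not be given NULL. */
static void wake_sketch(struct serv_sketch *serv)
{
	serv->wakeups++;
}

/*
 * Snapshot the pointer once, then test and use only the snapshot.
 * The compiler cannot replace "serv" with a second load of nn->serv.
 * Returns 1 if a wakeup was issued, 0 if the pointer was NULL.
 */
static int wake_if_running(struct net_sketch *nn)
{
	struct serv_sketch *serv = READ_ONCE_SKETCH(nn->serv);

	if (!serv)
		return 0;
	wake_sketch(serv);
	return 1;
}
```

Whichever value the single load observes, the test and the call agree
on it, so either outcome of the race is handled.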

NeilBrown




