Re: [PATCH v2 0/7] nfsd: CB_SEQUENCE error handling fixes and cleanups

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 29, 2025 at 10:09:14AM -0500, Chuck Lever wrote:
> On 1/29/25 10:01 AM, Jeff Layton wrote:
> > On Wed, 2025-01-29 at 09:40 -0500, J. Bruce Fields wrote:
> > > On Wed, Jan 29, 2025 at 09:27:02AM -0500, Jeff Layton wrote:
> > > > On Wed, 2025-01-29 at 09:21 -0500, J. Bruce Fields wrote:
> > > > > On Wed, Jan 29, 2025 at 08:39:53AM -0500, Jeff Layton wrote:
> > > > > > While looking over the CB_SEQUENCE error handling, I discovered that
> > > > > > callbacks don't hold a reference to a session, and the
> > > > > > clp->cl_cb_session could easily change between request and response.
> > > > > > If that happens at an inopportune time, there could be UAFs or weird
> > > > > > slot/sequence handling problems.
> > > > > 
> > > > > Nobody should place too much faith in my understanding of how any of
> > > > > this works at this point, but....  My vague memory is that a lot of
> > > > > things are serialized simply by being run only on the cl_callback_wq.
> > > > > Modifying clp->cl_cb_session is such a thing.
> > > > 
> > > > It is, but that doesn't save us here. The workqueue is just there to
> > > > submit jobs to the RPC client. Once that happens they are run via
> > > > rpciod's workqueue (and in parallel with one another since they're
> > > > async RPC calls).
> > > > 
> > > > So, it's possible that while we're waiting for a response from one
> > > > callback, another is submitted, and that workqueue job changes the
> > > > clp->cl_cb_session.
> > > 
> > > I think it calls rpc_shutdown_client() before changing
> > > clp->cl_cb_session.
> > > 
> > 
> > It does, but the cl_cb_session doesn't carry a reference. My worry was
> > that the client could call a DESTROY_SESSION at any time.
> > 
> > Now that I look though, you may be right that that's enough to ensure
> > it because nfsd4_destroy_session() calls nfsd4_probe_callback_sync()
> > before putting the session reference.
> > 
> > Still, that is a lot of reliance on these things happening in a
> > particular order.
> > 
> > > (Though I'm not sure whether rpc_shutdown_client guarantees that all rpc
> > > processing for the client is completed before returning?)
> > > 
> > 
> > FWIW, it does wait for them to be killed:
> > 
> >          while (!list_empty(&clnt->cl_tasks)) {
> >                  rpc_killall_tasks(clnt);
> >                  wait_event_timeout(destroy_wait,
> >                          list_empty(&clnt->cl_tasks), 1*HZ);
> >          }
> > 
> > I'm not crazy about the fact that it does that synchronously in the
> > workqueue job, but I guess not much else can be happening with
> > callbacks until this completes.
> 
> Bruce, note rpc_shutdown_client() can block indefinitely due to
> bugs in NFSD's callback completion handlers. That's one of the
> main reasons this is getting attention right not.

Got it.

I mean, my apologies for that code, I'm not at all attached to the way
it works right now.  But I wonder whether what's mainly needed is some
test writing.

It feels like a lot of subtle code designed to handle cases that
probably aren't much exercised by real clients.

--b.

> > > > > > This series changes the nfsd4_session to be RCU-freed, and then adds a
> > > > > > new method of session refcounting that is compatible with the old.
> > > > > > nfsd4_callback RPCs will now hold a lightweight reference to the session
> > > > > > in addition to the slot. Then, all of the callback handling is switched
> > > > > > to use that session instead of dereferencing clp->cb_cb_session.
> > > > > > I've also reworked the error handling in nfsd4_cb_sequence_done()
> > > > > > based on review comments, and lifted the v4.0 handing out of that
> > > > > > function.
> > > > > > 
> > > > > > This passes pynfs, nfstests, and fstests for me, but I'm not sure how
> > > > > > much any of that stresses the backchannel's error handling.
> > > > > > 
> > > > > > These should probably go in via Chuck's tree, but the last patch touches
> > > > > > some NFS cnd sunrpc client code, so it'd be good to have R-b's or A-b's
> > > > > > from Trond and/or Anna on that one.
> > > > > > 
> > > > > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > > > > > ---
> > > > > > Changes in v2:
> > > > > > - make nfsd4_session be RCU-freed
> > > > > > - change code to keep reference to session over callback RPCs
> > > > > > - rework error handling in nfsd4_cb_sequence_done()
> > > > > > - move NFSv4.0 handling out of nfsd4_cb_sequence_done()
> > > > > > - Link to v1: https://lore.kernel.org/r/20250123-nfsd-6-14-v1-0-c1137a4fa2ae@xxxxxxxxxx
> > > > > > 
> > > > > > ---
> > > > > > Jeff Layton (7):
> > > > > >        nfsd: add routines to get/put session references for callbacks
> > > > > >        nfsd: make clp->cl_cb_session be an RCU managed pointer
> > > > > >        nfsd: add a cb_ses pointer to nfsd4_callback and use it instead of clp->cb_cb_session
> > > > > >        nfsd: overhaul CB_SEQUENCE error handling
> > > > > >        nfsd: remove unneeded forward declaration of nfsd4_mark_cb_fault()
> > > > > >        nfsd: lift NFSv4.0 handling out of nfsd4_cb_sequence_done()
> > > > > >        sunrpc: make rpc_restart_call() and rpc_restart_call_prepare() void return
> > > > > > 
> > > > > >   fs/nfs/nfs4proc.c           |  12 ++-
> > > > > >   fs/nfsd/nfs4callback.c      | 212 ++++++++++++++++++++++++++++++++------------
> > > > > >   fs/nfsd/nfs4state.c         |  43 ++++++++-
> > > > > >   fs/nfsd/state.h             |   6 +-
> > > > > >   fs/nfsd/trace.h             |   6 +-
> > > > > >   include/linux/sunrpc/clnt.h |   4 +-
> > > > > >   net/sunrpc/clnt.c           |   7 +-
> > > > > >   7 files changed, 210 insertions(+), 80 deletions(-)
> > > > > > ---
> > > > > > base-commit: a05af3c6103b703d1d38d8180b3ebbe0a03c2f07
> > > > > > change-id: 20250123-nfsd-6-14-b0797e385dc0
> > > > > > 
> > > > > > Best regards,
> > > > > > -- 
> > > > > > Jeff Layton <jlayton@xxxxxxxxxx>
> > > > 
> > > > -- 
> > > > Jeff Layton <jlayton@xxxxxxxxxx>
> > 
> 
> 
> -- 
> Chuck Lever




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux