On Mon, 2022-04-04 at 15:34 +0000, Chuck Lever III wrote: > > > > On Mar 31, 2022, at 4:08 PM, Trond Myklebust > > <trondmy@xxxxxxxxxxxxxxx> wrote: > > > > On Thu, 2022-03-31 at 18:53 +0000, Chuck Lever III wrote: > > > > > > > On Mar 31, 2022, at 12:24 PM, Trond Myklebust > > > > <trondmy@xxxxxxxxxxxxxxx> wrote: > > > > > > > > On Thu, 2022-03-31 at 16:20 +0000, Chuck Lever III wrote: > > > > > > > > > > > On Mar 31, 2022, at 12:15 PM, Trond Myklebust > > > > > > <trondmy@xxxxxxxxxxxxxxx> wrote: > > > > > > > > > > > > Hmm... Here's another thought. What if this were a deferred > > > > > > request > > > > > > that is being replayed after an upcall to mountd or the > > > > > > idmapper? > > > > > > It > > > > > > would mean that the synchronous wait in cache_defer_req() > > > > > > failed, > > > > > > so it > > > > > > is going to be rare, but it could happen on a congested > > > > > > system. > > > > > > > > > > > > AFAICS, svc_defer() does _not_ save rqstp->rq_xprt_ctxt, so > > > > > > svc_deferred_recv() won't restore it either. > > > > > > > > > > True, but TCP and UDP both use rq_xprt_ctxt, so wouldn't we > > > > > have > > > > > seen this problem before on a socket transport? > > > > > > > > TCP does not set rq_xprt_ctxt, and nobody really uses UDP these > > > > days. > > > > > > > > > I need to audit code to see if saving rq_xprt_ctxt in > > > > > svc_defer() > > > > > is safe and reasonable to do. Maybe Bruce has a thought. > > > > > > > > It should be safe for the UDP case, AFAICS. I have no opinion > > > > as of > > > > yet > > > > about how safe it is to do with RDMA. > > > > > > It's plausible that a deferred request could be replayed, but I > > > don't > > > understand the deferral mechanism enough to know whether the > > > rctxt > > > would be released before the deferred request could be handled. > > > It > > > doesn't look like it would, but I could misunderstand something. > > > > > > There's a longstanding testing gap here: None of my test > > > workloads > > > appear to force a request deferral. I don't recall Bruce having > > > such a test either. > > > > > > It would be nice if we had something that could force the use of > > > the > > > deferral path, like a command line option for mountd that would > > > cause > > > it to sleep for several seconds before responding to an upcall. > > > It > > > might also be done with the kernel's fault injector. > > > > You just need some mechanism that causes svc_get_next_xprt() to set > > rqstp->rq_chandle.thread_wait to the value '0'. > > I gotta ask: Why does cache_defer_req() check if thread_wait is zero? > Is there some code in the kernel now that sets it to zero? I see only > these two values: > > 784 if (!test_bit(SP_CONGESTED, &pool->sp_flags)) > 785 rqstp->rq_chandle.thread_wait = 5*HZ; > 786 else > 787 rqstp->rq_chandle.thread_wait = 1*HZ; That test goes all the way back to commit f16b6e8d838b ("sunrpc/cache: allow threads to block while waiting for cache update."), so probably more of a question for Neil. I don't see any immediate problems with removing it: even if someone does set a value of 0, the wait code itself should cope. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx