> On Mar 31, 2022, at 12:24 PM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> On Thu, 2022-03-31 at 16:20 +0000, Chuck Lever III wrote:
>>
>>> On Mar 31, 2022, at 12:15 PM, Trond Myklebust
>>> <trondmy@xxxxxxxxxxxxxxx> wrote:
>>>
>>> Hmm... Here's another thought. What if this were a deferred request
>>> that is being replayed after an upcall to mountd or the idmapper?
>>> It would mean that the synchronous wait in cache_defer_req() failed,
>>> so it is going to be rare, but it could happen on a congested system.
>>>
>>> AFAICS, svc_defer() does _not_ save rqstp->rq_xprt_ctxt, so
>>> svc_deferred_recv() won't restore it either.
>>
>> True, but TCP and UDP both use rq_xprt_ctxt, so wouldn't we have
>> seen this problem before on a socket transport?
>
> TCP does not set rq_xprt_ctxt, and nobody really uses UDP these days.
>
>> I need to audit code to see if saving rq_xprt_ctxt in svc_defer()
>> is safe and reasonable to do. Maybe Bruce has a thought.
>
> It should be safe for the UDP case, AFAICS. I have no opinion as of yet
> about how safe it is to do with RDMA.

It's plausible that a deferred request could be replayed, but I don't
understand the deferral mechanism enough to know whether the rctxt
would be released before the deferred request could be handled. It
doesn't look like it would, but I could misunderstand something.

There's a longstanding testing gap here: None of my test workloads
appear to force a request deferral. I don't recall Bruce having
such a test either.

It would be nice if we had something that could force the use of
the deferral path, like a command line option for mountd that
would cause it to sleep for several seconds before responding to
an upcall. It might also be done with the kernel's fault injector.

--
Chuck Lever