Re: Question about RDMA server...

On Thu, 2022-03-31 at 18:53 +0000, Chuck Lever III wrote:
> 
> > On Mar 31, 2022, at 12:24 PM, Trond Myklebust
> > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > 
> > On Thu, 2022-03-31 at 16:20 +0000, Chuck Lever III wrote:
> > > 
> > > > On Mar 31, 2022, at 12:15 PM, Trond Myklebust
> > > > <trondmy@xxxxxxxxxxxxxxx> wrote:
> > > > 
> > > > Hmm... Here's another thought. What if this were a deferred
> > > > request that is being replayed after an upcall to mountd or the
> > > > idmapper? It would mean that the synchronous wait in
> > > > cache_defer_req() failed, so it is going to be rare, but it
> > > > could happen on a congested system.
> > > > 
> > > > AFAICS, svc_defer() does _not_ save rqstp->rq_xprt_ctxt, so
> > > > svc_deferred_recv() won't restore it either.
> > > 
> > > True, but TCP and UDP both use rq_xprt_ctxt, so wouldn't we have
> > > seen this problem before on a socket transport?
> > 
> > TCP does not set rq_xprt_ctxt, and nobody really uses UDP these
> > days.
> > 
> > > I need to audit code to see if saving rq_xprt_ctxt in svc_defer()
> > > is safe and reasonable to do. Maybe Bruce has a thought.
> > 
> > It should be safe for the UDP case, AFAICS. I have no opinion as of
> > yet about how safe it is to do with RDMA.
> 
> It's plausible that a deferred request could be replayed, but I don't
> understand the deferral mechanism enough to know whether the rctxt
> would be released before the deferred request could be handled. It
> doesn't look like it would, but I could misunderstand something.
> 
> There's a longstanding testing gap here: None of my test workloads
> appear to force a request deferral. I don't recall Bruce having
> such a test either.
> 
> It would be nice if we had something that could force the use of the
> deferral path, like a command line option for mountd that would cause
> it to sleep for several seconds before responding to an upcall. It
> might also be done with the kernel's fault injector.
> 

You just need some mechanism that causes svc_get_next_xprt() to set
rqstp->rq_chandle.thread_wait to the value '0'.
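
To make the failure mode discussed above concrete, here is a minimal
userspace sketch. It is not kernel code: every toy_* name and struct
layout below is hypothetical, and the assumption that a zero thread_wait
skips the synchronous wait is taken from the discussion in this thread
rather than copied from the sunrpc sources. It only models two things:
a zero thread_wait forces the deferral path, and a deferred record that
has no slot for rq_xprt_ctxt cannot hand that pointer back on replay.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct toy_cache_req {
    long thread_wait;            /* 0 means: do not wait synchronously */
};

struct toy_rqst {
    struct toy_cache_req rq_chandle;
    void *rq_xprt_ctxt;          /* per-transport receive context */
    int rq_arg_len;              /* stands in for the saved request data */
};

struct toy_deferred {
    int arg_len;                 /* saved at deferral time */
    /* Deliberately no field for rq_xprt_ctxt, mirroring the concern above. */
};

/*
 * Returns true if the request had to be deferred. The caller says whether
 * the upcall would have been answered within the synchronous wait window.
 * Assumption: with thread_wait == 0 there is no window at all.
 */
bool toy_cache_defer_req(const struct toy_rqst *rqstp,
                         bool answered_in_time,
                         struct toy_deferred *dr)
{
    if (rqstp->rq_chandle.thread_wait != 0 && answered_in_time)
        return false;            /* synchronous wait succeeded, no deferral */

    dr->arg_len = rqstp->rq_arg_len;
    return true;
}

/* Replays a deferred request into a fresh request structure. */
void toy_deferred_recv(struct toy_rqst *rqstp, const struct toy_deferred *dr)
{
    *rqstp = (struct toy_rqst){ 0 };
    rqstp->rq_arg_len = dr->arg_len;
    /* rq_xprt_ctxt stays NULL here: nothing was saved, so nothing is restored. */
}
```

In this model, a request with thread_wait set to 0 is always deferred,
and the replayed request carries the saved length but a NULL
rq_xprt_ctxt. Whether the real transport code tolerates that is exactly
the open question in this thread.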

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx





