Re: NFS over RDMA crashing

Jeffrey Layton <jlayton@xxxxxxxxxx> · Wed, 12 Mar 2014 10:28:06 -0400

On Wed, 12 Mar 2014 10:05:24 -0400
Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> wrote:

> 
> On Mar 12, 2014, at 9:33, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> 
> > On Sat, 08 Mar 2014 14:13:44 -0600
> > Steve Wise <swise@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > 
> >> On 3/8/2014 1:20 PM, Steve Wise wrote:
> >>> 
> >>>> I removed your change and started debugging the original crash that
> >>>> happens on top-o-tree.  Seems like rq_next_page is screwed
> >>>> up.  It should always be >= rq_respages, yes?  I added a
> >>>> BUG_ON() to assert this in rdma_read_xdr(), and we hit the BUG_ON().
> >>>> Look:
> >>>> 
> >>>> crash> svc_rqst.rq_next_page 0xffff8800b84e6000
> >>>> rq_next_page = 0xffff8800b84e6228
> >>>> crash> svc_rqst.rq_respages 0xffff8800b84e6000
> >>>> rq_respages = 0xffff8800b84e62a8
> >>>> 
> >>>> Any ideas Bruce/Tom?
> >>>> 
> >>> 
> >>> Guys, the patch below seems to fix the problem.  Dunno if it is 
> >>> correct though.  What do you think?
> >>> 
> >>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> >>> index 0ce7552..6d62411 100644
> >>> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> >>> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> >>> @@ -90,6 +90,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
> >>>               sge_no++;
> >>>       }
> >>>       rqstp->rq_respages = &rqstp->rq_pages[sge_no];
> >>> +       rqstp->rq_next_page = rqstp->rq_respages;
> >>> 
> >>>       /* We should never run out of SGE because the limit is defined to
> >>>        * support the max allowed RPC data length
> >>> @@ -276,6 +277,7 @@ static int fast_reg_read_chunks(struct svcxprt_rdma *xprt,
> >>> 
> >>>       /* rq_respages points one past arg pages */
> >>>       rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
> >>> +       rqstp->rq_next_page = rqstp->rq_respages;
> >>> 
> >>>       /* Create the reply and chunk maps */
> >>>       offset = 0;
> >>> 
> >>> 
> >> 
> >> While this patch avoids the crash, it apparently isn't
> >> correct... I'm getting IO errors reading files over the mount. :)
> >> 
> > 
> > I hit the same oops and tested your patch and it seems to have fixed
> > that particular panic, but I still see a bunch of other mem
> > corruption oopses even with it. I'll look more closely at that when
> > I get some time.
> > 
> > FWIW, I can easily reproduce that by simply doing something like:
> > 
> >   $ dd if=/dev/urandom of=/file/on/nfsordma/mount bs=4k count=1
> > 
> > I'm not sure why you're not seeing any panics with your patch in
> > place. Perhaps it's due to hw differences between our test rigs.
> > 
> > The EIO problem that you're seeing is likely the same client bug
> > that Chuck recently fixed in this patch:
> > 
> >   [PATCH 2/8] SUNRPC: Fix large reads on NFS/RDMA
> > 
> > AIUI, Trond is merging that set for 3.15, so I'd make sure your
> > client has those patches when testing.
> > 
> 
> Nothing is in my queue yet.
> 

Doh! Any reason not to merge that set from Chuck? They do fix a couple
of nasty client bugs...
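
For reference, the invariant Steve asserted with BUG_ON() can be sketched
in plain userspace C. This is only an illustration, not the kernel code:
struct mock_rqst, NPAGES, set_respages(), next_page_valid() and
slot_gap() are hypothetical names, and 8-byte pointers (x86-64) are
assumed. It models the case where rq_respages is moved without
rq_next_page, and works out how far apart the two addresses in the crash
output are.

```c
#include <assert.h>
#include <stdint.h>

#define NPAGES 32               /* hypothetical size, not the kernel's value */

struct page { int unused; };    /* stand-in for the kernel's struct page */

/* Hypothetical cut-down model of the three svc_rqst fields discussed. */
struct mock_rqst {
	struct page *rq_pages[NPAGES];
	struct page **rq_respages;   /* first response page slot */
	struct page **rq_next_page;  /* next free response page slot */
};

/* The condition asserted in the thread: rq_next_page >= rq_respages. */
static int next_page_valid(const struct mock_rqst *r)
{
	return r->rq_next_page >= r->rq_respages;
}

/* Mirrors the shape of the proposed patch: move both pointers together. */
static void set_respages(struct mock_rqst *r, int sge_no)
{
	r->rq_respages = &r->rq_pages[sge_no];
	r->rq_next_page = r->rq_respages;
}

/* Gap between two slot addresses, in 8-byte pointer slots (assumption). */
static long slot_gap(uint64_t respages, uint64_t next_page)
{
	return (long)((int64_t)(respages - next_page) / 8);
}
```

Calling set_respages(&r, 3) and then assigning r.rq_respages =
&r.rq_pages[19] by hand leaves rq_next_page stale, so next_page_valid()
returns 0. Passing the two addresses from the crash output to slot_gap()
gives 0x62a8 - 0x6228 = 0x80 bytes, i.e. rq_next_page sits 16 slots
below rq_respages.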

-- 
Jeff Layton <jlayton@xxxxxxxxxx>