On 11/24/20, 3:06 PM, "Frank van der Linden" <fllinden@xxxxxxxxxx> wrote: NetApp Security WARNING: This is an external email. Do not click links or open attachments unless you recognize the sender and know the content is safe. On Tue, Nov 24, 2020 at 12:26:32PM -0500, Chuck Lever wrote: > CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. > > > > By switching to an XFS-backed export, I am able to reproduce the > ibcomp worker crash on my client with xfstests generic/013. > > For the failing LISTXATTRS operation, xdr_inline_pages() is called > with page_len=12 and buflen=128. Then: > > - Because buflen is small, rpcrdma_marshal_req will not set up a > Reply chunk and the rpcrdma's XDRBUF_SPARSE_PAGES logic does not > get invoked at all. > > - Because page_len is non-zero, rpcrdma_inline_fixup() tries to > copy received data into rq_rcv_buf->pages, but they're missing. > > The result is that the ibcomp worker faults and dies. Sometimes that > causes a visible crash, and sometimes it results in a transport > hang without other symptoms. > > RPC/RDMA's XDRBUF_SPARSE_PAGES support is not entirely correct, and > should eventually be fixed or replaced. However, my preference is > that upper-layer operations should explicitly allocate their receive > buffers (using GFP_KERNEL) when possible, rather than relying on > XDRBUF_SPARSE_PAGES. > > Reported-by: Olga kornievskaia <kolga@xxxxxxxxxx> > Suggested-by: Olga kornievskaia <kolga@xxxxxxxxxx> > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> > --- > fs/nfs/nfs42proc.c | 17 ++++++++++------- > fs/nfs/nfs42xdr.c | 1 - > 2 files changed, 10 insertions(+), 8 deletions(-) > > Hi- > > I like Olga's proposed approach. What do you think of this patch? > > > diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c > index 2b2211d1234e..24810305ec1c 100644 > --- a/fs/nfs/nfs42proc.c > +++ b/fs/nfs/nfs42proc.c > @@ -1241,7 +1241,7 @@ static ssize_t _nfs42_proc_listxattrs(struct inode *inode, void *buf, > .rpc_resp = &res, > }; > u32 xdrlen; > - int ret, np; > + int ret, np, i; > > > res.scratch = alloc_page(GFP_KERNEL); > @@ -1253,10 +1253,14 @@ static ssize_t _nfs42_proc_listxattrs(struct inode *inode, void *buf, > xdrlen = server->lxasize; > np = xdrlen / PAGE_SIZE + 1; > > + ret = -ENOMEM; > pages = kcalloc(np, sizeof(struct page *), GFP_KERNEL); > - if (pages == NULL) { > - __free_page(res.scratch); > - return -ENOMEM; > + if (pages == NULL) > + goto out_free; > + for (i = 0; i < np; i++) { > + pages[i] = alloc_page(GFP_KERNEL); > + if (!pages[i]) > + goto out_free; > } > > arg.xattr_pages = pages; > @@ -1271,14 +1275,13 @@ static ssize_t _nfs42_proc_listxattrs(struct inode *inode, void *buf, > *eofp = res.eof; > } > > +out_free: > while (--np >= 0) { > if (pages[np]) > __free_page(pages[np]); > } > - > - __free_page(res.scratch); > kfree(pages); > - > + __free_page(res.scratch); > return ret; > > } > diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c > index 6e060a88f98c..8432bd6b95f0 100644 > --- a/fs/nfs/nfs42xdr.c > +++ b/fs/nfs/nfs42xdr.c > @@ -1528,7 +1528,6 @@ static void nfs4_xdr_enc_listxattrs(struct rpc_rqst *req, > > rpc_prepare_reply_pages(req, args->xattr_pages, 0, args->count, > hdr.replen); > - req->rq_rcv_buf.flags |= XDRBUF_SPARSE_PAGES; > > encode_nops(&hdr); > } > > I can see why this is the simplest and most pragmatic solution, so it's fine with me. Why doesn't this happen with getxattr? Do we need to convert that too? [olga] I don't know if GETXATTR/SETXATTR works. I'm not sure what tests exercise those operations. I just ran into the fact that generic/013 wasn't passing. And I don't see that it's an xattr specific tests. I'm not sure how it ends up triggering is usage of xattr. What did you use to test this feature? The basic issue here is that the RPC code does not deal with inlined data that exceeds PAGE_SIZE. That can only be done with raw pages. Since the upper layer has already allocated a buffer in the case of listxattr and getxattr, I would love to be able to just XDR code in to that buffer, and void the whole alloc+copy situation. But sadly, it might be > PAGE_SIZE, so the XDR code doesn't allow it. It's not all bad, having to use pages allows them to be directly hooked in to the cache in the case of getxattr, but for listxattr, decoding directly in to the provided buffer would be nice. Hm, I wonder if that restriction actually holds for listxattr - the invidual XDR items (xattr names) should never exceed PAGE_SIZE.. - Frank