On Thu, Nov 19, 2020 at 6:26 PM Frank van der Linden <fllinden@xxxxxxxxxx> wrote: > > On Thu, Nov 19, 2020 at 11:19:05AM -0500, Chuck Lever wrote: > > > On Nov 19, 2020, at 10:09 AM, Olga Kornievskaia <olga.kornievskaia@xxxxxxxxx> wrote: > > > > > > On Thu, Nov 19, 2020 at 9:37 AM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: > > >> > > >> Hi Olga- > > >> > > >>> On Nov 18, 2020, at 4:44 PM, Olga Kornievskaia <olga.kornievskaia@xxxxxxxxx> wrote: > > >>> > > >>> Hi Chuck, > > >>> > > >>> The first problem I found was from 5.10-rc3 testing was from the fact > > >>> that tail's iov_len was set to -4 and reply_chunk was trying to call > > >>> rpcrdma_convert_kvec() for a tail that didn't exist. > > >>> > > >>> Do you see issues with this fix? > > >>> > > >>> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c > > >>> index 71e03b930b70..2e6a228abb95 100644 > > >>> --- a/net/sunrpc/xdr.c > > >>> +++ b/net/sunrpc/xdr.c > > >>> @@ -193,7 +193,7 @@ xdr_inline_pages(struct xdr_buf *xdr, unsigned int offset, > > >>> > > >>> tail->iov_base = buf + offset; > > >>> tail->iov_len = buflen - offset; > > >>> - if ((xdr->page_len & 3) == 0) > > >>> + if ((xdr->page_len & 3) == 0 && tail->iov_len) > > >>> tail->iov_len -= sizeof(__be32); > > >>> > > >>> xdr->buflen += len; > > >> > > >> It's not clear to me whether only the listxattrs encoder is > > >> not providing a receive tail kvec, or whether all the encoders > > >> fail to provide a tail in this case. > > > > > > There is nothing specific that listxattr does, it just calls > > > rpc_prepare_pages(). Isn't tail[0] always there (xdr_buf structure has > > > tail[1] defined)? > > > > Flip the question on its head: Why does xdr_inline_pages() work > > fine for all the other operations? That suggests the problem is > > with listxattrs, not with generic code. > > > > > > > But not all requests have data in the page. So as > > > far as I understand, tail->iov_len can be 0 so not checking for it is > > > wrong. > > > > The full context of the tail logic is: > > > > 194 tail->iov_base = buf + offset; > > 195 tail->iov_len = buflen - offset; > > 196 if ((xdr->page_len & 3) == 0) > > 197 tail->iov_len -= sizeof(__be32); > > > > tail->iov_len is set to buflen - offset right here. It should > > /always/ be 4 or more. We can ensure that because the input > > to this function is always provided by another kernel function > > (in other words, we control all the callers). > > > > Why is buflen == offset for listxattrs? 6c2190b3fcbc ("NFS: > > Fix listxattr receive buffer size") is trying to ensure > > tail->iov_len is not zero -- that the math here gives us a > > positive value every time. > > > > In nfs4_xdr_enc_listxattrs() we have: > > > > 1529 rpc_prepare_reply_pages(req, args->xattr_pages, 0, args->count, > > 1530 hdr.replen); > > > > hdr.replen is set to NFS4_dec_listxattrs_sz. > > > > _nfs42_proc_listxattrs() sets args->count. > > > > I suspect the problem is the way _nfs42_proc_listxattrs() is > > computing the length of xattr_pages. It includes the trailing > > EOF boolean, but so does the decode_listxattrs_maxsz macro, > > for instance. > > > > We need head->iov_len to always be one XDR_UNIT larger than > > the position where the xattr_pages array starts. Then the > > math in xdr_inline_pages() will work correctly. (sidebar: > > perhaps the documenting comment for xdr_inline_pages() should > > explain that assumption). > > > > So, now I agree with the assessment that 6c2190b3fcbc ("NFS: > > Fix listxattr receive buffer size") is probably not adequate. > > There is another subtlety to address in the way that > > _nfs42_proc_listxattrs() computes args->count. > > The only thing I see wrong so far is that I believe that > decode_listxattrs_maxsz is wrong - it shouldn't include the EOF > word, which is accounted for in the page data part. > > But, it seems that this wouldn't cause the above problem. I'm > also uncertain why this happens with RDMA, but not otherwise. > Unfortunately, I can't test RDMA, but when I ran listxattr tests, > I did so with KASAN enabled, and didn't see issues. There isn't nothing special to run tests on RDMA, you just need to compile the RXE driver and the rdma-core package have everything you need to run soft Roce (or soft iwarp). This is how I'm testing. > Obviously there could be a bug here, it could be that the code > just gets lucky, but that the bug is exposed on RDMA. > > Is there a specific size passed to listxattr that this happens with? First let me answer the last question, I'm running xfstest generic/013. The latest update: updated to Trond's origing/testing which is now based on 5.10-rc4. 1. I don't see the oops during the encoding of the listxattr 2. I'm still seeing the oops during the rdma completion. This happens in the following scenario. Normally, there is a request for listxattr which passes in buflen of 65536. This translates into rdma request with a reply chunk with 2 segments. But during failure there is a request for listxattr which passes buflen of only 8bytes. This translates into rdma inline request which later oops in rpcrdma_inline_fixup. What I would like to know: (1) should _nf42_proc_listxattrs() be doing some kind of sanity checking for passed in buflen? (2) but of course I'm assuming even if passed in buffer is small the code shouldn't be oops-ing. > > - Frank