Re: [PATCH 1/1] NFSv4.2: fix LISTXATTR buffer receive size

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 19, 2020 at 6:26 PM Frank van der Linden
<fllinden@xxxxxxxxxx> wrote:
>
> On Thu, Nov 19, 2020 at 11:19:05AM -0500, Chuck Lever wrote:
> > > On Nov 19, 2020, at 10:09 AM, Olga Kornievskaia <olga.kornievskaia@xxxxxxxxx> wrote:
> > >
> > > On Thu, Nov 19, 2020 at 9:37 AM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> > >>
> > >> Hi Olga-
> > >>
> > >>> On Nov 18, 2020, at 4:44 PM, Olga Kornievskaia <olga.kornievskaia@xxxxxxxxx> wrote:
> > >>>
> > >>> Hi Chuck,
> > >>>
> > >>> The first problem I found was from 5.10-rc3 testing was from the fact
> > >>> that tail's iov_len was set to -4 and reply_chunk was trying to call
> > >>> rpcrdma_convert_kvec() for a tail that didn't exist.
> > >>>
> > >>> Do you see issues with this fix?
> > >>>
> > >>> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> > >>> index 71e03b930b70..2e6a228abb95 100644
> > >>> --- a/net/sunrpc/xdr.c
> > >>> +++ b/net/sunrpc/xdr.c
> > >>> @@ -193,7 +193,7 @@ xdr_inline_pages(struct xdr_buf *xdr, unsigned int offset,
> > >>>
> > >>>       tail->iov_base = buf + offset;
> > >>>       tail->iov_len = buflen - offset;
> > >>> -       if ((xdr->page_len & 3) == 0)
> > >>> +       if ((xdr->page_len & 3) == 0 && tail->iov_len)
> > >>>               tail->iov_len -= sizeof(__be32);
> > >>>
> > >>>       xdr->buflen += len;
> > >>
> > >> It's not clear to me whether only the listxattrs encoder is
> > >> not providing a receive tail kvec, or whether all the encoders
> > >> fail to provide a tail in this case.
> > >
> > > There is nothing specific that listxattr does, it just calls
> > > rpc_prepare_pages(). Isn't tail[0] always there (xdr_buf structure has
> > > tail[1] defined)?
> >
> > Flip the question on its head: Why does xdr_inline_pages() work
> > fine for all the other operations? That suggests the problem is
> > with listxattrs, not with generic code.
> >
> >
> > > But not all requests have data in the page. So as
> > > far as I understand, tail->iov_len can be 0 so not checking for it is
> > > wrong.
> >
> > The full context of the tail logic is:
> >
> >  194         tail->iov_base = buf + offset;
> >  195         tail->iov_len = buflen - offset;
> >  196         if ((xdr->page_len & 3) == 0)
> >  197                 tail->iov_len -= sizeof(__be32);
> >
> > tail->iov_len is set to buflen - offset right here. It should
> > /always/ be 4 or more. We can ensure that because the input
> > to this function is always provided by another kernel function
> > (in other words, we control all the callers).
> >
> > Why is buflen == offset for listxattrs? 6c2190b3fcbc ("NFS:
> > Fix listxattr receive buffer size") is trying to ensure
> > tail->iov_len is not zero -- that the math here gives us a
> > positive value every time.
> >
> > In nfs4_xdr_enc_listxattrs() we have:
> >
> > 1529         rpc_prepare_reply_pages(req, args->xattr_pages, 0, args->count,
> > 1530             hdr.replen);
> >
> > hdr.replen is set to NFS4_dec_listxattrs_sz.
> >
> > _nfs42_proc_listxattrs() sets args->count.
> >
> > I suspect the problem is the way _nfs42_proc_listxattrs() is
> > computing the length of xattr_pages. It includes the trailing
> > EOF boolean, but so does the decode_listxattrs_maxsz macro,
> > for instance.
> >
> > We need head->iov_len to always be one XDR_UNIT larger than
> > the position where the xattr_pages array starts. Then the
> > math in xdr_inline_pages() will work correctly. (sidebar:
> > perhaps the documenting comment for xdr_inline_pages() should
> > explain that assumption).
> >
> > So, now I agree with the assessment that 6c2190b3fcbc ("NFS:
> > Fix listxattr receive buffer size") is probably not adequate.
> > There is another subtlety to address in the way that
> > _nfs42_proc_listxattrs() computes args->count.
>
> The only thing I see wrong so far is that I believe that
> decode_listxattrs_maxsz is wrong - it shouldn't include the EOF
> word, which is accounted for in the page data part.
>
> But, it seems that this wouldn't cause the above problem. I'm
> also uncertain why this happens with RDMA, but not otherwise.
> Unfortunately, I can't test RDMA, but when I ran listxattr tests,
> I did so with KASAN enabled, and didn't see issues.

There isn't nothing special to run tests on RDMA, you just need to
compile the RXE driver and the rdma-core package have everything you
need to run soft Roce (or soft iwarp). This is how I'm testing.

> Obviously there could be a bug here, it could be that the code
> just gets lucky, but that the bug is exposed on RDMA.
>
> Is there a specific size passed to listxattr that this happens with?

First let me answer the last question, I'm running xfstest generic/013.

The latest update: updated to Trond's origing/testing which is now
based on 5.10-rc4.

1. I don't see the oops during the encoding of the listxattr
2. I'm still seeing the oops during the rdma completion. This happens
in the following scenario. Normally, there is a request for listxattr
which passes in buflen of 65536. This translates into rdma request
with a reply chunk with 2 segments. But during failure there is a
request for listxattr which passes buflen of only 8bytes. This
translates into rdma inline request which later oops in
rpcrdma_inline_fixup.

What I would like to know: (1) should _nf42_proc_listxattrs() be doing
some kind of sanity checking for passed in buflen? (2) but of course
I'm assuming even if passed in buffer is small the code shouldn't be
oops-ing.

>
> - Frank




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux