Re: [PATCH RFC] xprtrdma: Fix XDRBUF_SPARSE_PAGES support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Dec 8, 2020, at 12:03 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> 
> Hi Chuck,
> 
> Is the only user of SPARSE_PAGES is v3 ACLs?

There was one other, the gssproxy upcall, which I've proposed a fix
for, for v5.11. After that fix goes in and Frank's replacements for
GETXATTR and LISTXATTRS are merged, NFSv3 GETACL will be the only
user of XDRBUF_SPARSE_PAGES.


> Are you fixing this so that v3 ACLs would work over rdma?

As far as I know, NFSv3 GETACL works fine over RDMA because it always
expects a large Reply, and thus rpcrdma_marshal_req prepares a Reply
chunk. That triggers the code that allocates the receive pages
properly.

But you might be asking me why I'm sending this patch. Discussion?
Belt and suspenders? Fodder for backport, just in case?

It's always possible that someone might add another XDRBUF_SPARSE_PAGES
user before we are able to update NFSv3 GETACL. The below patch should
help make that situation less problematic, if it should occur.

So, after Frank's patches are merged, this fix is not addressing an
existing active crasher.


> On Fri, Dec 4, 2020 at 5:13 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>> 
>> Olga K. observed that rpcrdma_marsh_req() allocates sparse pages
>> only when it has determined that a Reply chunk is necessary. There
>> are plenty of cases where no Reply chunk is needed, but the
>> XDRBUF_SPARSE_PAGES flag is set. The result would be a crash in
>> rpcrdma_inline_fixup() when it tries to copy parts of the received
>> Reply into a missing page.
>> 
>> To avoid crashing, handle sparse page allocation up front.
>> 
>> Until XATTR support was added, this issue did not appear often
>> because the only SPARSE_PAGES consumer always expected a reply large
>> enough to require a Reply chunk.
>> 
>> Reported-by: Olga Kornievskaia <kolga@xxxxxxxxxx>
>> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
>> ---
>> net/sunrpc/xprtrdma/rpc_rdma.c |   41 +++++++++++++++++++++++++++++++---------
>> 1 file changed, 32 insertions(+), 9 deletions(-)
>> 
>> Here's a stop-gap that can be back-ported to earlier kernels if needed.
>> Untested. Comments welcome.
>> 
>> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
>> index 0f5120c7668f..09b7649fa112 100644
>> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
>> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
>> @@ -179,6 +179,32 @@ rpcrdma_nonpayload_inline(const struct rpcrdma_xprt *r_xprt,
>>                r_xprt->rx_ep->re_max_inline_recv;
>> }
>> 
>> +/* ACL likes to be lazy in allocating pages. For TCP, these
>> + * pages can be allocated during receive processing. Not true
>> + * for RDMA, which must always provision receive buffers
>> + * up front.
>> + */
>> +static int
>> +rpcrdma_alloc_sparse_pages(struct rpc_rqst *rqst)
>> +{
>> +       struct xdr_buf *xb = &rqst->rq_rcv_buf;
>> +       struct page **ppages;
>> +       int len;
>> +
>> +       len = xb->page_len;
>> +       ppages = xb->pages + xdr_buf_pagecount(xb);
>> +       while (len > 0) {
>> +               if (!*ppages)
>> +                       *ppages = alloc_page(GFP_NOWAIT | __GFP_NOWARN);
>> +               if (!*ppages)
>> +                       return -ENOBUFS;
>> +               ppages++;
>> +               len -= PAGE_SIZE;
>> +       }
>> +
>> +       return 0;
>> +}
>> +
>> /* Split @vec on page boundaries into SGEs. FMR registers pages, not
>>  * a byte range. Other modes coalesce these SGEs into a single MR
>>  * when they can.
>> @@ -233,15 +259,6 @@ rpcrdma_convert_iovs(struct rpcrdma_xprt *r_xprt, struct xdr_buf *xdrbuf,
>>        ppages = xdrbuf->pages + (xdrbuf->page_base >> PAGE_SHIFT);
>>        page_base = offset_in_page(xdrbuf->page_base);
>>        while (len) {
>> -               /* ACL likes to be lazy in allocating pages - ACLs
>> -                * are small by default but can get huge.
>> -                */
>> -               if (unlikely(xdrbuf->flags & XDRBUF_SPARSE_PAGES)) {
>> -                       if (!*ppages)
>> -                               *ppages = alloc_page(GFP_NOWAIT | __GFP_NOWARN);
>> -                       if (!*ppages)
>> -                               return -ENOBUFS;
>> -               }
>>                seg->mr_page = *ppages;
>>                seg->mr_offset = (char *)page_base;
>>                seg->mr_len = min_t(u32, PAGE_SIZE - page_base, len);
>> @@ -867,6 +884,12 @@ rpcrdma_marshal_req(struct rpcrdma_xprt *r_xprt, struct rpc_rqst *rqst)
>>        __be32 *p;
>>        int ret;
>> 
>> +       if (unlikely(rqst->rq_rcv_buf.flags & XDRBUF_SPARSE_PAGES)) {
>> +               ret = rpcrdma_alloc_sparse_pages(rqst);
>> +               if (ret)
>> +                       return ret;
>> +       }
>> +
>>        rpcrdma_set_xdrlen(&req->rl_hdrbuf, 0);
>>        xdr_init_encode(xdr, &req->rl_hdrbuf, rdmab_data(req->rl_rdmabuf),
>>                        rqst);
>> 
>> 

--
Chuck Lever







[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux