Re: [RFC 1/2] xprtrdma: xdr pad optimization revisted again

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 30, 2021 at 1:04 PM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>
> Hi Olga-
>
> > On Aug 30, 2021, at 12:53 PM, Olga Kornievskaia <olga.kornievskaia@xxxxxxxxx> wrote:
> >
> > From: Olga Kornievskaia <kolga@xxxxxxxxxx>
> >
> > Given the patch "Always provide aligned buffers to the RPC read layers",
> > RPC over RDMA doesn't need to look at the tail page and add that space
> > to the write chunk.
> >
> > For the RFC 8166 compliant server, it must not write an XDR padding
> > into the write chunk (even if space was provided). Historically
> > (before RFC 8166) Solaris RDMA server has been requiring the client
> > to provide space for the XDR padding and thus this client code has
> > existed.
>
> I don't understand this change.
>
> So, the upper layer doesn't provide XDR padding any more. That doesn't
> mean Solaris servers still aren't going to want to write into it. The
> client still has to provide this padding from somewhere.
>
> This suggests that "Always provide aligned buffers to the RPC read
> layers" breaks our interop with Solaris servers. Does it?

No, I don't believe "Always provide aligned buffers to the RPC read
layers" breaks the interoperability. THIS patch would break the
interop.

If we are not willing to break the interoperability and support only
servers that comply with RFC 8166, this patch is not needed.

> > Signed-off-by: Olga Kornievskaia <kolga@xxxxxxxxxx>
> > ---
> > net/sunrpc/xprtrdma/rpc_rdma.c | 15 ---------------
> > 1 file changed, 15 deletions(-)
> >
> > diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
> > index c335c1361564..2c4146bcf2a8 100644
> > --- a/net/sunrpc/xprtrdma/rpc_rdma.c
> > +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
> > @@ -255,21 +255,6 @@ rpcrdma_convert_iovs(struct rpcrdma_xprt *r_xprt, struct xdr_buf *xdrbuf,
> >               page_base = 0;
> >       }
> >
> > -     if (type == rpcrdma_readch)
> > -             goto out;
> > -
> > -     /* When encoding a Write chunk, some servers need to see an
> > -      * extra segment for non-XDR-aligned Write chunks. The upper
> > -      * layer provides space in the tail iovec that may be used
> > -      * for this purpose.
> > -      */
> > -     if (type == rpcrdma_writech && r_xprt->rx_ep->re_implicit_roundup)
> > -             goto out;
> > -
> > -     if (xdrbuf->tail[0].iov_len)
>
> Instead of checking for a tail, we could check
>
>         if (xdr_pad_size(xdrbuf->page_len))
>
> and provide some tail space in that case.

I don't believe this is any different than what we have now. If the
page size is non-4byte aligned then, we would still allocate size for
the padding which "SHOULD NOT" be there. But yes it is allowed to be
there.

The problem, as you know from our offline discussion, is allocating
the tail page and including it in the write chunk for the Nvidia
environment where Nvidia doesn't support use of data (user) pages and
nfs kernel allocated pages in the same segment.

Alternatively, my ask is then to change rpcrdma_convert_iovs() to
return 2 segs instead of one: one for the pages and another for the
tail.

>
> > -             rpcrdma_convert_kvec(&xdrbuf->tail[0], seg, &n);
> > -
> > -out:
> >       if (unlikely(n > RPCRDMA_MAX_SEGS))
> >               return -EIO;
> >       return n;
> > --
> > 2.27.0
> >
>
> --
> Chuck Lever
>
>
>



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux