Re: [PATCH v1 4/4] xprtrdma: Plant XID in on-the-wire RDMA offset (FRWR)

Olga Kornievskaia <aglo@xxxxxxxxx> · Mon, 19 Nov 2018 13:47:19 -0500

On Mon, Nov 19, 2018 at 1:19 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>
>
>
> > On Nov 19, 2018, at 1:08 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> >
> > On Mon, Nov 19, 2018 at 12:59 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >>
> >>
> >>
> >>> On Nov 19, 2018, at 12:47 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> >>>
> >>> On Mon, Nov 19, 2018 at 10:46 AM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >>>>
> >>>> Place the associated RPC transaction's XID in the upper 32 bits of
> >>>> each RDMA segment's rdma_offset field. These bits are currently
> >>>> always zero.
> >>>>
> >>>> There are two reasons to do this:
> >>>>
> >>>> - The R_key only has 8 bits that are different from registration to
> >>>> registration. The XID adds more uniqueness to each RDMA segment to
> >>>> reduce the likelihood of a software bug on the server reading from
> >>>> or writing into memory it's not supposed to.
> >>>>
> >>>> - On-the-wire RDMA Read and Write operations do not otherwise carry
> >>>> any identifier that matches them up to an RPC. The XID in the
> >>>> upper 32 bits will act as an eye-catcher in network captures.
> >>>
> >>> Is this just an "eye-catcher" or do you have plans to use it in
> >>> wireshark? If the latter, then can we really do that? While a Linux
> >>> implementation may do that, other (or even possibly future Linux)
> >>> implementations might not. Can we justify changing the
> >>> wireshark logic for it?
> >>
> >> No plans to change the wireshark RPC-over-RDMA dissector.
> >> That would only be a valid thing to do if adding the XID
> >> were made part of the RPC-over-RDMA protocol via an RFC.
> >
> > Agreed. Can you also help me understand the proposal (as I'm still
> > trying to figure why it is useful).
> >
> > You are proposing to modify the RDMA segment's RDMA offset field (I
> > see top 6bits are indeed always 0). I don't see how adding that helps
> > an RDMA read/write message which does not have an "offset" field in it
> > be matched to a particular RPC. I don't believe we have (had) any
> > issues matching the initial RC Send only that contains the RDMA_MSG to
> > the RPC.
>
> The ULP has access to only the low order 8 bits of the R_key. The
> upper 24 bits are fixed for each MR. So for any given MR, there are
> only 256 unique R_key values. That means the same R_key will appear
> again quickly on the wire.
>
> The 64-bit offset field is set by the ULP, and can be essentially
> any arbitrary value. Most kernel ULPs use the iova of the registered
> memory. We only need the lower 32 bits for that.
>
> The purpose of adding junk to the offset is to make the offset
> unique to that RPC transaction, just like the R_key is. This helps
> make the RDMA segment co-ordinates (handle, length, offset) more
> unique and thus harder to spoof.

Thank you for the explanation; that makes sense.
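
To check my understanding of the R_key part, here is a minimal userspace
sketch, not the actual xprtrdma code. The helper name bump_rkey() is
hypothetical; it assumes the ULP-controlled key portion is the low-order
8 bits, as you describe, and that each new registration simply increments
that portion.

```c
#include <stdint.h>

/* Hypothetical helper: advance only the ULP-owned low 8 bits of an
 * R_key, leaving the upper 24 bits (fixed per MR) untouched. */
static uint32_t bump_rkey(uint32_t rkey)
{
    const uint32_t mask = 0x000000ffu;

    return ((rkey + 1) & mask) | (rkey & ~mask);
}
```

Under those assumptions, 256 registrations of the same MR bring the R_key
back to its starting value, which is the quick reuse on the wire you
mention.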

> We could use random numbers in that upper 32 bits, but we have
> something more handy: the RPC's XID.
>
> Now when you look at an RDMA Read or Write, the top 32 bits in each
> RDMA segment's offset match the XID of the RPC transaction that the
> RDMA operations go with. This is really a secondary benefit to the
> uniquifying effect above.
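
Just so we are talking about the same thing, this is how I read the
proposed layout of the 64-bit offset. It is only a sketch: the function
names are made up for illustration, it assumes the iova fits in the lower
32 bits as stated above, and it ignores the byte-order handling the real
patch would need for the XID.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: build a segment offset with the RPC XID in the upper
 * 32 bits and the (32-bit) iova in the lower 32 bits. */
static uint64_t plant_xid(uint64_t iova, uint32_t xid)
{
    assert(iova <= UINT32_MAX);     /* assumption: upper bits unused */
    return ((uint64_t)xid << 32) | iova;
}

/* Hypothetical: recover the XID from an offset built by plant_xid(). */
static uint32_t offset_xid(uint64_t offset)
{
    return (uint32_t)(offset >> 32);
}

/* Hypothetical: recover the lower 32 bits holding the iova. */
static uint32_t offset_iova_low(uint64_t offset)
{
    return (uint32_t)offset;
}
```

So an iova of 0x1000 with XID 0xdeadbeef would give the offset
0xdeadbeef00001000, and anything that sees the offset can pull the XID
back out of the top half.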

I find the wording "on the wire RDMA read or write" misleading. Did
you really mean it as "RDMA read or write" or do you mean "RDMA_MSG"
or do you mean "NFS RDMA read or write"? Because RDMA offset is not a
part of the RDMA read/write (first/middle/last) packet. That's what
I'm hung up on.

>
>
> --
> Chuck Lever
>
>
>