On Mon, Nov 19, 2018 at 1:19 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
>
> > On Nov 19, 2018, at 1:08 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> >
> > On Mon, Nov 19, 2018 at 12:59 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >>
> >>> On Nov 19, 2018, at 12:47 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> >>>
> >>> On Mon, Nov 19, 2018 at 10:46 AM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> >>>>
> >>>> Place the associated RPC transaction's XID in the upper 32 bits of
> >>>> each RDMA segment's rdma_offset field. These bits are currently
> >>>> always zero.
> >>>>
> >>>> There are two reasons to do this:
> >>>>
> >>>> - The R_key only has 8 bits that are different from registration to
> >>>>   registration. The XID adds more uniqueness to each RDMA segment to
> >>>>   reduce the likelihood of a software bug on the server reading from
> >>>>   or writing into memory it's not supposed to.
> >>>>
> >>>> - On-the-wire RDMA Read and Write operations do not otherwise carry
> >>>>   any identifier that matches them up to an RPC. The XID in the
> >>>>   upper 32 bits will act as an eye-catcher in network captures.
> >>>
> >>> Is this just an "eye-catcher" or do you have plans to use it in
> >>> wireshark? If the latter, then can we really do that? While a Linux
> >>> implementation may do that, other (or even possibly future Linux)
> >>> implementations might not do this. Can we justify changing the
> >>> wireshark logic for it?
> >>
> >> No plans to change the wireshark RPC-over-RDMA dissector.
> >> That would only be a valid thing to do if adding the XID
> >> were made part of the RPC-over-RDMA protocol via an RFC.
> >
> > Agreed. Can you also help me understand the proposal (as I'm still
> > trying to figure out why it is useful).
> >
> > You are proposing to modify the RDMA segment's RDMA offset field (I
> > see top 6 bits are indeed always 0). I don't see how adding that helps
> > an RDMA read/write message which does not have an "offset" field in it
> > be matched to a particular RPC. I don't believe we have (had) any
> > issues matching the initial RC Send only that contains the RDMA_MSG to
> > the RPC.
>
> The ULP has access to only the low order 8 bits of the R_key. The
> upper 24 bits are fixed for each MR. So for any given MR, there are
> only 256 unique R_key values. That means the same R_key will appear
> again quickly on the wire.
>
> The 64-bit offset field is set by the ULP, and can be essentially
> any arbitrary value. Most kernel ULPs use the iova of the registered
> memory. We only need the lower 32 bits for that.
>
> The purpose of adding junk to the offset is to make the offset
> unique to that RPC transaction, just like the R_key is. This helps
> make the RDMA segment co-ordinates (handle, length, offset) more
> unique and thus harder to spoof.

Thank you for the explanation; that makes sense.

> We could use random numbers in that upper 32 bits, but we have
> something more handy: the RPC's XID.
>
> Now when you look at an RDMA Read or Write, the top 32 bits in each
> RDMA segment's offset match the XID of the RPC transaction that the
> RDMA operations go with. This is really a secondary benefit to the
> uniquifying effect above.

I find the wording "on the wire RDMA read or write" misleading. Did you
really mean "RDMA read or write", or do you mean "RDMA_MSG", or do you
mean "NFS RDMA read or write"? Because the RDMA offset is not part of
the RDMA read/write (first/middle/last) packet. That's what I'm hung
up on.

> --
> Chuck Lever
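
For reference, the bit layout discussed above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not code
from the patch: the helper names (seg_offset_with_xid, seg_offset_xid,
rkey_next) are hypothetical, and the XID is assumed to already be in
host byte order.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): build a 64-bit segment
 * offset with the RPC XID in the upper 32 bits and the low 32 bits
 * of the registered memory's iova in the lower 32 bits.
 */
static uint64_t seg_offset_with_xid(uint32_t xid, uint64_t iova)
{
	return ((uint64_t)xid << 32) | (iova & 0xffffffffULL);
}

/* Hypothetical helper: recover the XID from such an offset. */
static uint32_t seg_offset_xid(uint64_t offset)
{
	return (uint32_t)(offset >> 32);
}

/*
 * Hypothetical helper: advance only the low-order 8 bits of an R_key.
 * The upper 24 bits stay fixed for the MR, so the value wraps after
 * 256 registrations, which is why the R_key alone repeats quickly.
 */
static uint32_t rkey_next(uint32_t rkey)
{
	return (rkey & 0xffffff00u) | ((rkey + 1) & 0xffu);
}
```

With a layout like this, two registrations of the same MR that happen to
reuse an R_key would still carry different offsets as long as their XIDs
differ.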