Re: [RFC PATCH 2/2] RDMA/rxe: Add RDMA Atomic Write operation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/31/2021 3:29 AM, yangx.jy@xxxxxxxxxxx wrote:
On 2021/12/31 5:39, Tom Talpey wrote:

On 12/30/2021 7:14 AM, Xiao Yang wrote:
This patch implements RDMA Atomic Write operation for RC service.

Signed-off-by: Xiao Yang <yangx.jy@xxxxxxxxxxx>
---
   drivers/infiniband/sw/rxe/rxe_comp.c   |  4 +++
   drivers/infiniband/sw/rxe/rxe_opcode.c | 18 ++++++++++
   drivers/infiniband/sw/rxe/rxe_opcode.h |  3 ++
   drivers/infiniband/sw/rxe/rxe_qp.c     |  3 +-
   drivers/infiniband/sw/rxe/rxe_req.c    | 10 ++++--
   drivers/infiniband/sw/rxe/rxe_resp.c   | 49 +++++++++++++++++++++-----
   include/rdma/ib_pack.h                 |  2 ++
   include/rdma/ib_verbs.h                |  2 ++
   include/uapi/rdma/ib_user_verbs.h      |  2 ++
   9 files changed, 81 insertions(+), 12 deletions(-)


<snip>

+/* Guarantee atomicity of atomic write operations at the machine
level. */
+static DEFINE_SPINLOCK(atomic_write_ops_lock);
+
+static enum resp_states process_atomic_write(struct rxe_qp *qp,
+                         struct rxe_pkt_info *pkt)
+{
+    u64 *src, *dst;
+    struct rxe_mr *mr = qp->resp.mr;
+
+    src = payload_addr(pkt);
+
+    dst = iova_to_vaddr(mr, qp->resp.va + qp->resp.offset,
sizeof(u64));
+    if (!dst || (uintptr_t)dst & 7)
+        return RESPST_ERR_MISALIGNED_ATOMIC;
+
+    spin_lock_bh(&atomic_write_ops_lock);
+    *dst = *src;
+    spin_unlock_bh(&atomic_write_ops_lock);

Spinlock protection is insufficient! Using a spinlock protects only
the atomicity of the store operation with respect to another remote
atomic_write operation. But the semantics of RDMA_ATOMIC_WRITE go much
further. The operation requires a fully atomic bus transaction, across
the memory complex and with respect to CPU, PCI, and other sources.
And this guarantee needs to apply to all architectures, including ones
with noncoherent caches (software consistency).

Because RXE is a software provider, I believe the most natural approach
here is to use an atomic64_set(dst, *src). But I'm not certain this
is supported on all the target architectures, therefore it may require
some additional failure checks, such as stricter alignment than testing
(dst & 7), or returning a failure if atomic64 operations are not
available. I especially think the ARM and PPC platforms will need
an expert review.
Hi Tom,

Thanks a lot for the detailed suggestion.

I will try to use atomic64_set() and add additional failure checks.
By the way, process_atomic() uses the same method(spinlock + assignment
expression),
so do you think we also need to update it by using atomic64_set()?

It is *criticial* for you to understand that the ATOMIC_WRITE has a
much stronger semantic than ordinary RDMA atomics.

The ordinary atomics are only atomic from the perspective of a single
connection and a single adapter. And the result is not placed in memory
atomically, only its execution in the adapter is processed that way.
And the final result is only consistently visible to the application
after a completion event is delivered. This rather huge compromise is
because when atomics were first designed, there were no PCI primitives
which supported atomics.

An RDMA atomic_write operation is atomic all the way to memory. It
must to be implemented in a platform operation which is similarly
atomic. If implemented by a PCIe adapter, it would use the newer
PCIe atomics to perform a locked read-modify-write. If implemented
in software as you are doing, it must use a similar local r-m-w
instruction, or the platform equivalent if necessary.

Tom.



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux