[PATCH for-next 00/24] IB/hfi1: TID RDMA

Hi Doug and Jason,

Here is the TID RDMA series I had mentioned back at OFA. This represents a lot
of hard work by a number of people. I asked Kaike to provide a cover letter with
some background on what TID RDMA is, and I'll paste it here. We'd like this to go
in for 4.19, but it is a lot of code to review, so I won't be surprised if it has
to be pushed off another cycle.

Omni-Path TID RDMA Feature

Intel Omni-Path (OPA) TID RDMA support is a feature that accelerates data
movement between two OPA nodes through the IB Verbs interface. It improves
RDMA READ/WRITE performance by delivering the data payload to a user
buffer directly without any software copying.

Architecture
=============
The TID RDMA protocol is implemented on the hfi1 driver level and is
therefore transparent to the ULPs. It is designed to facilitate the data
transactions for two specific RDMA requests:
  - RDMA READ;
  - RDMA WRITE.
Without TID RDMA, when a verbs data packet is received at the destination
(requester side for RDMA READ and responder side for RDMA WRITE), the data
payload is copied to the user buffer by software, which slows down large
requests significantly.

Internally, hfi1 converts qualified RDMA READ/WRITE requests into TID
RDMA READ/WRITE requests when the requests are posted to the hfi1
driver. Non-qualified RDMA requests are handled by the normal RDMA protocol.

For TID RDMA requests, hardware resources (hardware flow and TID entries)
are allocated on the destination side (the requester side for TID RDMA
READ and the responder side for TID RDMA WRITE). The information for
these resources is conveyed to the data source side (the responder side
for TID RDMA READ and the requester side for TID RDMA WRITE) and embedded
in data packets. When data packets are received by the destination,
hardware delivers the data payload to the destination buffer without
involving software, which improves performance.
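
The source/destination roles described above can be summed up in a few lines
of C. This is only an illustrative sketch: the enum and function names are
made up for this note and are not taken from the hfi1 driver.

```c
/* Hypothetical names for illustration only; not hfi1 driver identifiers. */
enum sketch_rdma_op { SKETCH_RDMA_READ, SKETCH_RDMA_WRITE };
enum sketch_side { SKETCH_REQUESTER, SKETCH_RESPONDER };

/*
 * The destination is the side that receives the data payload. Per the
 * description above, it is also the side that allocates the hardware flow
 * and TID entries.
 */
static enum sketch_side sketch_destination(enum sketch_rdma_op op)
{
	return op == SKETCH_RDMA_READ ? SKETCH_REQUESTER : SKETCH_RESPONDER;
}

/* The data source is always the opposite side of the destination. */
static enum sketch_side sketch_source(enum sketch_rdma_op op)
{
	return sketch_destination(op) == SKETCH_REQUESTER ?
		SKETCH_RESPONDER : SKETCH_REQUESTER;
}
```

So for a READ the requester allocates resources and the responder sends the
data; for a WRITE the roles are swapped.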

Details
=======
RDMA READ/WRITE requests are qualified by the following:
  - Total data length >= 256K;
  - Total data length is a multiple of 4K pages.

Additional qualifications are enforced for the destination buffers:
  For RDMA READ:
    - Each destination sge buffer is 4K aligned;
    - Each destination sge buffer is a multiple of 4K pages.
  For RDMA WRITE:
    - The destination buffer is 4K aligned.
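
As a rough user-space sketch of the qualification rules listed above (not the
driver code), the checks could look like the following. The struct, the
constants and the function names are all hypothetical, and the WRITE rule is
read here as "the destination buffer address is 4K aligned", which is an
assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TID_MIN_LEN (256u * 1024u) /* 256K threshold from the list above */
#define SKETCH_TID_PAGE    4096u          /* 4K page */

/* Hypothetical stand-in for a scatter/gather element; not the driver's type. */
struct sketch_sge {
	uint64_t addr;
	uint32_t length;
};

/* Common rule: total length >= 256K and a multiple of 4K. */
static bool sketch_len_qualifies(uint64_t total_len)
{
	return total_len >= SKETCH_TID_MIN_LEN &&
	       (total_len % SKETCH_TID_PAGE) == 0;
}

/* RDMA READ: every destination sge is 4K aligned and a multiple of 4K long. */
static bool sketch_read_qualifies(const struct sketch_sge *sge, size_t num_sge)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < num_sge; i++) {
		if (sge[i].addr % SKETCH_TID_PAGE)
			return false;
		if (sge[i].length % SKETCH_TID_PAGE)
			return false;
		total += sge[i].length;
	}
	return sketch_len_qualifies(total);
}

/* RDMA WRITE: destination address is 4K aligned (assumed reading of the rule). */
static bool sketch_write_qualifies(uint64_t dest_addr, uint64_t total_len)
{
	return sketch_len_qualifies(total_len) &&
	       (dest_addr % SKETCH_TID_PAGE) == 0;
}
```

A request that fails any of these checks would simply stay on the normal RDMA
path, as described in the Architecture section.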

In addition, in an OPA fabric, some nodes may support TID RDMA while
others may not. As such, it is important for the two nodes in a transaction
to exchange information about the features they support. This discovery
mechanism is called OPA Feature Negotiation (OPFN) and is described in
detail in the patch series. Through OPFN, two nodes can find out whether
they both support TID RDMA and subsequently convert RDMA requests into
TID RDMA requests.
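
The outcome of that negotiation can be pictured as an intersection of
capability masks. The sketch below only shows the idea; the bit value and all
names are hypothetical, and the actual OPFN exchange and parameter format are
defined by the patches in this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bit; the real bit layout comes from the patches. */
#define SKETCH_CAP_TID_RDMA (1u << 0)

/* Hypothetical per-node view of advertised features. */
struct sketch_node {
	uint32_t caps;
};

/* A feature is usable only if both sides advertise it. */
static uint32_t sketch_negotiate(uint32_t local_caps, uint32_t remote_caps)
{
	return local_caps & remote_caps;
}

/* RDMA requests may be converted to TID RDMA only when both nodes support it. */
static bool sketch_use_tid_rdma(const struct sketch_node *local,
				const struct sketch_node *remote)
{
	return (sketch_negotiate(local->caps, remote->caps) &
		SKETCH_CAP_TID_RDMA) != 0;
}
```

If either side lacks the capability, requests between the two nodes would
continue to use the normal RDMA protocol.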

---

Kaike Wan (19):
      IB/hfi1: Add OPFN and TID RDMA capability bits
      IB/hfi1: Defines for TID RDMA RcvArray programming and TID allocation
      IB/hfi: Move RC functions into a header file
      IB/hfi1: Integrate TID RDMA READ protocol into RC protocol
      IB/hfi1: Add TID RDMA READ functions
      IB/hfi1: TID RDMA flow allocation
      IB/hfi1: TID RDMA RcvArray programming and TID allocation
      IB/hfi1: Add wait mechanism for TID allocation
      IB/hfi1: Add the counter n_tidwait
      IB/hfi1: Prepare resource waits for dual leg
      IB/hfi1: Add the dual leg stub code
      IB/{hfi1, rdmavt}: Allow for extra entries in QP's s_ack_queue
      IB/hfi1: Add a s_acked_ack_queue pointer
      IB/hfi1: Add TID RDMA WRITE functionality into RDMA verbs
      IB/hfi1: Add TID RDMA WRITE functions
      IB/hfi1: Add KDETH eflags handler
      IB/hfi1: Add interlock between a TID RDMA request and other requests
      IB/hfi1: Enable TID RDMA protocol
      IB/hfi1: Add static trace for TID RDMA protocol

Mike Marciniszyn (2):
      IB/hfi1: Add field to reference the rcd from the QP priv struct
      IB/hfi1: OPFN parameter negotiation

Mitko Haralanov (3):
      IB/hfi1: Add TID RDMA files
      IB/hfi1: OPFN support discovery
      IB/hfi1: Add TID RDMA handlers


 drivers/infiniband/hw/hfi1/Makefile       |    4 
 drivers/infiniband/hw/hfi1/chip.c         |   13 
 drivers/infiniband/hw/hfi1/chip.h         |    4 
 drivers/infiniband/hw/hfi1/common.h       |   23 
 drivers/infiniband/hw/hfi1/driver.c       |   58 
 drivers/infiniband/hw/hfi1/hfi.h          |   21 
 drivers/infiniband/hw/hfi1/init.c         |   15 
 drivers/infiniband/hw/hfi1/iowait.c       |  136 +
 drivers/infiniband/hw/hfi1/iowait.h       |  245 +
 drivers/infiniband/hw/hfi1/opfn.c         |  345 ++
 drivers/infiniband/hw/hfi1/opfn.h         |  125 +
 drivers/infiniband/hw/hfi1/qp.c           |  140 +
 drivers/infiniband/hw/hfi1/qp.h           |   38 
 drivers/infiniband/hw/hfi1/rc.c           | 1147 +++++-
 drivers/infiniband/hw/hfi1/rc.h           |   93 
 drivers/infiniband/hw/hfi1/ruc.c          |   59 
 drivers/infiniband/hw/hfi1/sdma.c         |   52 
 drivers/infiniband/hw/hfi1/sdma.h         |    8 
 drivers/infiniband/hw/hfi1/sdma_txreq.h   |    3 
 drivers/infiniband/hw/hfi1/tid_rdma.c     | 5563 +++++++++++++++++++++++++++++
 drivers/infiniband/hw/hfi1/tid_rdma.h     |  343 ++
 drivers/infiniband/hw/hfi1/trace.c        |  120 +
 drivers/infiniband/hw/hfi1/trace.h        |    4 
 drivers/infiniband/hw/hfi1/trace_dbg.h    |    2 
 drivers/infiniband/hw/hfi1/trace_ibhdrs.h |   10 
 drivers/infiniband/hw/hfi1/trace_iowait.h |   96 +
 drivers/infiniband/hw/hfi1/trace_rc.h     |   50 
 drivers/infiniband/hw/hfi1/trace_rx.h     |  114 -
 drivers/infiniband/hw/hfi1/trace_tid.h    | 1618 ++++++++
 drivers/infiniband/hw/hfi1/trace_tx.h     |   20 
 drivers/infiniband/hw/hfi1/uc.c           |    3 
 drivers/infiniband/hw/hfi1/user_exp_rcv.h |    4 
 drivers/infiniband/hw/hfi1/user_sdma.c    |   18 
 drivers/infiniband/hw/hfi1/verbs.c        |  282 +
 drivers/infiniband/hw/hfi1/verbs.h        |  105 +
 drivers/infiniband/hw/hfi1/verbs_txreq.h  |   11 
 drivers/infiniband/hw/hfi1/vnic_sdma.c    |   21 
 drivers/infiniband/hw/qib/qib_rc.c        |    7 
 drivers/infiniband/hw/qib/qib_verbs.c     |   11 
 drivers/infiniband/hw/qib/qib_verbs.h     |    6 
 drivers/infiniband/sw/rdmavt/qp.c         |   55 
 drivers/infiniband/sw/rdmavt/rc.c         |   19 
 include/rdma/ib_hdrs.h                    |   14 
 include/rdma/rdma_vt.h                    |   53 
 include/rdma/rdmavt_qp.h                  |   13 
 include/rdma/tid_rdma_defs.h              |  150 +
 include/uapi/rdma/hfi/hfi1_user.h         |    6 
 47 files changed, 10622 insertions(+), 625 deletions(-)
 create mode 100644 drivers/infiniband/hw/hfi1/iowait.c
 create mode 100644 drivers/infiniband/hw/hfi1/opfn.c
 create mode 100644 drivers/infiniband/hw/hfi1/opfn.h
 create mode 100644 drivers/infiniband/hw/hfi1/rc.h
 create mode 100644 drivers/infiniband/hw/hfi1/tid_rdma.c
 create mode 100644 drivers/infiniband/hw/hfi1/tid_rdma.h
 create mode 100644 drivers/infiniband/hw/hfi1/trace_iowait.h
 create mode 100644 drivers/infiniband/hw/hfi1/trace_tid.h
 create mode 100644 include/rdma/tid_rdma_defs.h

--
-Denny