On 2024/9/4 16:27, Edward Srouji wrote:
On 9/4/2024 9:02 AM, Zhu Yanjun wrote:
On 2024/9/3 19:37, Leon Romanovsky wrote:
From: Leon Romanovsky <leonro@xxxxxxxxxx>
Hi,
This series from Edward introduces the mlx5 data direct placement (DDP)
feature.
This feature allows WRs on the receiver side of the QP to be consumed
out of order, permitting the sender side to transmit messages without
guaranteeing arrival order on the receiver side.
When enabled, the completion ordering of WRs remains in-order,
regardless of the Receive WRs consumption order.
RDMA Read and RDMA Atomic operations on the responder side continue to
be executed in-order, while the ordering of data placement for RDMA
Write and Send operations is not guaranteed.
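The ordering rule described above (receive WRs may be consumed in any
order, yet completions are still reported in order) can be illustrated
with a toy model. This is only a sketch of the semantics as stated in
the cover letter, not the mlx5 implementation; the class
ReceiveQueueModel and its methods are hypothetical names invented for
this illustration.

```python
from collections import deque


class ReceiveQueueModel:
    """Toy model (hypothetical, not mlx5 code): receive WRs can be
    consumed in any order, but completions are reported in posting order."""

    def __init__(self):
        self._posted = deque()  # WR ids in posting order, not yet completed
        self._consumed = set()  # WR ids whose data has already been placed

    def post_recv(self, wr_id):
        """Post a receive WR; posting order defines completion order."""
        self._posted.append(wr_id)

    def consume(self, wr_id):
        """Mark a posted WR as consumed, regardless of its position."""
        if wr_id not in self._posted or wr_id in self._consumed:
            raise ValueError(f"WR {wr_id} is not available for consumption")
        self._consumed.add(wr_id)

    def poll_completions(self):
        """Return completions for the longest consumed prefix of the queue."""
        done = []
        while self._posted and self._posted[0] in self._consumed:
            wr_id = self._posted.popleft()
            self._consumed.discard(wr_id)
            done.append(wr_id)
        return done
```

In this model, if WRs 0..3 are posted and WR 2 is consumed first, no
completion is visible until WR 0 is consumed; WR 2 then completes only
after WR 1 has completed.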
It is an interesting feature. If I understand it correctly, this
feature lets the user consume data out of order for RDMA Write
and Send operations, while the completion ordering remains in order.
Correct.
In what scenarios can this feature be applied, and what benefits does
it bring?
I am just curious about this. Normally users consume the data
in order. In what scenario would the user consume the data out of order?
One of the main benefits of this feature is achieving higher bandwidth
(BW) by allowing responders to receive packets out of order (OOO).
For example, this can be utilized in devices that support multi-plane
functionality, as introduced in the "Multi-plane support for mlx5"
series [1]. When mlx5 multi-plane is supported, a single logical mlx5
port aggregates multiple physical plane ports. In this scenario, the
requester can "spray" packets across the multiple physical plane ports
without guaranteeing packet order, either on the wire or on the
receiver (responder) side.
With this approach, no barriers or fences are required to ensure
in-order packet reception, which optimizes the data path for
performance. This can result in better BW, theoretically achieving
line-rate performance equivalent to the sum of the maximum BW of all
physical plane ports, with only one QP.
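The spraying idea and the theoretical bandwidth bound can be sketched
as follows. This is only an illustration under stated assumptions: the
round-robin policy, the function names spray and aggregate_bw_gbps, and
the example figures (4 planes at 100 Gb/s each) are all made up for the
sketch; the thread does not say how the device actually distributes
packets.

```python
def spray(num_packets, num_planes):
    """Assign packet sequence numbers to planes round-robin.

    Round-robin is an assumed policy chosen for illustration; the
    actual distribution done by the device is not described here.
    """
    planes = [[] for _ in range(num_planes)]
    for psn in range(num_packets):
        planes[psn % num_planes].append(psn)
    return planes


def aggregate_bw_gbps(plane_bw_gbps):
    """Theoretical upper bound for one QP: sum of per-plane maximum BW."""
    return sum(plane_bw_gbps)


# Hypothetical example: 8 packets over 4 planes of 100 Gb/s each.
planes = spray(8, 4)
bound = aggregate_bw_gbps([100, 100, 100, 100])
```

Since each plane delivers its share independently, packets from
different planes can interleave arbitrarily at the responder, which is
why OOO reception is needed to approach the summed bound with one QP.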
Thanks a lot for your quick reply. Without the need to ensure in-order
packet reception, this does optimize the data path for performance.
I agree with you.
But how does the receiver efficiently get the correct data from the
out-of-order packets?
Is the method implemented in software or hardware?
I am just interested in this feature and want to know more about it.
Thanks,
Zhu Yanjun
[1] https://lore.kernel.org/lkml/cover.1718553901.git.leon@xxxxxxxxxx/
Thanks,
Zhu Yanjun
Thanks
Edward Srouji (2):
net/mlx5: Introduce data placement ordering bits
RDMA/mlx5: Support OOO RX WQE consumption
drivers/infiniband/hw/mlx5/main.c | 8 +++++
drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 +
drivers/infiniband/hw/mlx5/qp.c | 51 +++++++++++++++++++++++++---
include/linux/mlx5/mlx5_ifc.h | 24 +++++++++----
include/uapi/rdma/mlx5-abi.h | 5 +++
5 files changed, 78 insertions(+), 11 deletions(-)
--
Best Regards,
Yanjun.Zhu