On 2024/9/4 23:30, Michael Guralnik wrote:
This series introduces a new ODP scheme in mlx5 where the FW takes responsibility for parsing page fault data and providing it to the driver, which then handles the fault. This is in contrast to the current ODP transport scheme, where the driver is responsible for reading and parsing work queues and querying mkeys to acquire the info needed to handle the page fault.
The new scheme allows the driver to support ODP over Devx QPs, where the driver cannot access the QP buffers, owned by the user application, to read the work queue requests. Furthermore, the new scheme allows support for ODP with new indirect MKEY types, as the driver doesn't need to query or parse indirect mkeys in this scheme.
The driver will enable the new scheme on devices that have the relevant capabilities. Otherwise, transport scheme ODP will be the default.
The move to memory scheme ODP is transparent to existing ODP applications and no change is needed. New applications that want to take advantage of the new functionality should query which scheme is active and its capabilities using Devx.
On-Demand Paging (ODP) is a technique to alleviate many of the shortcomings of memory registration. Applications no longer need to pin down the underlying physical pages of the address space or track the validity of the mappings. Rather, the HCA requests the latest translations from the OS when pages are not present, and the OS invalidates translations that are no longer valid due to either non-present pages or mapping changes.
As such, it seems that ODP can save memory, because applications do not need to pin down the underlying physical pages of the address space or track the validity of the mappings.
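To make the fault and invalidation flow described above concrete, here is a toy userspace model in C. It is only a sketch of the general ODP idea and is not mlx5 driver code: the `toy_*` names, the fixed page count, and the counters are all hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8 /* hypothetical size of the registered region, in pages */

/* Toy stand-in for an ODP memory region: nothing is mapped up front. */
struct toy_odp_mr {
	bool present[TOY_NPAGES];    /* does the device hold a valid translation? */
	unsigned long faults;        /* translations requested from the "OS" */
	unsigned long invalidations; /* translations dropped by the "OS" */
};

static void toy_mr_init(struct toy_odp_mr *mr)
{
	for (size_t i = 0; i < TOY_NPAGES; i++)
		mr->present[i] = false;
	mr->faults = 0;
	mr->invalidations = 0;
}

/*
 * Device touches a page. If no translation is present, a page fault is
 * taken and the translation is filled in on demand.
 * Returns true if the access had to fault.
 */
static bool toy_device_access(struct toy_odp_mr *mr, size_t page)
{
	assert(page < TOY_NPAGES);
	if (mr->present[page])
		return false;
	mr->faults++;
	mr->present[page] = true;
	return true;
}

/* OS drops a translation, e.g. after the page was reclaimed or remapped. */
static void toy_os_invalidate(struct toy_odp_mr *mr, size_t page)
{
	assert(page < TOY_NPAGES);
	if (mr->present[page]) {
		mr->present[page] = false;
		mr->invalidations++;
	}
}

/* Number of pages the device currently has translations for. */
static size_t toy_mapped_pages(const struct toy_odp_mr *mr)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_NPAGES; i++)
		n += mr->present[i] ? 1 : 0;
	return n;
}
```

In this model only pages that are actually touched ever get a translation, which is the intuition behind the memory saving. The price is a fault on the first access to a page and again after each invalidation, which is why the performance question below matters.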
What is the performance difference with and without ODP enabled? And are there any test results on memory usage?
Also, can ODP be used on mlx5 IB devices, or only on mlx5 RoCEv2 devices?
Thanks, Zhu Yanjun
Michael Guralnik (8):
  net/mlx5: Expand mkey page size to support 6 bits
  net/mlx5: Expose HW bits for Memory scheme ODP
  RDMA/mlx5: Add new ODP memory scheme eqe format
  RDMA/mlx5: Enforce umem boundaries for explicit ODP page faults
  RDMA/mlx5: Split ODP mkey search logic
  RDMA/mlx5: Add handling for memory scheme page fault events
  RDMA/mlx5: Add implicit MR handling to ODP memory scheme
  net/mlx5: Handle memory scheme ODP capabilities

 drivers/infiniband/hw/mlx5/mlx5_ib.h          |  17 +-
 drivers/infiniband/hw/mlx5/mr.c               |  10 +-
 drivers/infiniband/hw/mlx5/odp.c              | 400 ++++++++++++++----
 .../net/ethernet/mellanox/mlx5/core/main.c    |  54 ++-
 include/linux/mlx5/device.h                   |  30 +-
 include/linux/mlx5/mlx5_ifc.h                 |  64 ++-
 6 files changed, 449 insertions(+), 126 deletions(-)