From: Leon Romanovsky <leonro@xxxxxxxxxxxx>

Changelog v0->v1:
 * Fixed commit message in patch 2
 * Removed redundant brackets
 * Add FIXME comment
 * Flush workqueue to ensure no work is executed during ib_device dereg
 * Change declaration of sg_list to be flex array
 * Fix rebase error

---------------------
Hi,

In this series from Moni, we are implementing the new advise_mr() verb,
which was proposed as RFC [1].

The verb advise_mr() borrows its definition from the system call madvise()
by giving advice to the driver about an address range that belongs to
a memory region (MR), in contrast to madvise(), which operates on addresses
and has different logical semantics not suitable for MRs.

This verb is used by applications to tell the kernel about expected memory
usage so that the kernel can prepare the memory in advance, prior to any
subsequent use. As with madvise(), the advise_mr() verb does not change
the semantics of the application, but can improve application performance.
Being advice, advise_mr() calls may be ignored by the kernel.

An important example of such a performance hint is partial pre-fetching of
ODP MRs. Pre-fetching an ODP address range ensures that the range is present
before the actual IO is conducted. This provides a way to reduce latency by
overlapping paging-in with either compute time or IO to other ranges.

Thanks

[1] https://www.spinics.net/lists/linux-rdma/msg70592.html

---
This series has a merge conflict with commit
4d5422a309de ("IB/mlx5: Skip non-ODP MR when handling a page fault")
in rdma-rc. The resolution is as follows:
1. It is an error to ask for "prefetch" on non-ODP MRs, because the
   prefetch comes from an explicit request.
2. It is OK to have non-ODP MRs in page-faults.

+	if (prefetch && !mr->umem->is_odp) {
+		ret = -EINVAL;
+		goto srcu_unlock;
+	}
+
+	if (!mr->umem->is_odp) {
+		mlx5_ib_dbg(dev, "skipping non ODP MR (lkey=0x%06x) in page fault handler.\n",
+			    key);
+		if (bytes_mapped)
+			*bytes_mapped += bcnt;
+		ret = 0;
+		goto srcu_unlock;
+	}

Moni Shoua (3):
  IB/uverbs: Add helper to get array size from ptr attribute
  IB/uverbs: Add support to advise_mr
  IB/mlx5: Add advise_mr() support

 drivers/infiniband/core/uverbs_std_types_mr.c |  56 ++++++++
 drivers/infiniband/hw/mlx5/flow.c             |  12 +-
 drivers/infiniband/hw/mlx5/main.c             |   8 ++
 drivers/infiniband/hw/mlx5/mlx5_ib.h          |  18 +++
 drivers/infiniband/hw/mlx5/mr.c               |  15 +++
 drivers/infiniband/hw/mlx5/odp.c              | 120 ++++++++++++++++--
 include/rdma/ib_verbs.h                       |   6 +
 include/rdma/uverbs_ioctl.h                   |  23 ++++
 include/uapi/rdma/ib_user_ioctl_cmds.h        |   8 ++
 include/uapi/rdma/ib_user_ioctl_verbs.h       |   9 ++
 10 files changed, 259 insertions(+), 16 deletions(-)

--
2.19.1