This patch set aims to allow drivers that support multiple page sizes
to leverage the core umem APIs to obtain suitable HW DMA addresses for
the MR, aligned to a supported page size. The APIs accommodate HW that
supports a single page size or mixed page sizes in an MR. The motivation
for this work comes from the discussion in [1].

The first patch modifies the current memory registration API
ib_umem_get() to combine contiguous regions into SGEs and add them to
the scatter table. Driver call sites are updated to use the
for_each_sg_page iterator where applicable.

The second patch introduces a new core API that allows drivers to find
the best supported page size to use for this MR, from a bitmap of HW
supported page sizes.

The third patch introduces new core APIs that iterate through the SG
list and return suitable HW DMA addresses aligned to a driver supported
page size.

The fourth and fifth patches remove the dependency of the i40iw and
bnxt_re drivers on the hugetlb flag. The new core APIs are called in
these drivers to get huge page size aligned addresses if the MR is
backed by huge pages.

The sixth patch removes the hugetlb flag from IB core.

Please note that the mixed page portion of the algorithm and the
bnxt_re update in patch #5 have not been tested on hardware.

[1] https://patchwork.kernel.org/patch/10499753/

RFC-->v0:
---------
* Add to scatter table by iterating a limited sized page list.
* Updated driver call sites to use the for_each_sg_page iterator
  variant where applicable.
* Tweaked algorithm in ib_umem_find_single_pg_size and
  ib_umem_next_phys_iter to ignore alignment of the start of the first
  SGE and the end of the last SGE.
* Simplified the offset alignment checks in ib_umem_find_single_pg_size
  for the user-space virtual and physical buffer.
* Updated ib_umem_start_phys_iter to do some pre-computation for the
  non-mixed page support case.
* Updated bnxt_re driver to use the new core APIs and remove its
  dependency on the hugetlb flag.
* Fixed a bug in computation of sg_phys_iter->phyaddr in
  ib_umem_next_phys_iter.
* Drop hugetlb flag usage from RDMA subsystem.
* Rebased on top of for-next.

Shiraz Saleem (6):
  RDMA/umem: Combine contiguous PAGE_SIZE regions in SGEs
  RDMA/umem: Add API to find best driver supported page size in an MR
  RDMA/umem: Add API to return optimal HW DMA addresses from SG list
  RDMA/i40iw: Use umem API to retrieve optimal HW address
  RDMA/bnxt_re: Use umem APIs to retrieve optimal HW address
  RDMA/umem: Remove hugetlb flag

 drivers/infiniband/core/umem.c                 | 260 ++++++++++++++++++++++---
 drivers/infiniband/core/umem_odp.c             |   3 -
 drivers/infiniband/hw/bnxt_re/ib_verbs.c       |  35 ++--
 drivers/infiniband/hw/bnxt_re/qplib_res.c      |   9 +-
 drivers/infiniband/hw/cxgb3/iwch_provider.c    |  27 ++-
 drivers/infiniband/hw/cxgb4/mem.c              |  31 ++-
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c     |   7 +-
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c     |  25 +--
 drivers/infiniband/hw/hns/hns_roce_mr.c        |  88 ++++-----
 drivers/infiniband/hw/i40iw/i40iw_user.h       |   5 +
 drivers/infiniband/hw/i40iw/i40iw_verbs.c      |  58 ++----
 drivers/infiniband/hw/i40iw/i40iw_verbs.h      |   3 +-
 drivers/infiniband/hw/mthca/mthca_provider.c   |  33 ++--
 drivers/infiniband/hw/nes/nes_verbs.c          | 203 +++++++++----------
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c    |  54 +++--
 drivers/infiniband/hw/qedr/verbs.c             |  56 +++---
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_misc.c |  21 +-
 drivers/infiniband/sw/rdmavt/mr.c              |   8 +-
 drivers/infiniband/sw/rxe/rxe_mr.c             |   7 +-
 include/rdma/ib_umem.h                         |  33 +++-
 20 files changed, 541 insertions(+), 425 deletions(-)

-- 
2.8.3
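
For readers who want a feel for the page size selection described for
the second patch, here is a minimal userspace sketch. It is not the
code from this series: struct ex_seg, ex_find_pg_size() and the EX_SZ_*
constants are hypothetical names made up for illustration, and the
alignment rules are an assumption based on the changelog above (the
start of the first SGE and the end of the last SGE are ignored, and the
user virtual address must share its in-page offset with the DMA
address).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical page size constants, used only by this sketch. */
#define EX_SZ_4K 0x1000ULL
#define EX_SZ_2M 0x200000ULL
#define EX_SZ_1G 0x40000000ULL

/* Hypothetical stand-in for one DMA-contiguous SGE. */
struct ex_seg {
	uint64_t dma_addr;	/* DMA address of the first byte */
	uint64_t len;		/* length in bytes */
};

/*
 * Return the largest page size set in 'supported' (one bit per size)
 * that can map every segment, or 0 if no supported size fits.
 * 'va' is the user virtual address of the first byte of segs[0].
 */
static uint64_t ex_find_pg_size(const struct ex_seg *segs, size_t nsegs,
				uint64_t va, uint64_t supported)
{
	uint64_t mask = 0;
	size_t i;
	int bit;

	for (i = 0; i < nsegs; i++) {
		/* VA and DMA address must agree on the in-page offset. */
		mask |= segs[i].dma_addr ^ va;
		/* Interior segment starts must be page aligned. */
		if (i != 0)
			mask |= segs[i].dma_addr;
		/* Interior segment ends must be page aligned. */
		if (i != nsegs - 1)
			mask |= segs[i].dma_addr + segs[i].len;
		va += segs[i].len;
	}

	/* Any set bit of mask below a page size rules that size out. */
	for (bit = 63; bit >= 0; bit--) {
		uint64_t pg = 1ULL << bit;

		if ((supported & pg) && !(mask & (pg - 1)))
			return pg;
	}
	return 0;
}
```

With two 2M-aligned segments and a supported bitmap of 4K, 2M and 1G,
the sketch picks 2M. If an interior boundary is only 4K aligned it
falls back to 4K, and it returns 0 when the bitmap offers nothing small
enough.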