From: "Saleem, Shiraz" <shiraz.saleem@xxxxxxxxx> This patch set is aiming to allow drivers that support multiple page sizes to leverage the core umem APIs to obtain suitable HW DMA addresses for the MR, aligned to a supported page size. The APIs accomodates for HW that support single page size or mixed page sizes in an MR. The motivation for this work comes from the discussion in [1]. The first patch modifies current memory registration API ib_umem_get() to combine contiguous regions into SGEs and add them to the scatter table. The second patch introduces a new core API that allows drivers to find the best supported page size to use for this MR, from a bitmap of HW supported page sizes. The third patch introduces new core APIs that iterates through the SG list and returns contiguous memory blocks aligned to a HW supported page size. The fourth patch and fifth patch removes the dependency of i40iw and bnxt_re drivers on the hugetlb flag. The new core APIs are called in these drivers to get huge page size aligned addresses if the MR is backed by huge pages. The sixth patch removes the hugetlb flag from IB core. Please note that mixed page portion of the algorithm and bnxt_re update in patch #5 have not been tested on hardware. [1] https://patchwork.kernel.org/patch/10499753/ RFC-->v0: --------- * Add to scatter table by iterating a limited sized page list. * Updated driver call sites to use the for_each_sg_page iterator variant where applicable. * Tweaked algorithm in ib_umem_find_single_pg_size and ib_umem_next_phys_iter to ignore alignment of the start of first SGE and end of the last SGE. * Simplified ib_umem_find_single_pg_size on offset alignments checks for user-space virtual and physical buffer. * Updated ib_umem_start_phys_iter to do some pre-computation for the non-mixed page support case. * Updated bnxt_re driver to use the new core APIs and remove its dependency on the huge tlb flag. * Fixed a bug in computation of sg_phys_iter->phyaddr in ib_umem_next_phys_iter. * Drop hugetlb flag usage from RDMA subsystem. * Rebased on top of for-next. v0-->v1: -------- * Remove the patches that update driver to use for_each_sg_page variant to iterate in the SGE. This is sent as a seperate series using the for_each_sg_dma_page variant. * Tweak ib_umem_add_sg_table API defintion based on maintainer feedback. * Cache number of scatterlist entries in umem. * Update function headers for ib_umem_find_single_pg_size and ib_umem_next_phys_iter. * Add sanity check on supported_pgsz in ib_umem_find_single_pg_size. Shiraz Saleem (6): RDMA/umem: Combine contiguous PAGE_SIZE regions in SGEs RDMA/umem: Add API to find best driver supported page size in an MR RDMA/umem: Add API to return aligned memory blocks from SGL RDMA/i40iw: Use umem API to retrieve aligned DMA address RDMA/bnxt_re: Use umem APIs to retrieve aligned DMA address RDMA/umem: Remove hugetlb flag drivers/infiniband/core/umem.c | 281 +++++++++++++++++++++++++++--- drivers/infiniband/core/umem_odp.c | 3 - drivers/infiniband/hw/bnxt_re/ib_verbs.c | 28 ++- drivers/infiniband/hw/i40iw/i40iw_user.h | 5 + drivers/infiniband/hw/i40iw/i40iw_verbs.c | 49 +----- drivers/infiniband/hw/i40iw/i40iw_verbs.h | 3 +- include/rdma/ib_umem.h | 50 +++++- 7 files changed, 319 insertions(+), 100 deletions(-) -- 1.8.3.1