> Subject: Re: [PATCH v2 rdma-next 0/5] Introduce a DMA block iterator
>
> On Fri, Apr 19, 2019 at 08:43:48AM -0500, Shiraz Saleem wrote:
>> From: "Shiraz Saleem" <shiraz.saleem@xxxxxxxxx>
>>
>> This patch set aims to allow drivers to leverage a new DMA block
>> iterator to get contiguous aligned memory blocks within their HW
>> supported page sizes. The motivation for this work comes from the
>> discussion in [1].
>>
>> The first patch introduces a new umem API that allows drivers to find
>> the best supported page size to use for the MR, from a bitmap of HW
>> supported page sizes.
>>
>> The second patch introduces a new DMA block iterator that allows
>> drivers to get aligned DMA addresses within a HW supported page size.
>>
>> The third and fourth patches remove the dependency of the i40iw and
>> bnxt_re drivers on the hugetlb flag. The new core APIs are called in
>> these drivers to get huge page size aligned addresses if the MR is
>> backed by huge pages.
>>
>> The fifth patch removes the hugetlb flag from IB core.
>>
>> Please note that the mixed page portion of the algorithm and the
>> bnxt_re update in patch #4 have not been tested on hardware.
>>
>> [1] https://patchwork.kernel.org/patch/10499753/
>>
>> RFC-->v0:
>> * Add to scatter table by iterating a limited sized page list.
>> * Updated driver call sites to use the for_each_sg_page iterator
>>   variant where applicable.
>> * Tweaked algorithm in ib_umem_find_single_pg_size and
>>   ib_umem_next_phys_iter to ignore alignment of the start of the
>>   first SGE and the end of the last SGE.
>> * Simplified the offset alignment checks in ib_umem_find_single_pg_size
>>   for the user-space virtual and physical buffers.
>> * Updated ib_umem_start_phys_iter to do some pre-computation
>>   for the non-mixed page support case.
>> * Updated the bnxt_re driver to use the new core APIs and remove its
>>   dependency on the hugetlb flag.
>> * Fixed a bug in the computation of sg_phys_iter->phyaddr in
>>   ib_umem_next_phys_iter.
>> * Drop hugetlb flag usage from the RDMA subsystem.
>> * Rebased on top of for-next.
>>
>> v0-->v1:
>> * Remove the patches that update drivers to use the for_each_sg_page
>>   variant to iterate the SGEs. These were sent as a separate series
>>   using the for_each_sg_dma_page variant.
>> * Tweak the ib_umem_add_sg_table API definition based on maintainer
>>   feedback.
>> * Cache the number of scatterlist entries in umem.
>> * Update function headers for ib_umem_find_single_pg_size and
>>   ib_umem_next_phys_iter.
>> * Add a sanity check on supported_pgsz in ib_umem_find_single_pg_size.
>>
>> v1-->v2:
>> * Removed the page combining patch as it was sent standalone.
>> * __fls on pgsz_bitmap as opposed to fls64, since it's an unsigned
>>   long.
>> * Renamed ib_umem_find_pg_bit() --> rdma_find_pg_bit() and moved it
>>   to ib_verbs.h.
>> * Renamed ib_umem_find_single_pg_size() --> ib_umem_find_best_pgsz().
>> * New flag IB_UMEM_VA_BASED_OFFSET for the ib_umem_find_best_pgsz API,
>>   for HW that uses the least significant bits of the VA to indicate
>>   the start offset into the DMA list.
>> * rdma_find_pg_bit() logic is re-written and simplified. It can
>>   support the 0 or 1 DMA address input cases.
>> * ib_umem_find_best_pgsz() optimized to be less computationally
>>   expensive, running rdma_find_pg_bit() only once.
>> * rdma_for_each_block() is the new re-designed DMA block iterator,
>>   more in line with the for_each_sg_dma_page() iterator.
>> * rdma_find_mixed_pg_bit() logic for interior SGEs, accounting for
>>   the start and end DMA addresses.
>> * Removed i40iw-specific enums for supported page sizes.
>> * Removed vma_list from ib_umem_get().
>
> Gal? Does this work for you now?
>
> At this point this only impacts two drivers that are presumably tested
> by their authors, so I'd like to merge it to finally get rid of the
> hugetlb flag. But EFA will need to use it too.

Selvin - It would be good if you could retest this version of the
series with bnxt_re too, since there were some design changes to the
core algorithms.

Shiraz
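P.S. For EFA or any other driver picking these APIs up, the intended
usage is roughly the sketch below. This is illustrative only: my_dev,
my_mr, and the pg_sz_bitmap field are made-up driver-side names, and
the exact ib_umem_find_best_pgsz() prototype may differ from what is
shown here (e.g. the IB_UMEM_VA_BASED_OFFSET flag is not passed).

#include <rdma/ib_umem.h>	/* ib_umem_find_best_pgsz() */
#include <rdma/ib_verbs.h>	/* rdma_for_each_block() */

/* Made-up driver structures, for illustration only. */
struct my_dev {
	unsigned long pg_sz_bitmap;	/* bitmap of HW-supported page sizes */
};

struct my_mr {
	dma_addr_t *pages;		/* HW page list to program */
	unsigned long page_size;
};

static int my_mr_fill_page_list(struct my_dev *dev, struct my_mr *mr,
				struct ib_umem *umem, u64 virt)
{
	struct ib_block_iter biter;
	unsigned long pg_sz;
	int i = 0;

	/*
	 * Pick the best (largest) HW-supported page size that can map
	 * this MR, given the user VA and the umem's physical layout.
	 */
	pg_sz = ib_umem_find_best_pgsz(umem, dev->pg_sz_bitmap, virt);
	if (!pg_sz)
		return -EINVAL;

	/*
	 * Walk the umem scatterlist in pg_sz-aligned blocks; each
	 * iteration yields the DMA address of one aligned block.
	 */
	rdma_for_each_block(umem->sg_head.sgl, &biter, umem->nmap, pg_sz)
		mr->pages[i++] = rdma_block_iter_dma_address(&biter);

	mr->page_size = pg_sz;
	return 0;
}

The i40iw and bnxt_re conversions in patches 3 and 4 are the real call
sites to cross-check against.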