On Tue, Apr 23, 2019 at 12:13 AM Saleem, Shiraz <shiraz.saleem@xxxxxxxxx> wrote:
>
> >Subject: Re: [PATCH v2 rdma-next 0/5] Introduce a DMA block iterator
> >
> >On Fri, Apr 19, 2019 at 08:43:48AM -0500, Shiraz Saleem wrote:
> >> From: "Shiraz Saleem" <shiraz.saleem@xxxxxxxxx>
> >>
> >> This patch set aims to allow drivers to leverage a new DMA block
> >> iterator to get contiguous, aligned memory blocks within their HW
> >> supported page sizes. The motivation for this work comes from the
> >> discussion in [1].
> >>
> >> The first patch introduces a new umem API that allows drivers to find
> >> the best supported page size to use for the MR, from a bitmap of HW
> >> supported page sizes.
> >>
> >> The second patch introduces a new DMA block iterator that allows
> >> drivers to get aligned DMA addresses within a HW supported page size.
> >>
> >> The third and fourth patches remove the dependency of the i40iw and
> >> bnxt_re drivers on the hugetlb flag. The new core APIs are called in
> >> these drivers to get huge page size aligned addresses if the MR is
> >> backed by huge pages.
> >>
> >> The fifth patch removes the hugetlb flag from IB core.
> >>
> >> Please note that the mixed page portion of the algorithm and the
> >> bnxt_re update in patch #4 have not been tested on hardware.
> >>
> >> [1] https://patchwork.kernel.org/patch/10499753/
> >>
> >> RFC-->v0:
> >> * Add to scatter table by iterating a limited sized page list.
> >> * Updated driver call sites to use the for_each_sg_page iterator
> >>   variant where applicable.
> >> * Tweaked the algorithm in ib_umem_find_single_pg_size and
> >>   ib_umem_next_phys_iter to ignore alignment of the start of the
> >>   first SGE and the end of the last SGE.
> >> * Simplified the ib_umem_find_single_pg_size offset alignment checks
> >>   for the user-space virtual and physical buffer.
> >> * Updated ib_umem_start_phys_iter to do some pre-computation
> >>   for the non-mixed page support case.
> >> * Updated the bnxt_re driver to use the new core APIs and remove its
> >>   dependency on the hugetlb flag.
> >> * Fixed a bug in the computation of sg_phys_iter->phyaddr in
> >>   ib_umem_next_phys_iter.
> >> * Dropped hugetlb flag usage from the RDMA subsystem.
> >> * Rebased on top of for-next.
> >>
> >> v0-->v1:
> >> * Removed the patches that update drivers to use the for_each_sg_page
> >>   variant to iterate the SGEs. This is sent as a separate series using
> >>   the for_each_sg_dma_page variant.
> >> * Tweaked the ib_umem_add_sg_table API definition based on maintainer
> >>   feedback.
> >> * Cache the number of scatterlist entries in the umem.
> >> * Updated function headers for ib_umem_find_single_pg_size and
> >>   ib_umem_next_phys_iter.
> >> * Added a sanity check on supported_pgsz in ib_umem_find_single_pg_size.
> >>
> >> v1-->v2:
> >> * Removed the page combining patch as it was sent stand-alone.
> >> * __fls on pgsz_bitmap as opposed to fls64 since it's an unsigned long.
> >> * Renamed ib_umem_find_pg_bit() --> rdma_find_pg_bit() and moved it
> >>   to ib_verbs.h.
> >> * Renamed ib_umem_find_single_pg_size() --> ib_umem_find_best_pgsz().
> >> * New flag IB_UMEM_VA_BASED_OFFSET for the ib_umem_find_best_pgsz API,
> >>   for HW that uses the least significant bits of the VA to indicate
> >>   the start offset into the DMA list.
> >> * rdma_find_pg_bit() logic is re-written and simplified. It can
> >>   support input of 0 or 1 DMA addr cases.
> >> * ib_umem_find_best_pgsz() optimized to be less computationally
> >>   expensive by running rdma_find_pg_bit() only once.
> >> * rdma_for_each_block() is the new re-designed DMA block iterator,
> >>   which is more in line with the for_each_sg_dma_page() iterator.
> >> * rdma_find_mixed_pg_bit() logic for interior SGEs accounting for the
> >>   start and end DMA address.
> >> * Removed i40iw specific enums for supported page sizes.
> >> * Removed vma_list from ib_umem_get().
> >
> >Gal? Does this work for you now?
> >
> >At this point this only impacts two drivers that are presumably tested
> >by their authors, so I'd like to merge it to finally get rid of the
> >hugetlb flag.. But EFA will need to use it too
> >
>
> Selvin - It would be good if you could retest this version of the series
> with bnxt_re too, since there were some design changes to core algorithms.
>
> Shiraz
>
>
Series tested with bnxt_re. Looks good with my testing.
Tested-by: Selvin Xavier <selvin.xavier@xxxxxxxxxxxx>
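
For anyone wiring these helpers into another driver (e.g. EFA, as Jason
mentions), here is a rough sketch of the usage pattern as I read the cover
letter: pick the best HW-supported page size for the MR with
ib_umem_find_best_pgsz(), then walk the umem in blocks of that size with
rdma_for_each_block(). The exact signatures and umem field names are my
reading of the series and may not match the posted v2 precisely; the
page-size bitmap (SZ_4K | SZ_2M) and the pbl[] destination array are made
up purely for illustration.

#include <linux/sizes.h>
#include <linux/types.h>
#include <rdma/ib_umem.h>
#include <rdma/ib_verbs.h>

/* Hypothetical driver helper: fill a HW page list for an MR backed by umem. */
static int example_build_page_list(struct ib_umem *umem, u64 virt_addr,
				   u64 *pbl, u32 pbl_cnt)
{
	struct ib_block_iter biter;
	unsigned long pg_sz;
	u32 i = 0;

	/*
	 * Pick the largest page size this MR can be mapped with, out of
	 * the sizes the HW supports (4K and 2M assumed here for the example).
	 * A return of 0 means no supported size fits this MR.
	 */
	pg_sz = ib_umem_find_best_pgsz(umem, SZ_4K | SZ_2M, virt_addr);
	if (!pg_sz)
		return -EINVAL;

	/*
	 * Walk the umem scatterlist in pg_sz-aligned blocks and record the
	 * aligned DMA address of each block in the HW page list.
	 */
	rdma_for_each_block(umem->sg_head.sgl, &biter, umem->nmap, pg_sz) {
		if (i >= pbl_cnt)
			return -ENOMEM;
		pbl[i++] = rdma_block_iter_dma_address(&biter);
	}

	return 0;
}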