Re: [PATCH RFC 3/4] RDMA/umem: Add API to return optimal HW DMA addresses from SG list

Shiraz Saleem <shiraz.saleem@xxxxxxxxx> · Tue, 30 Oct 2018 18:25:25 -0500

On Thu, Oct 25, 2018 at 08:40:29PM -0600, Jason Gunthorpe wrote:
> On Thu, Oct 25, 2018 at 05:20:43PM -0500, Shiraz Saleem wrote:
> > > > +
> > > > +/**
> > > > + * ib_umem_next_phys_iter - SG list iterator that returns aligned HW address
> > > > + * @umem: umem struct
> > > > + * @sg_phys_iter: SG HW address iterator
> > > > + * @supported_pgsz: bitmask of HW supported page sizes
> > > > + *
> > > > + * This helper iterates over the SG list and returns the HW
> > > > + * address aligned to a supported HW page size.
> > > > + *
> > > > + * The algorithm differs slightly between HW that supports single
> > > > + * page sizes vs mixed page sizes in an MR. For example, if an
> > > > + * MR of size 4M-4K, starts at an offset PAGE_SIZE (ex: 4K) into
> > > > + * a 2M page; HW that supports multiple page sizes (ex: 4K, 2M)
> > > > + * would get 511 4K pages and one 2M page. Single page support
> > > > + * HW would get back two 2M pages or 1023 4K pages.
> > > 
> > > That doesn't seem quite right, the first and last pages should always
> > > be the largest needed to get to alignment, as HW always has an
> > > offset/length sort of scheme to make this efficient.
> > > 
> > > So I would expect 4M-4k to always return two 2M pages if the HW
> > > supports 2M, even if it supports smaller sizes.
> > > 
> > > For this to work the algorithm needs to know start/end...
> > > 
> > > Maybe we should not support multiple page sizes yet, do we have HW
> > > that can even do that?
> > 
> > 
> > Sorry. I think my comment is confusing.
> > There is a typo in my description. It's not multiple page size but mixed
> > page size. In which case, I think it should be 511 4K and one 2M page
> > for my example of 4M-4k. This is how I understood it based on your
> > description here.
> > https://patchwork.kernel.org/patch/10499753/#22186847
> 
> It depends if the HW can do the start/end offset thing, ie if you can
> start and end on a partial page size.
> 
> Since IB requires this I assume that all HW can do it for all page
> sizes, and we can pretty much ignore the alignment of the start of the
> first sgl and the end of the last sgl.
> 

OK. I will tweak the algorithm accordingly.
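
As a sanity check on the numbers in the 4M-4K example, here is a minimal
sketch of the page-count arithmetic when the HW accepts a start offset and
length, so the first and last pages may be partial. num_hw_pages() is a
hypothetical helper for illustration only, not part of this patch, and it
assumes pgsz is a power of two.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): number of HW pages of size
 * pgsz needed to cover the byte range [start, start + len).
 * Assumes pgsz is a power of two and len > 0. The first and last pages
 * may be only partially covered by the range, on the assumption that
 * the HW takes an offset/length to describe that.
 */
static uint64_t num_hw_pages(uint64_t start, uint64_t len, uint64_t pgsz)
{
	uint64_t mask = pgsz - 1;
	uint64_t first = start & ~mask;                 /* round start down */
	uint64_t last = (start + len + mask) & ~mask;   /* round end up */

	return (last - first) / pgsz;
}
```

With start = 4K and len = 4M-4K, this gives two pages for pgsz = 2M and
1023 pages for pgsz = 4K, which matches the single page size case in the
comment above.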