Re: [PATCH RFC 05/12] RDMA/cxgb4: Use for_each_sg_dma_page iterator on umem SGL

On Mon, Jan 28, 2019 at 12:45:26PM -0600, Steve Wise wrote:
> 
> On 1/28/2019 12:29 PM, Jason Gunthorpe wrote:
> > On Sat, Jan 26, 2019 at 11:09:45AM -0600, Steve Wise wrote:
> >>
> >>> From: linux-rdma-owner@xxxxxxxxxxxxxxx <linux-rdma-
> >>> owner@xxxxxxxxxxxxxxx> On Behalf Of Shiraz Saleem
> >>> Sent: Saturday, January 26, 2019 10:59 AM
> >>> To: dledford@xxxxxxxxxx; jgg@xxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx
> >>> Cc: Shiraz, Saleem <shiraz.saleem@xxxxxxxxx>; Steve Wise
> >>> <swise@xxxxxxxxxxx>
> >>> Subject: [PATCH RFC 05/12] RDMA/cxgb4: Use for_each_sg_dma_page
> >>> iterator on umem SGL
> >>>
> >>> From: "Shiraz, Saleem" <shiraz.saleem@xxxxxxxxx>
> >>>
> >>> Use the for_each_sg_dma_page iterator variant to walk the umem
> >>> DMA-mapped SGL and get the page DMA address. This avoids the extra
> >>> loop to iterate pages in the SGE when for_each_sg iterator is used.
> >>>
> >>> Additionally, purge umem->page_shift usage in the driver
> >>> as it's only relevant for ODP MRs. Use system page size and
> >>> shift instead.
> >> Hey Shiraz, Doesn't umem->page_shift allow registering huge pages
> >> efficiently?  IE is umem->page_shift set for the 2MB shift if the memory in
> >> this umem region is from the 2MB huge page pool? 
> > I think long ago this might have been some fevered dream, but it was
> > never implemented and never made any sense. How would the core code
> > know whether the driver supported the CPU's huge page size?
> >
> > Shiraz's version to inject huge pages into the driver is much better.
> 
> The driver advertises the "page sizes" it supports for MR PBLs
> (ib_device_attr.page_size_cap).  For example, cxgb4 hw supports 4K up to
> 128MB.  So if a umem was composed of only huge pages, then the reg code
> could pick a page size that is as big as the huge page size or up to the
> device max supported page size, thus reducing the PBL depth for a given
> MR.  

> There was code for this once upon a time, I thought.  Perhaps it
> was never upstreamed or it was rejected.

I don't know. Nothing in the kernel uses page_size_cap; it is just
passed through to verbs.

I still think Shiraz's version is cleaner. Having the driver break up
and iterate over maximally sized sgl entries makes sense to me. This
way we get the best efficiency on the iommu setup path, especially if
we have drivers with different requirements for start/end alignment,
which hasn't been entirely investigated yet...
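
To make the idea concrete, here is a rough userspace sketch of picking
the largest usable page size for a set of DMA-mapped segments. It is
not kernel code and not taken from the patch: struct seg,
best_page_size() and count_blocks() are made-up names, and it assumes
the strictest alignment rule (every segment start and length must be a
multiple of the page size), which a real driver may be able to relax
for the first and last segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped SGL entry (not struct scatterlist). */
struct seg {
	uint64_t dma_addr;	/* DMA address of the segment start */
	uint64_t len;		/* segment length in bytes */
};

/* Return the highest set bit of v as a value, or 0 if v is 0. */
static uint64_t highest_bit(uint64_t v)
{
	uint64_t r = 0;

	while (v) {
		r = v & (~v + 1);	/* isolate the lowest set bit */
		v &= v - 1;		/* clear it; the last one seen is the highest */
	}
	return r;
}

/*
 * Pick the largest page size set in page_size_cap (a bitmap of supported
 * power-of-two sizes) such that every segment start and length is a
 * multiple of it. Returns 0 if no supported size fits.
 * ASSUMPTION: strict alignment for all segments, including first and last.
 */
static uint64_t best_page_size(const struct seg *segs, int n,
			       uint64_t page_size_cap)
{
	uint64_t bits = 0, low;
	int i;

	assert(segs != NULL || n == 0);
	for (i = 0; i < n; i++)
		bits |= segs[i].dma_addr | segs[i].len;
	if (!bits)
		return 0;

	/* The lowest set bit over all starts and lengths bounds the alignment. */
	low = bits & (~bits + 1);

	/* Keep only supported sizes that are <= that bound, take the largest. */
	return highest_bit(page_size_cap & (low | (low - 1)));
}

/* Number of page-list entries needed when the segments are cut into pgsz blocks. */
static uint64_t count_blocks(const struct seg *segs, int n, uint64_t pgsz)
{
	uint64_t total = 0;
	int i;

	for (i = 0; i < n; i++)
		total += segs[i].len / pgsz;
	return total;
}
```

With a capability bitmap covering 4K through 128MB (bits 12 to 27, an
assumed encoding of the range Steve mentions) and two segments of 4MB
and 2MB that are both 2MB aligned, best_page_size() returns 2MB and
count_blocks() gives 3 entries, versus 1536 entries at 4K.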

Jason


