On Wed, Jan 02, 2019 at 04:51:24PM -0700, Jason Gunthorpe wrote: > On Mon, Dec 24, 2018 at 04:32:22PM -0600, Shiraz Saleem wrote: > > Combine contiguous regions of PAGE_SIZE pages > > into single scatter list entries while adding > > to the scatter table. This minimizes the number > > of the entries in the scatter list and reduces > > the DMA mapping overhead, particularly with the > > IOMMU. > > > > Additionally, update driver call sites to use the > > for_each_sg_page variant to iterate through the > > pages in the SGE. In certain call sites, there is > > the assumption of single SGE to page mapping and > > hence the iterator variant is required. In others, > > it is simpler to use the iterator variant. > > [..] > > static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty) > > { > > - struct scatterlist *sg; > > + struct sg_page_iter sg_iter; > > struct page *page; > > - int i; > > + int nents = sg_nents(umem->sg_head.sgl); > > I wonder if we should cache sg_nents in the umem? We compute it twice > and since this now directly adds each sgl we know the nents > naturally.. Ah! So your saying, compute nents while we add to the scatter table and cache in umem as opposed to using this helper in both places? > > +/* ib_umem_add_sg_table - Add N contiguous pages to scatter table > > + * > > + * cur: current scatterlist entry > > + * nxt: next scatterlist entry > > + * page_list: array of npage struct page pointers > > + * npages: number of pages in page_list > > + */ > > +static void ib_umem_add_sg_table(struct scatterlist **cur, > > + struct scatterlist **nxt, > > + struct page **page_list, > > + unsigned long npages) > > I think this is more understandable as: > > struct scatterlist *ib_umem_add_sg_table(struct scatterlist *sgl, > struct page **page_list, > unsigned long npages, > bool first) > > Where first is set if sgl does not already point to a valid entry, and > the return result is new end of the list. > OK. I ll try to tweak the function accordingly. > > + /* Check if new page_list is contiguous with end of previous page_list*/ > > + if (*cur != *nxt) { > > + cur_sglen = (*cur)->length >> PAGE_SHIFT; > > + first_pfn = page_to_pfn(sg_page(*cur)); > > + if (first_pfn + cur_sglen == page_to_pfn(page_list[0])) > > + update_cur_sg = true; > > + } > > + > > + while (i != npages) { > > + unsigned long len; > > + struct page *first_page = page_list[i]; > > + > > + first_pfn = page_to_pfn(first_page); > > + > > + /* Compute the number of contiguous pages we have starting > > + * at i > > + */ > > + for (len = 0; i != npages && > > + first_pfn + len == page_to_pfn(page_list[i]); > > + i++, len++); > > The semicolon should be on a blank line alone for this construction OK. > > + > > + /* Squash first N contiguous pages from page_list into current > > + * SGE > > + */ > > + if (update_cur_sg) { > > + sg_set_page(*cur, sg_page(*cur), > > + (len + cur_sglen) * PAGE_SIZE, 0); > > This would be nicer as (*cur)->length instead of cur_sglen > > And if we are touching cur->length, it begs the question why not just > write cur->length += len << PAGE_SHIFT ? OK. > > + sg_mark_end(sg_cur); > > This seems like a good approach for now. We can look at trimming the > unused memory later.. Agreed.