Re: [Intel-gfx] [PATCH rdma-next v3 1/2] lib/scatterlist: Add support in dynamic allocation of SG table from pages

Jason Gunthorpe <jgg@xxxxxxxxxx> · Fri, 25 Sep 2020 09:34:10 -0300

On Fri, Sep 25, 2020 at 01:29:49PM +0100, Tvrtko Ursulin wrote:
> 
> On 25/09/2020 12:58, Jason Gunthorpe wrote:
> > On Fri, Sep 25, 2020 at 12:41:29PM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 25/09/2020 08:13, Leon Romanovsky wrote:
> > > > On Thu, Sep 24, 2020 at 09:21:20AM +0100, Tvrtko Ursulin wrote:
> > > > > 
> > > > > On 22/09/2020 09:39, Leon Romanovsky wrote:
> > > > > > From: Maor Gottlieb <maorg@xxxxxxxxxxxx>
> > > > > > 
> > > > > > Extend __sg_alloc_table_from_pages to support dynamic allocation of
> > > > > > SG table from pages. It should be used by drivers that can't supply
> > > > > > all the pages at one time.
> > > > > > 
> > > > > > This function returns the last populated SGE in the table. Users should
> > > > > > pass it as an argument to the function from the second call and forward.
> > > > > > As before, nents will be equal to the number of populated SGEs (chunks).
> > > > > 
> > > > > So it's appending and growing the "list", did I get that right? Sounds handy
> > > > > indeed. Some comments/questions below.
> > > > 
> > > > Yes, we (RDMA) use this function to chain contiguous pages.
> > > 
> > > I will evaluate if i915 could start using it. We have some loops which build
> > > page by page and coalesce.
> > 
> > Christoph H doesn't like it, but if there are enough cases we should
> > really have a pin_user_pages_to_sg() rather than open code this all
> > over the place.
> > 
> > With THP the chance of getting a coalescing SG is much higher, and
> > everything is more efficient with larger SGEs.
> 
> Right, I was actually referring to i915 sites where we build sg tables out
> of shmem and plain kernel pages. In those areas we have some open coded
> coalescing loops (see for instance our shmem_get_pages). Plus a local "trim"
> to discard the unused entries, since we allocate pessimistically not knowing
> how coalescing will pan out. This kind of core function which appends pages
> could replace some of that. Maybe it would be slightly less efficient but I
> will pencil in to at least evaluate it.
> 
> Otherwise I do agree that coalescing is a win and in the past I have
> measured savings in a few MiB range just for struct scatterlist storage.

I think the eventual dream is to have a pin_user_pages_bvec or similar
that is integrated into the GUP logic, so it avoids all the extra work
and just allocates pages of bvecs on the fly, with no extra step
through a linear array of page *'s.

Starting to structure things to take advantage of that makes some
sense.

Jason
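
The append-and-coalesce pattern discussed in this thread can be sketched
in plain userspace C. This is only an illustrative model, not the kernel
implementation: struct chunk and append_pages() are hypothetical names
standing in for a scatterlist entry and for the appending behaviour
described in the patch, pages are represented by bare page frame numbers,
and limits such as a maximum segment size are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one SG entry: a run of contiguous pages. */
struct chunk {
    unsigned long first_pfn;   /* page frame number of the first page */
    size_t npages;             /* number of contiguous pages in the run */
};

/*
 * Append n pages (given as PFNs) to chunks[]. A page that directly
 * follows the last populated chunk is merged into it; otherwise a new
 * chunk is started. *nents counts the populated chunks and is updated
 * in place, so the function can be called again as more pages arrive.
 * Returns 0 on success, -1 if the table would exceed cap entries.
 */
static int append_pages(struct chunk *chunks, size_t cap, size_t *nents,
                        const unsigned long *pfns, size_t n)
{
    assert(chunks && nents && (pfns || n == 0));

    for (size_t i = 0; i < n; i++) {
        struct chunk *last = *nents ? &chunks[*nents - 1] : NULL;

        /* Contiguous with the last chunk: grow it instead of adding one. */
        if (last && last->first_pfn + last->npages == pfns[i]) {
            last->npages++;
            continue;
        }

        if (*nents == cap)
            return -1;

        chunks[*nents].first_pfn = pfns[i];
        chunks[*nents].npages = 1;
        (*nents)++;
    }
    return 0;
}
```

Because the count of populated entries is carried between calls, a second
batch whose first page follows the previous batch's last page extends the
existing chunk rather than starting a new one. That is the property that
lets a caller feed pages in pieces without allocating pessimistically and
trimming afterwards.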