Re: drm/ttm: new TT backend allocation pool

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Oct 26, 2020 at 1:01 PM Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>
> Am 26.10.20 um 06:23 schrieb Dave Airlie:
> > On Mon, 26 Oct 2020 at 01:41, Christian König
> > <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
> >> This replaces the spaghetti code in the two existing page pools.
> >>
> >> First of all depending on the allocation size it is between 3 (1GiB) and
> >> 5 (1MiB) times faster than the old implementation.
> >>
> >> It makes better use of buddy pages to allow for larger physical contiguous
> >> allocations which should result in better TLB utilization at least for amdgpu.
> >>
> >> Instead of a completely braindead approach of filling the pool with one CPU
> >> while another one is trying to shrink it we only give back freed pages.
> >>
> >> This also results in much less locking contention and a trylock free MM
> >> shrinker callback, so we can guarantee that pages are given back to the system
> >> when needed.
> >>
> >> Downside of this is that it takes longer for many small allocations until the
> >> pool is filled up. We could address this, but I couldn't find an use case
> >> where this actually matters. And we don't bother freeing large chunks of pages
> >> any more.
> >>
> >> The sysfs files are replaced with a single module parameter, allowing users to
> >> override how many pages should be globally pooled in TTM. This unfortunately
> >> breaks the UAPI slightly, but as far as we know nobody ever depended on this.
> >>
> >> Zeroing memory coming from the pool was handled inconsistently. The
> >> alloc_pages() based pool was zeroing it, the dma_alloc_attr() based one wasn't.
> >> The new implementation isn't zeroing pages from the pool either and only sets
> >> the __GFP_ZERO flag when necessary.
> >>
> >> The implementation has only 753 lines of code compared to the over 2600 of the
> >> old one, and also allows for saving quite a bunch of code in the drivers since
> >> we don't need specialized handling there any more based on kernel config.
> >>
> >> Additional to all of that there was a neat bug with IOMMU, coherent DMA
> >> mappings and huge pages which is now fixed in the new code as well.
> >>
> >> Please review and comment,
> > Interesting, 5 doesn't have appeared to make on the list, but it
> > definitely has some checkpatch warnings. (indents, missing spaces
> > etc), Please clean those up.
>
> Well #5 is the really interesting one. Going to address the checkpatch
> warnings and send that out again with probably the first patch already
> pushed.
>
> >
> > some other random comments on it
> >
> > +       if (order) {
> > +               gfp_flags |= GFP_TRANSHUGE_LIGHT | __GFP_NORETRY |
> > +                       __GFP_KSWAPD_RECLAIM;
> > +               gfp_flags &= ~__GFP_MOVABLE;
> > +               gfp_flags &= ~__GFP_COMP;
> > +       }
> >
> > I'd like some explains of why these flags are chosen.
>
> Good question, I don't remember fully either and have just copied them
> over from the old allocator for the hugepage case.
>
> In general we just want a cheap try to allocate and fallback to 4k if we
> don't have enough free.
>
> Moveable and and compound don't work with how TTMs fault mechanism
> works, so that makes sense anyway.
>
> > Otherwise I'm pretty happy with the remaining patch in the series, it
> > ends up with a pretty nice cleanup.
> >
> > Note drm_gem_vram_helper fails to build (vmm->dev should be dev->dev possibly).
>
> Ok, going to double check that. What I've also noticed is that QXL
> doesn't seem to have a device structure at all.
>
> Going to support that in the new allocator as well.

Maybe I'm off, but for dma allocations there's drm_device->dma_dev. I
don't think it matters for any ttm using driver right now, but can't
hurt to be consistent on this.
-Daniel

>
> > Once you clean up checkpatch and make drm_gem_vram_helper build again.
> > For 5-13 Reviewed-by: Dave Airlie <airlied@xxxxxxxxxx>
> >
> > I've also boot tested this on nouveau and it survives the basics fine.
>
> Nice to know. I've tested quite a bit on amdgpu and smoke tested on
> radeon/nouveau.
>
> Going to give it a try on my AGP radeon as well when I have time.
>
> Christian.
>
> >
> > Dave.
> >
> >> Christian.
> >>
> >>
> >> _______________________________________________
> >> dri-devel mailing list
> >> dri-devel@xxxxxxxxxxxxxxxxxxxxx
> >> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux