On Tue, 27 Oct 2020 at 03:41, Christian König <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>
> This replaces the spaghetti code in the two existing page pools.
>
> First of all, depending on the allocation size, it is between 3 (1GiB) and
> 5 (1MiB) times faster than the old implementation.
>
> It makes better use of buddy pages to allow for larger physically contiguous
> allocations, which should result in better TLB utilization, at least for
> amdgpu.
>
> Instead of a completely braindead approach of filling the pool with one
> CPU while another one is trying to shrink it, we only give back freed
> pages.
>
> This also results in much less locking contention and a trylock-free MM
> shrinker callback, so we can guarantee that pages are given back to the
> system when needed.
>
> The downside of this is that it takes longer for many small allocations
> until the pool is filled up. We could address this, but I couldn't find a
> use case where this actually matters. We also don't bother freeing large
> chunks of pages any more since the CPU overhead in that path isn't really
> that important.
>
> The sysfs files are replaced with a single module parameter, allowing
> users to override how many pages should be globally pooled in TTM. This
> unfortunately breaks the UAPI slightly, but as far as we know nobody ever
> depended on this.
>
> Zeroing memory coming from the pool was handled inconsistently. The
> alloc_pages() based pool was zeroing it, the dma_alloc_attr() based one
> wasn't. For now the new implementation isn't zeroing pages from the pool
> either and only sets the __GFP_ZERO flag when necessary.
>
> The implementation has only 768 lines of code compared to the over 2600
> of the old one, and also allows for saving quite a bunch of code in the
> drivers since we don't need specialized handling there any more based on
> kernel config.
>
> In addition to all of that, there was a neat bug with IOMMU, coherent DMA
> mappings and huge pages which is now fixed in the new code as well.
>
> v2: make ttm_pool_apply_caching static as reported by the kernel bot, add
>     some more checks

#86: FILE: drivers/gpu/drm/ttm/ttm_memory.c:457:
+ ttm_pool_mgr_init(glob->zone_kernel->max_mem/(2*PAGE_SIZE));
  ^

-:86: CHECK:SPACING: spaces preferred around that '*' (ctx:VxV)
#86: FILE: drivers/gpu/drm/ttm/ttm_memory.c:457:
+ ttm_pool_mgr_init(glob->zone_kernel->max_mem/(2*PAGE_SIZE));
  ^

-:619: CHECK:BRACES: Blank lines aren't necessary before a close brace '}'
#619: FILE: drivers/gpu/drm/ttm/ttm_pool.c:516:
+
+}

-:845: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without comment
#845: FILE: include/drm/ttm/ttm_pool.h:55:
+ spinlock_t lock;

Would be good to get those cleaned up, otherwise

Reviewed-by: Dave Airlie <airlied@xxxxxxxxxx>

for the series.

Dave.