On Tue, Dec 08, 2020 at 05:28:58PM +0000, Joao Martins wrote:
> Much like hugetlbfs or THPs, we treat device pagemaps with
> compound pages like the rest of GUP handling of compound pages.
>
> Rather than incrementing the refcount every 4K, we record
> all sub pages and increment by @refs amount *once*.
>
> Performance measured by gup_benchmark improves considerably
> get_user_pages_fast() and pin_user_pages_fast():
>
> $ gup_benchmark -f /dev/dax0.2 -m 16384 -r 10 -S [-u,-a] -n 512 -w
>
> (get_user_pages_fast 2M pages) ~75k us -> ~3.6k us
> (pin_user_pages_fast 2M pages) ~125k us -> ~3.8k us
>
> Signed-off-by: Joao Martins <joao.m.martins@xxxxxxxxxx>
>  mm/gup.c | 67 ++++++++++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 51 insertions(+), 16 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 98eb8e6d2609..194e6981eb03 100644
> +++ b/mm/gup.c
> @@ -2250,22 +2250,68 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
>  }
>  #endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
>
> +
> +static int record_subpages(struct page *page, unsigned long addr,
> +			   unsigned long end, struct page **pages)
> +{
> +	int nr;
> +
> +	for (nr = 0; addr != end; addr += PAGE_SIZE)
> +		pages[nr++] = page++;
> +
> +	return nr;
> +}
> +
>  #if defined(CONFIG_ARCH_HAS_PTE_DEVMAP) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
> -static int __gup_device_huge(unsigned long pfn, unsigned long addr,
> -		unsigned long end, unsigned int flags,
> -		struct page **pages, int *nr)
> +static int __gup_device_compound_huge(struct dev_pagemap *pgmap,
> +				      struct page *head, unsigned long sz,
> +				      unsigned long addr, unsigned long end,
> +				      unsigned int flags, struct page **pages)
> +{
> +	struct page *page;
> +	int refs;
> +
> +	if (!(pgmap->flags & PGMAP_COMPOUND))
> +		return -1;
> +
> +	page = head + ((addr & (sz-1)) >> PAGE_SHIFT);

All the places that call record_subpages() do some kind of math like
this; the calculation should be moved inside record_subpages() rather
than open-coded at every call site (a rough sketch is at the end of
this mail).

> +	refs = record_subpages(page, addr, end, pages);
> +
> +	SetPageReferenced(page);
> +	head = try_grab_compound_head(head, refs, flags);
> +	if (!head) {
> +		ClearPageReferenced(page);
> +		return 0;
> +	}
> +
> +	return refs;
> +}

Why is all of this special? Any time we see a PMD/PGD/etc. pointing at
a PFN we can apply this optimization; why does the device case need its
own special path to do it?

Why do we need to check PGMAP_COMPOUND? Why do we need to get the pgmap
at all? (We already removed that from the hmm version of this -- was
that wrong? Is this different?) Dan?

Also, undo_dev_pagemap() is now out of date: we have unpin_user_pages()
for that, and no other error unwind path touches ClearPageReferenced.

Basic idea is good though!

Jason
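
[A rough sketch of the refactor suggested above, i.e. folding the
"(addr & (sz - 1)) >> PAGE_SHIFT" offset math into record_subpages()
itself so callers only pass the compound head. The signature and naming
here are illustrative only, not necessarily what was applied:]

	static int record_subpages(struct page *head, unsigned long sz,
				   unsigned long addr, unsigned long end,
				   struct page **pages)
	{
		/* Start at the subpage of @head that @addr falls in. */
		struct page *page = head + ((addr & (sz - 1)) >> PAGE_SHIFT);
		int nr;

		for (nr = 0; addr != end; addr += PAGE_SIZE)
			pages[nr++] = page++;

		return nr;
	}

[With something like that, the device path above would simply do
"refs = record_subpages(head, sz, addr, end, pages);" and the other
PMD/PUD callers would drop their copies of the same math.]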