On 10/8/21 12:54, Jason Gunthorpe wrote: > On Fri, Aug 27, 2021 at 03:58:13PM +0100, Joao Martins wrote: >> @@ -2252,16 +2265,25 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr, >> ret = 0; >> break; >> } >> - SetPageReferenced(page); >> - pages[*nr] = page; >> - if (unlikely(!try_grab_page(page, flags))) { >> - undo_dev_pagemap(nr, nr_start, flags, pages); >> + >> + head = compound_head(page); >> + /* @end is assumed to be limited at most one compound page */ >> + if (PageHead(head)) >> + next = end; >> + refs = record_subpages(page, addr, next, pages + *nr); >> + >> + SetPageReferenced(head); >> + if (unlikely(!try_grab_compound_head(head, refs, flags))) { > > I was thinking about this some more, and this ordering doesn't seem > like a good idea. We shouldn't be looking at any part of the struct > page without holding the refcount, certainly not the compound_head() > > The only optimization that might work here is to grab the head, then > compute the extent of tail pages and amalgamate them. Holding a ref on > the head also secures the tails. > How about pmd_page(orig) / pud_page(orig) like what the rest of hugetlb/thp checks do? i.e. we would pass pmd_page(orig)/pud_page(orig) to __gup_device_huge() as an added @head argument. While keeping the same structure of counting tail pages between @addr .. @end if we have a head page. Albeit this lingers on whether it's OK to call PageHead() .. The PageHead policy is for any page (PF_ANY) so no hidden calls to compound_head() when testing that page flag. but in the end it accesses struct page flags which is well, still struct page data. We could also check pgmap for a non-zero geometry (which tells us that pmd_page(orig) does represent a head page). And that would save us from looking at struct page data today, but it would introduce problems later whenever we remove the pgmap ref grab in gup_device_huge(). So the only viable might be to do the grab, count tails and fixup-ref like you suggest above, and take the perf hit of one extra atomic op :( It's interesting how THP (in gup_huge_pmd()) unilaterally computes tails assuming pmd_page(orig) is the head page.