Re: [PATCH v4 08/14] mm/gup: grab head page refcount once for group of subpages

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/8/21 12:54, Jason Gunthorpe wrote:
> On Fri, Aug 27, 2021 at 03:58:13PM +0100, Joao Martins wrote:
>> @@ -2252,16 +2265,25 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr,
>>  			ret = 0;
>>  			break;
>>  		}
>> -		SetPageReferenced(page);
>> -		pages[*nr] = page;
>> -		if (unlikely(!try_grab_page(page, flags))) {
>> -			undo_dev_pagemap(nr, nr_start, flags, pages);
>> +
>> +		head = compound_head(page);
>> +		/* @end is assumed to be limited at most one compound page */
>> +		if (PageHead(head))
>> +			next = end;
>> +		refs = record_subpages(page, addr, next, pages + *nr);
>> +
>> +		SetPageReferenced(head);
>> +		if (unlikely(!try_grab_compound_head(head, refs, flags))) {
> 
> I was thinking about this some more, and this ordering doesn't seem
> like a good idea. We shouldn't be looking at any part of the struct
> page without holding the refcount, certainly not the compound_head()
> 
> The only optimization that might work here is to grab the head, then
> compute the extent of tail pages and amalgamate them. Holding a ref on
> the head also secures the tails.
> 

How about pmd_page(orig) / pud_page(orig) like what the rest of hugetlb/thp
checks do? i.e. we would pass pmd_page(orig)/pud_page(orig) to __gup_device_huge()
as an added @head argument. While keeping the same structure of counting tail pages
between @addr .. @end if we have a head page.

Albeit this lingers on whether it's OK to call PageHead() .. The PageHead policy is for
any page (PF_ANY) so no hidden calls to compound_head() when testing that page flag. but
in the end it accesses struct page flags which is well, still struct page data.

We could also check pgmap for a non-zero geometry (which tells us that pmd_page(orig) does
represent a head page). And that would save us from looking at struct page data today, but
it would introduce problems later whenever we remove the pgmap ref grab in gup_device_huge().

So the only viable might be to do the grab, count tails and fixup-ref like you suggest
above, and take the perf hit of one extra atomic op :(

It's interesting how THP (in gup_huge_pmd()) unilaterally computes tails assuming
pmd_page(orig) is the head page.



[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux