Re: [PATCH v4 08/14] mm/gup: grab head page refcount once for group of subpages

On 9/28/21 19:01, Jason Gunthorpe wrote:
> On Thu, Sep 23, 2021 at 05:51:04PM +0100, Joao Martins wrote:
>> So ... if pgmap accounting was removed from gup-fast then this patch
>> would be a lot simpler and we could perhaps just fallback to the regular
>> hugepage case (THP, HugeTLB) like your suggestion at the top. See at the
>> end below scissors mark as the ballpark of changes.
>>
>> So far my options seem to be: 1) this patch which leverages the existing
>> iteration logic or 2) switching to for_each_compound_range() -- see my previous
>> reply 3) waiting for Dan to remove @pgmap accounting in gup-fast and use
>> something similar to below scissors mark.
>>
>> What do you think would be the best course of action?
> 
> I still think the basic algorithm should be to accumulate physically
> contiguous addresses when walking the page table and then flush them
> back to struct pages once we can't accumulate any more.
> 
> That works for both the walkers and all the page types?
> 

The logic already handles all page types -- I was trying to avoid the extra
complexity in the regular hugetlb/THP path by not merging in the handling of
the oddball case that is devmap (or, in the future, the fundamentally
non-compound devmap case).

In the context of this patch, I take your suggestion above to mean:
instead of changing __gup_device_huge(), we uplevel/merge it all into
gup_huge_{pud,pmd}() to cover the devmap case?

static int __gup_huge_range(orig_head, ...)
{
	...
	page = orig_head + ((addr & ~mask) >> PAGE_SHIFT);
	refs = record_subpages(page, addr, end, pages + *nr);

	head = try_grab_compound_head(orig_head, refs, flags);
	if (!head)
		return 0;

	/* pud shown; a pmd caller would recheck pmd_val(orig) != pmd_val(*pmdp) */
	if (unlikely(pud_val(orig) != pud_val(*pudp))) {
		put_compound_head(head, refs, flags);
		return 0;
	}

	SetPageReferenced(head);
	return 1;
}

static int gup_huge_pmd(...)
{
	...
	for_each_compound_range(index, page, npages, head, refs) {
		if (pmd_devmap(orig))
			pgmap = get_dev_pagemap(pmd_pfn(orig), pgmap);

		if (!__gup_huge_range(pmd_page(orig), refs)) {
			undo_dev_pagemap(...);
			return 0;
		}
	}

	put_dev_pagemap(pgmap);
	return 1;
}

But all this gup_huge_{pmd,pud}() rework is just for the trouble of
trying to merge in the basepage-on-PMD/PUD case of devmap. It feels more
complex (and it affects other page types) compared to leaving the devmap
oddity siloed as in option 1. If the pgmap refcount wasn't there and
there were no users of basepages-on-PMD/PUD, only compound pages on
PMDs/PUDs, then we would be talking about code removal rather than added
complexity. But I don't know how realistic that is for other devmap
users (besides device-dax).

> If the get_dev_pagemap has to remain then it just means we have to
> flush before changing pagemap pointers

Right -- I don't think we should need it, as the discussion on the other
thread suggests.

OTOH, using @pgmap might be useful to unblock gup-fast FOLL_LONGTERM
for certain devmap types[0] (e.g. MEMORY_DEVICE_GENERIC [device-dax]
can support it, but MEMORY_DEVICE_FS_DAX [fsdax] cannot).

[0] https://lore.kernel.org/linux-mm/6a18179e-65f7-367d-89a9-d5162f10fef0@xxxxxxxxxx/


