On 5/30/22 12:56, Peter Xu wrote:
> Hi, Mike,
>
> On Fri, May 27, 2022 at 03:58:47PM -0700, Mike Kravetz wrote:
>> +unsigned long hugetlb_mask_last_hp(struct hstate *h)
>> +{
>> +	unsigned long hp_size = huge_page_size(h);
>> +
>> +	if (hp_size == P4D_SIZE)
>> +		return PGDIR_SIZE - P4D_SIZE;
>> +	else if (hp_size == PUD_SIZE)
>> +		return P4D_SIZE - PUD_SIZE;
>> +	else if (hp_size == PMD_SIZE)
>> +		return PUD_SIZE - PMD_SIZE;
>> +
>> +	return ~(0);
>> +}
>
> How about:
>
> unsigned long hugetlb_mask_last_hp(struct hstate *h)
> {
> 	unsigned long hp_size = huge_page_size(h);
>
> 	return hp_size * (PTRS_PER_PTE - 1);
> }
>
> ?
>
> This is definitely a good idea, though I'm wondering the possibility to go
> one step further to make hugetlb pgtable walk just like the normal pages.
>
> Say, would it be non-trivial to bring some of huge_pte_offset() into the
> walker functions, so that we can jump over even larger than PTRS_PER_PTE
> entries (e.g. when p4d==NULL for 2m huge pages)?  It's very possible I
> overlooked something, though.

Thanks Peter!  I did think of that as well.  But, I mostly wanted to throw
out this simple code while the idea of optimizations for sparse address
range traversing was fresh in my mind.

I'll take a closer look and see if we can use those general walker routines.
If we can, it would be great.
-- 
Mike Kravetz