Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize

Jane Chu <jane.chu@xxxxxxxxxx> · Wed, 7 Feb 2024 18:24:52 -0800

On 2/7/2024 6:17 AM, Matthew Wilcox wrote:

> On Wed, Feb 07, 2024 at 12:11:25PM +0000, Will Deacon wrote:
>> On Wed, Feb 07, 2024 at 11:21:17AM +0000, Matthew Wilcox wrote:
>>> The pte lock cannot be taken in irq context (which I think is what
>>> you're asking?)  While it is not possible to reason about all users of
>>> struct page, we are somewhat relieved of that work by noting that this is
>>> only for hugetlbfs, so we don't need to reason about slab, page tables,
>>> netmem or zsmalloc.
>> My concern is that an interrupt handler tries to access a 'struct page'
>> which faults due to another core splitting a pmd mapping for the vmemmap.
>> In this case, I think we'll end up trying to resolve the fault from irq
>> context, which will try to take the spinlock.
> Yes, this absolutely can happen (with this patch), and this patch should
> be dropped for now.
>
> While this array of ~512 pages have been allocated to hugetlbfs, and one
> would think that there would be no way that there could still be
> references to them, another CPU can have a pointer to this struct page
> (eg attempting a speculative page cache reference or
> get_user_pages_fast()).  That means it will try to call
>	atomic_add_unless(&page->_refcount, 1, 0);
>
> Actually, I wonder if this isn't a problem on x86 too?  Do we need to
> explicitly go through an RCU grace period before freeing the pages
> for use by somebody else?

Sorry, not sure what I'm missing, please help.

From the hugetlb allocation perspective, one scenario is run-time
hugetlb page allocation (say, 2M pages): the buddy allocator returns a
compound page, the head page is then set to frozen, and the folio (the
compound page) is put through the HVO process, one step of which is
vmemmap_split_pmd() in case the vmemmap page is mapped by a PMD.
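
As a rough userspace model of the "frozen" step in that sequence (not
kernel code: model_page, model_ref_freeze() and model_add_unless() are
made-up stand-ins, and it assumes "frozen" means the refcount is held
at zero), a speculative reference attempt of the
atomic_add_unless(&page->_refcount, 1, 0) kind is refused while the
head page is frozen:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the refcount field of struct page. */
struct model_page {
	atomic_int refcount;
};

/*
 * Model of atomic_add_unless(v, a, u): add 'a' to '*v' unless '*v' is 'u'.
 * Returns true if the add happened.
 */
static bool model_add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	while (old != u) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}

/*
 * Model of freezing: drop the count to 0 only if it is exactly 'expected',
 * i.e. nobody else holds a reference.
 */
static bool model_ref_freeze(struct model_page *page, int expected)
{
	int e = expected;

	return atomic_compare_exchange_strong(&page->refcount, &e, 0);
}

/*
 * Walk the sequence: page fresh from the allocator (count 1), freeze it,
 * then try a speculative get. Returns true if the get was refused and the
 * count is still 0.
 */
static bool model_frozen_page_refuses_get(void)
{
	struct model_page head;

	atomic_init(&head.refcount, 1);
	if (!model_ref_freeze(&head, 1))
		return false;

	/* Speculative reference: must fail because the count is 0. */
	if (model_add_unless(&head.refcount, 1, 0))
		return false;

	return atomic_load(&head.refcount) == 0;
}
```

In this model the speculative getter never obtains a reference to the
frozen page, although it still has to load the refcount word in order
to find that out.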

Until the HVO process completes, none of the pages represented by that
vmemmap are available to any thread, so what would cause code running
in IRQ context to access their vmemmap pages?

thanks!

-jane