On Wed, Feb 07, 2024 at 12:11:25PM +0000, Will Deacon wrote:
> On Wed, Feb 07, 2024 at 11:21:17AM +0000, Matthew Wilcox wrote:
> > The pte lock cannot be taken in irq context (which I think is what
> > you're asking?)  While it is not possible to reason about all users of
> > struct page, we are somewhat relieved of that work by noting that this is
> > only for hugetlbfs, so we don't need to reason about slab, page tables,
> > netmem or zsmalloc.
>
> My concern is that an interrupt handler tries to access a 'struct page'
> which faults due to another core splitting a pmd mapping for the vmemmap.
> In this case, I think we'll end up trying to resolve the fault from irq
> context, which will try to take the spinlock.

Yes, this absolutely can happen (with this patch), and this patch should
be dropped for now.

While this array of ~512 pages has been allocated to hugetlbfs, and one
would think that there would be no way that there could still be
references to them, another CPU can have a pointer to this struct page
(eg attempting a speculative page cache reference or
get_user_pages_fast()).  That means it will try to call
atomic_add_unless(&page->_refcount, 1, 0);

Actually, I wonder if this isn't a problem on x86 too?  Do we need to
explicitly go through an RCU grace period before freeing the pages
for use by somebody else?
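
For anyone following along, here is a rough userspace sketch of the
add-unless-zero pattern referred to above.  It is not the kernel
implementation: add_unless() is a hypothetical stand-in written with C11
atomics, and a bare atomic_int plays the role of page->_refcount.  The point
it tries to show is that the counter has to be loaded before the "is it
zero?" check can fail, so even an attempt that ends up taking no reference
still touches the memory holding the struct page.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for atomic_add_unless(v, a, u):
 * add 'a' to '*v' unless '*v' currently equals 'u'.
 * Returns true if the add happened, false otherwise.
 */
static bool add_unless(atomic_int *v, int a, int u)
{
	/* This load dereferences 'v' even when we end up returning false. */
	int old = atomic_load(v);

	do {
		if (old == u)
			return false;	/* e.g. refcount already zero */
		/* On failure, 'old' is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(v, &old, old + a));

	return true;
}
```

With a counter of 1, add_unless(&count, 1, 0) succeeds and leaves it at 2;
with a counter of 0 it returns false and leaves it at 0, but only after
having read it.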