On Thu, 27 Jun 2024 16:27:05 -0600 Yu Zhao <yuzhao@xxxxxxxxxx> wrote:

> While investigating HVO for THPs [1], it turns out that speculative
> PFN walkers like compaction can race with vmemmap modifications, e.g.,
>
>   CPU 1 (vmemmap modifier)         CPU 2 (speculative PFN walker)
>   -------------------------------  ------------------------------
>   Allocates an LRU folio page1
>                                    Sees page1
>   Frees page1
>
>   Allocates a hugeTLB folio page2
>   (page1 being a tail of page2)
>
>   Updates vmemmap mapping page1
>                                    get_page_unless_zero(page1)
>
> Even though page1->_refcount is zero after HVO, get_page_unless_zero()
> can still try to modify this read-only field, resulting in a crash.

Ah. So we should backport this into earlier kernels, yes?

Are we able to identify a Fixes: for this? Looks difficult.

This seems quite hard to trigger. Do any particular userspace actions
invoke the race?