On 09/01/25 5:01 am, Nico Pache wrote:
> We should only "enter"/allocate the khugepaged mm_slot if we succeed at
> allocating the PMD sized folio. Move the khugepaged_enter_vma call until
> after we know the vma_alloc_folio was successful.
Why? We have the appropriate checks from thp_vma_allowable_orders() and
friends, so the VMA should be registered with khugepaged irrespective of
whether we are able to allocate a PMD-THP at fault time. If we fail at
fault time, it is the job of khugepaged to try to collapse it later.
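
For context, here is a simplified paraphrase of the registration path in
mm/khugepaged.c (helper names, flags and signatures differ between kernel
versions, so treat this as a sketch rather than the exact code):
registration is already gated on the THP eligibility checks, not on
whether a PMD folio can be allocated right now.

/*
 * Simplified paraphrase of khugepaged_enter_vma() in mm/khugepaged.c;
 * exact helpers and flags vary across kernel versions.
 */
void khugepaged_enter_vma(struct vm_area_struct *vma, unsigned long vm_flags)
{
	/* The eligibility checks alone decide whether the mm is registered. */
	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
	    hugepage_pmd_enabled() &&
	    thp_vma_allowable_order(vma, vm_flags, TVA_ENFORCE_SYSFS, PMD_ORDER))
		__khugepaged_enter(vma->vm_mm);
}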
> Signed-off-by: Nico Pache <npache@xxxxxxxxxx>
> ---
>  mm/huge_memory.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index e53d83b3e5cf..635c65e7ef63 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1323,7 +1323,6 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
>  	ret = vmf_anon_prepare(vmf);
>  	if (ret)
>  		return ret;
> -	khugepaged_enter_vma(vma, vma->vm_flags);
>  
>  	if (!(vmf->flags & FAULT_FLAG_WRITE) &&
>  			!mm_forbids_zeropage(vma->vm_mm) &&
> @@ -1365,7 +1364,7 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
>  		}
>  		return ret;
>  	}
> -
> +	khugepaged_enter_vma(vma, vma->vm_flags);
>  	return __do_huge_pmd_anonymous_page(vmf);
>  }
In any case, you are not achieving what you described in the patch
description: you have moved khugepaged_enter_vma() after the read fault
logic, whereas what you want to do is move it after
vma_alloc_anon_folio_pmd() in __do_huge_pmd_anonymous_page().
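
If the goal really is "register with khugepaged only once the PMD folio
allocation has succeeded", the call would have to sit after the
allocation in __do_huge_pmd_anonymous_page(), roughly like the sketch
below (illustration only, not a proposed diff; based on the current
shape of that function):

	folio = vma_alloc_anon_folio_pmd(vma, vmf->address);
	if (unlikely(!folio))
		return VM_FAULT_FALLBACK;
	/* hypothetical placement: register only after a successful allocation */
	khugepaged_enter_vma(vma, vma->vm_flags);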