The way khugepaged handles pages marked with MADV_FREE is inconsistent
and can confuse users.

For instance, allocate a 2MB chunk using mmap() and later release the
whole chunk with MADV_FREE. khugepaged will not collapse this chunk, so
from the user's perspective it treats lazyfree pages like pte_none
entries. However, if only some of the pages in the chunk are marked
lazyfree with MADV_FREE, khugepaged might collapse the chunk and copy
those lazyfree pages into the new huge page.

After a successful MADV_FREE operation, if there is no subsequent write,
the kernel can free the pages at any time. Therefore, in my opinion,
counting lazyfree pages towards max_ptes_none seems reasonable.

Perhaps treating MADV_FREE like MADV_DONTNEED, that is, not copying
lazyfree pages when khugepaged collapses huge pages in the background,
better aligns with user expectations.

Signed-off-by: Lance Yang <ioworker0@xxxxxxxxx>
---
 mm/khugepaged.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 2b219acb528e..6cbf46d42c6a 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -777,6 +777,7 @@ static int __collapse_huge_page_copy(pte_t *pte,
 				     pmd_t orig_pmd,
 				     struct vm_area_struct *vma,
 				     unsigned long address,
+				     struct collapse_control *cc,
 				     spinlock_t *ptl,
 				     struct list_head *compound_pagelist)
 {
@@ -797,6 +798,13 @@ static int __collapse_huge_page_copy(pte_t *pte,
 			continue;
 		}
 		src_page = pte_page(pteval);
+
+		if (cc->is_khugepaged
+		    && !folio_test_swapbacked(page_folio(src_page))) {
+			clear_user_highpage(page, _address);
+			continue;
+		}
+
 		if (copy_mc_user_highpage(page, src_page, _address, vma) > 0) {
 			result = SCAN_COPY_MC;
 			break;
@@ -1205,7 +1213,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	anon_vma_unlock_write(vma->anon_vma);
 
 	result = __collapse_huge_page_copy(pte, hpage, pmd, _pmd,
-					   vma, address, pte_ptl,
+					   vma, address, cc, pte_ptl,
 					   &compound_pagelist);
 	pte_unmap(pte);
 	if (unlikely(result != SCAN_SUCCEED))
-- 
2.33.1
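
For reference, the userspace side of the scenario described in the
changelog can be sketched roughly as follows. This is only an
illustration and not part of the patch: lazyfree_demo() and HPAGE_SIZE
are made-up names, a 2MB PMD size is assumed, and the helper only sets
up the memory state. It does not wait for khugepaged or check whether a
collapse happened (that would mean watching AnonHugePages in
/proc/self/smaps over time).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_FREE
#define MADV_FREE 8		/* Linux value; older libc headers may lack it */
#endif

#define HPAGE_SIZE (2UL << 20)	/* assumption: 2MB PMD size, as on x86-64 */

/*
 * Hypothetical helper: fault in one 2MB-aligned anonymous chunk, then
 * mark its first lazyfree_len bytes with MADV_FREE.
 *
 * Returns 0 on success, 1 if madvise(MADV_FREE) fails (for example on a
 * kernel without it), and -1 on any other failure.
 */
static int lazyfree_demo(size_t lazyfree_len)
{
	size_t map_len = 2 * HPAGE_SIZE;	/* over-allocate so we can align */
	unsigned char *raw, *chunk;
	int ret = 0;

	if (lazyfree_len == 0 || lazyfree_len > HPAGE_SIZE)
		return -1;

	raw = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return -1;

	/* Round up to the next 2MB boundary inside the mapping. */
	chunk = (unsigned char *)(((uintptr_t)raw + HPAGE_SIZE - 1) &
				  ~(uintptr_t)(HPAGE_SIZE - 1));
	assert(((uintptr_t)chunk & (HPAGE_SIZE - 1)) == 0);

	/* Write to every page so that each one is backed by real memory. */
	memset(chunk, 0xab, HPAGE_SIZE);

	if (madvise(chunk, lazyfree_len, MADV_FREE) != 0) {
		ret = 1;
		goto out;
	}

	/* A lazyfree page reads back as its old contents or as zeroes. */
	if (chunk[0] != 0xab && chunk[0] != 0x00) {
		ret = -1;
		goto out;
	}

	/* A write cancels the lazy free for that page; the data must stick. */
	chunk[0] = 0xcd;
	if (chunk[0] != 0xcd)
		ret = -1;
out:
	munmap(raw, map_len);
	return ret;
}
```

Calling lazyfree_demo(HPAGE_SIZE) from main() corresponds to the first
case in the changelog (the whole chunk is lazyfree), and
lazyfree_demo(HPAGE_SIZE / 2) to the second (only some of its pages
are).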