The patch titled
     Subject: mm/khugepaged: stop swapping in page when VM_FAULT_RETRY occurs
has been added to the -mm mm-unstable branch.  Its filename is
     mm-khugepaged-stop-swapping-in-page-when-vm_fault_retry-occurs.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-khugepaged-stop-swapping-in-page-when-vm_fault_retry-occurs.patch

This patch will later appear in the mm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Miaohe Lin <linmiaohe@xxxxxxxxxx>
Subject: mm/khugepaged: stop swapping in page when VM_FAULT_RETRY occurs
Date: Sat, 25 Jun 2022 17:28:11 +0800

When do_swap_page() returns VM_FAULT_RETRY, we do not retry here, and
thus the swap entry will remain in the pagetable.  This will result in a
later failure.  So stop swapping in pages in this case to save cpu
cycles.

As a further optimization, mmap_lock is released when
__collapse_huge_page_swapin() fails, to avoid relocking mmap_lock.  And
"swapped_in++" is moved after the error handling to make it more
accurate.

Link: https://lkml.kernel.org/r/20220625092816.4856-3-linmiaohe@xxxxxxxxxx
Signed-off-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
Cc: Alistair Popple <apopple@xxxxxxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: David Hildenbrand <david@xxxxxxxxxx>
Cc: David Howells <dhowells@xxxxxxxxxx>
Cc: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
Cc: NeilBrown <neilb@xxxxxxx>
Cc: Peter Xu <peterx@xxxxxxxxxx>
Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Cc: Yang Shi <shy828301@xxxxxxxxx>
Cc: Zach O'Keefe <zokeefe@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/khugepaged.c |   32 ++++++++++++++------------------
 1 file changed, 14 insertions(+), 18 deletions(-)

--- a/mm/khugepaged.c~mm-khugepaged-stop-swapping-in-page-when-vm_fault_retry-occurs
+++ a/mm/khugepaged.c
@@ -940,8 +940,8 @@ static int hugepage_vma_revalidate(struc
  * Bring missing pages in from swap, to complete THP collapse.
  * Only done if khugepaged_scan_pmd believes it is worthwhile.
  *
- * Called and returns without pte mapped or spinlocks held,
- * but with mmap_lock held to protect against vma changes.
+ * Called and returns without pte mapped or spinlocks held.
+ * Note that if false is returned, mmap_lock will be released.
  */
 
 static bool __collapse_huge_page_swapin(struct mm_struct *mm,
@@ -968,27 +968,24 @@ static bool __collapse_huge_page_swapin(
 			pte_unmap(vmf.pte);
 			continue;
 		}
-		swapped_in++;
 		ret = do_swap_page(&vmf);
 
-		/* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */
+		/*
+		 * do_swap_page returns VM_FAULT_RETRY with released mmap_lock.
+		 * Note we treat VM_FAULT_RETRY as VM_FAULT_ERROR here because
+		 * we do not retry here and swap entry will remain in pagetable
+		 * resulting in later failure.
+		 */
 		if (ret & VM_FAULT_RETRY) {
-			mmap_read_lock(mm);
-			if (hugepage_vma_revalidate(mm, haddr, &vma)) {
-				/* vma is no longer available, don't continue to swapin */
-				trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
-				return false;
-			}
-			/* check if the pmd is still valid */
-			if (mm_find_pmd(mm, haddr) != pmd) {
-				trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
-				return false;
-			}
+			trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
+			return false;
 		}
 		if (ret & VM_FAULT_ERROR) {
+			mmap_read_unlock(mm);
 			trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
 			return false;
 		}
+		swapped_in++;
 	}
 
 	/* Drain LRU add pagevec to remove extra pin on the swapped in pages */
@@ -1054,13 +1051,12 @@ static void collapse_huge_page(struct mm
 	}
 
 	/*
-	 * __collapse_huge_page_swapin always returns with mmap_lock locked.
-	 * If it fails, we release mmap_lock and jump out_nolock.
+	 * __collapse_huge_page_swapin will return with mmap_lock released
+	 * when it fails. So we jump out_nolock directly in that case.
	 * Continuing to collapse causes inconsistency.
 	 */
 	if (unmapped && !__collapse_huge_page_swapin(mm, vma, address, pmd,
 						     referenced)) {
-		mmap_read_unlock(mm);
 		goto out_nolock;
 	}
 
_

Patches currently in -mm which might be from linmiaohe@xxxxxxxxxx are

mm-migration-remove-unneeded-lock-page-and-pagemovable-check.patch
mm-migration-return-errno-when-isolate_huge_page-failed.patch
mm-migration-fix-potential-pte_unmap-on-an-not-mapped-pte.patch
mm-swapfile-make-security_vm_enough_memory_mm-work-as-expected.patch
mm-swapfile-fix-possible-data-races-of-inuse_pages.patch
mm-swap-remove-swap_cache_info-statistics.patch
mm-vmscan-dont-try-to-reclaim-freed-folios.patch
mm-page_alloc-minor-clean-up-for-memmap_init_compound.patch
mm-madvise-minor-cleanup-for-swapin_walk_pmd_entry.patch
mm-mmapc-fix-missing-call-to-vm_unacct_memory-in-mmap_region.patch
mm-khugepaged-remove-unneeded-shmem_huge_enabled-check.patch
mm-khugepaged-stop-swapping-in-page-when-vm_fault_retry-occurs.patch
mm-khugepaged-trivial-typo-and-codestyle-cleanup.patch
mm-khugepaged-minor-cleanup-for-collapse_file.patch
mm-khugepaged-use-helper-macro-__attr_rw.patch
mm-khugepaged-remove-unneeded-return-value-of-khugepaged_add_pte_mapped_thp.patch
mm-khugepaged-try-to-free-transhuge-swapcache-when-possible.patch
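
For readers unfamiliar with the locking convention this patch adopts (on
failure the callee returns with mmap_lock already released, so the caller
must not unlock again), the following is a minimal userspace sketch of the
same pattern.  It is an illustration only, not kernel code:
fake_do_swap_page(), swapin() and mmap_lock_sim are hypothetical stand-ins
for do_swap_page(), __collapse_huge_page_swapin() and mmap_lock.

/* Build with: gcc -pthread sketch.c */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_rwlock_t mmap_lock_sim = PTHREAD_RWLOCK_INITIALIZER;

enum fault_result { FAULT_OK, FAULT_RETRY, FAULT_ERROR };

/*
 * Stand-in for do_swap_page(): when it reports FAULT_RETRY it has
 * already released the lock itself, mirroring the kernel contract
 * that VM_FAULT_RETRY is returned with mmap_lock released.
 */
static enum fault_result fake_do_swap_page(enum fault_result outcome)
{
	if (outcome == FAULT_RETRY)
		pthread_rwlock_unlock(&mmap_lock_sim);
	return outcome;
}

/*
 * Stand-in for __collapse_huge_page_swapin(): called with the lock
 * read-held; whenever it returns false, the lock has been released,
 * either by fake_do_swap_page() or by this function itself.
 */
static bool swapin(enum fault_result outcome)
{
	enum fault_result ret = fake_do_swap_page(outcome);

	if (ret == FAULT_RETRY)
		return false;	/* lock already dropped for us */
	if (ret == FAULT_ERROR) {
		pthread_rwlock_unlock(&mmap_lock_sim);
		return false;	/* we dropped the lock ourselves */
	}
	return true;		/* success: lock still held */
}

int main(void)
{
	pthread_rwlock_rdlock(&mmap_lock_sim);
	if (!swapin(FAULT_ERROR)) {
		/* As in the patched collapse_huge_page(): no unlock here. */
		printf("swapin failed; lock already released\n");
		return 0;
	}
	/* Success path: we still own the lock and must release it. */
	pthread_rwlock_unlock(&mmap_lock_sim);
	printf("swapin succeeded\n");
	return 0;
}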