On Wed, Jun 15, 2022 at 11:40 PM Miaohe Lin <linmiaohe@xxxxxxxxxx> wrote: > > On 2022/6/16 1:49, Yang Shi wrote: > > On Sat, Jun 11, 2022 at 1:47 AM Miaohe Lin <linmiaohe@xxxxxxxxxx> wrote: > >> > >> When do_swap_page returns VM_FAULT_RETRY, we do not retry here and thus > >> swap entry will remain in pagetable. This will result in later failure. > >> So stop swapping in pages in this case to save cpu cycles. > >> > >> Signed-off-by: Miaohe Lin <linmiaohe@xxxxxxxxxx> > >> --- > >> mm/khugepaged.c | 19 ++++++++----------- > >> 1 file changed, 8 insertions(+), 11 deletions(-) > >> > >> diff --git a/mm/khugepaged.c b/mm/khugepaged.c > >> index 73570dfffcec..a8adb2d1e9c6 100644 > >> --- a/mm/khugepaged.c > >> +++ b/mm/khugepaged.c > >> @@ -1003,19 +1003,16 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm, > >> swapped_in++; > >> ret = do_swap_page(&vmf); > >> > >> - /* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */ > >> + /* > >> + * do_swap_page returns VM_FAULT_RETRY with released mmap_lock. > >> + * Note we treat VM_FAULT_RETRY as VM_FAULT_ERROR here because > >> + * we do not retry here and swap entry will remain in pagetable > >> + * resulting in later failure. > > > > Yeah, it makes sense. > > > >> + */ > >> if (ret & VM_FAULT_RETRY) { > >> mmap_read_lock(mm); > > > > A further optimization, you should not need to relock mmap_lock. You > > may consider returning a different value or passing in *locked and > > setting it to false, then check this value in the caller to skip > > unlock. > > Could we just keep the mmap_sem unlocked when __collapse_huge_page_swapin() fails due to the caller > always doing mmap_read_unlock when __collapse_huge_page_swapin() returns false and add some comments > about this behavior? This looks like a simple way for me. Yeah, that sounds better. > > > > >> - if (hugepage_vma_revalidate(mm, haddr, &vma)) { > >> - /* vma is no longer available, don't continue to swapin */ > >> - trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0); > >> - return false; > >> - } > >> - /* check if the pmd is still valid */ > >> - if (mm_find_pmd(mm, haddr) != pmd) { > >> - trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0); > >> - return false; > >> - } > >> + trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0); > >> + return false; > >> } > >> if (ret & VM_FAULT_ERROR) { > >> trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0); > > > > And I think "swapped_in++" needs to be moved after error handling. > > Do you mean do "swapped_in++" only after pages are swapped in successfully? Yes. > > Thanks! > > > > >> -- > >> 2.23.0 > >> > >> > > . > > >