On Mon, 24 Aug 2020, Hugh Dickins wrote:
> On Sat, 22 Aug 2020, Hugh Dickins wrote:
> > On Sat, 22 Aug 2020, Sasha Levin wrote:
> > >
> > > I've followed your instructions and backported the patches:
> > >
> > > bbe98f9cadff ("khugepaged: khugepaged_test_exit() check
> > > mmget_still_valid()") - to all branches.
> > > f3f99d63a815 ("khugepaged: adjust VM_BUG_ON_MM() in
> > > __khugepaged_enter()") - to all branches.
> > > 59ea6d06cfa9 ("coredump: fix race condition between collapse_huge_page()
> > > and core dumping") - for 4.4.
> >
> > That's saved me time, thanks a lot for doing that work, Sasha.
> >
> > I've checked the results (haha, read on) and they're all fine,
> > but one minor flaw in bisectability: the added 4.4 backport of
> > "coredump: fix race condition..." adds a line (deleted by the next commit)
> > 	result = SCAN_ANY_PROCESS;
> > but neither "result" nor "SCAN_ANY_PROCESS" is defined in that tree,
> > so that intermediate step would generate an easily fixed build error.
> >
> > FWIW - I don't know whether that's something to care about or not.
>
> Ah, but I missed that this one that we originally held back from 5.8,
> did not in fact get re-added to 5.8: all the backport series have it,
> but today's 5.8.4-rc1 does not have it.
>
> That's not a disaster - the series builds without it, and having its
> fix without the fixed commit is just odd, no more unsafe than before;
> but it should be re-added for a 5.8.4-rc2 or 5.8.5.

I see 5.8 is at 5.8.5-rc1 today, but the commit below still missing:
please re-add it, then we can all forget about it at last - thanks!

Hugh

From bbe98f9cadff58cdd6a4acaeba0efa8565dabe65 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd@xxxxxxxxxx>
Date: Thu, 6 Aug 2020 23:26:25 -0700
Subject: khugepaged: khugepaged_test_exit() check mmget_still_valid()

From: Hugh Dickins <hughd@xxxxxxxxxx>

commit bbe98f9cadff58cdd6a4acaeba0efa8565dabe65 upstream.

Move collapse_huge_page()'s mmget_still_valid() check into
khugepaged_test_exit() itself.  collapse_huge_page() is used for anon THP
only, and earned its mmget_still_valid() check because it inserts a huge
pmd entry in place of the page table's pmd entry; whereas
collapse_file()'s retract_page_tables() or collapse_pte_mapped_thp()
merely clears the page table's pmd entry.  But core dumping without mmap
lock must have been as open to mistaking a racily cleared pmd entry for a
page table at physical page 0, as exit_mmap() was.  And we certainly have
no interest in mapping as a THP once dumping core.

Fixes: 59ea6d06cfa9 ("coredump: fix race condition between collapse_huge_page() and core dumping")
Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: Song Liu <songliubraving@xxxxxx>
Cc: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>	[4.8+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021217020.27773@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 mm/khugepaged.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -431,7 +431,7 @@ static void insert_to_mm_slots_hash(stru
 
 static inline int khugepaged_test_exit(struct mm_struct *mm)
 {
-	return atomic_read(&mm->mm_users) == 0;
+	return atomic_read(&mm->mm_users) == 0 || !mmget_still_valid(mm);
 }
 
 static bool hugepage_vma_check(struct vm_area_struct *vma,
@@ -1100,9 +1100,6 @@ static void collapse_huge_page(struct mm
 	 * handled by the anon_vma lock + PG_lock.
 	 */
 	mmap_write_lock(mm);
-	result = SCAN_ANY_PROCESS;
-	if (!mmget_still_valid(mm))
-		goto out;
 	result = hugepage_vma_revalidate(mm, address, &vma);
 	if (result)
 		goto out;
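
For anyone reading this outside a kernel tree, here is a rough userspace sketch of what the one-line change to khugepaged_test_exit() does. It is not kernel code: struct mm_model, its two fields, and every function name below are hypothetical stand-ins. It also assumes that mmget_still_valid() boils down to "no core dump in progress on this mm" (mm->core_state unset), which is my reading of the helper in trees of this vintage; please check that against the tree you are backporting to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two mm_struct fields this check looks at. */
struct mm_model {
	int mm_users;		/* plain int standing in for the atomic_t */
	void *core_state;	/* non-NULL while a core dump is in progress */
};

/* Assumption: mmget_still_valid() modelled as "no core dump under way". */
bool mmget_still_valid_model(const struct mm_model *mm)
{
	return mm->core_state == NULL;
}

/* Before the patch: only an mm with no users makes khugepaged back off. */
int test_exit_old(const struct mm_model *mm)
{
	return mm->mm_users == 0;
}

/* After the patch: an mm that is dumping core is treated like an exiting one. */
int test_exit_new(const struct mm_model *mm)
{
	return mm->mm_users == 0 || !mmget_still_valid_model(mm);
}

/* Walks the three interesting states and checks where old and new differ. */
void khugepaged_test_exit_demo(void)
{
	int dump_marker = 0;	/* any non-NULL address will do */
	struct mm_model live    = { .mm_users = 1, .core_state = NULL };
	struct mm_model exiting = { .mm_users = 0, .core_state = NULL };
	struct mm_model dumping = { .mm_users = 1, .core_state = &dump_marker };

	/* A live mm is fair game for collapse in both versions. */
	assert(!test_exit_old(&live) && !test_exit_new(&live));

	/* An exiting mm is skipped in both versions. */
	assert(test_exit_old(&exiting) && test_exit_new(&exiting));

	/* Only the new check notices the core dump. */
	assert(!test_exit_old(&dumping));
	assert(test_exit_new(&dumping));
}
```

Calling khugepaged_test_exit_demo() from a small main() exercises all three cases. The point of folding the test into khugepaged_test_exit() is that every caller of it, the collapse_file() and collapse_pte_mapped_thp() paths included, picks up the core-dump check, instead of collapse_huge_page() alone.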