On Thu, 27 Aug 2020, Jann Horn wrote:

> The preceding patches have ensured that core dumping properly takes the
> mmap_lock. Thanks to that, we can now remove mmget_still_valid() and all
> its users.

Hi Jann, while the only tears to be shed over losing mmget_still_valid()
will be tears of joy, I think you need to explain why you believe it's
safe to remove the instance in mm/khugepaged.c: which you'll have found
I moved just recently, to cover an extra case (sorry for not Cc'ing you).

> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -431,7 +431,7 @@ static void insert_to_mm_slots_hash(struct mm_struct *mm,
> 
>  static inline int khugepaged_test_exit(struct mm_struct *mm)
>  {
> -	return atomic_read(&mm->mm_users) == 0 || !mmget_still_valid(mm);
> +	return atomic_read(&mm->mm_users) == 0;
>  }
> 
>  static bool hugepage_vma_check(struct vm_area_struct *vma,

The movement (which you have correctly followed) was in bbe98f9cadff
("khugepaged: khugepaged_test_exit() check mmget_still_valid()"), but
the "pmd .. physical page 0" issue is explained better in its parent
18e77600f7a1 ("khugepaged: retract_page_tables() remember to test exit").

I think your core dumping is still reading the page tables without
holding mmap_lock, so it is still vulnerable to that extra issue.

It won't be as satisfying as removing all traces of mmget_still_valid(),
but right now I think you should add an mm->core_state check there
instead. (I do have a better solution in use, but it's a much larger
patch, that will take a lot more effort to get in: checks in
pte_offset_map_lock(), perhaps returning NULL when the pmd is
transitioning, requiring retry.)

Or maybe it's me who has missed what you're doing instead.

Hugh
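
For illustration, a minimal sketch of the mm->core_state check suggested
above, assuming the v5.8-era struct mm_struct, where mm->core_state is
set non-NULL for the duration of a core dump (mmget_still_valid() was
simply "return likely(!mm->core_state);"). This is one hypothetical
reading of the suggestion, not a posted patch:

	static inline int khugepaged_test_exit(struct mm_struct *mm)
	{
		/*
		 * Treat a core-dumping mm (mm->core_state set) like an
		 * exiting one, so that retract_page_tables() backs off:
		 * this keeps what the removed mmget_still_valid() caught.
		 */
		return atomic_read(&mm->mm_users) == 0 || mm->core_state;
	}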
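
The larger pte_offset_map_lock() idea, seen from a caller, might take
the shape of the following retry pattern; no such NULL return existed
at the time, so this is purely an assumption about the design sketched
in the parenthesis above:

	spinlock_t *ptl;
	pte_t *pte;

again:
	pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
	if (!pte) {
		/*
		 * Hypothetical: the pmd was transitioning (e.g. a page
		 * table being retracted), so retry until it stabilizes.
		 */
		goto again;
	}
	/* ... inspect or modify the pte under ptl ... */
	pte_unmap_unlock(pte, ptl);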