On Wed, Jun 09, 2021 at 11:52:37PM -0700, Hugh Dickins wrote:
> Running certain tests with a DEBUG_VM kernel would crash within hours,
> on the total_mapcount BUG() in split_huge_page_to_list(), while trying
> to free up some memory by punching a hole in a shmem huge page: split's
> try_to_unmap() was unable to find all the mappings of the page (which,
> on a !DEBUG_VM kernel, would then keep the huge page pinned in memory).
>
> Crash dumps showed two tail pages of a shmem huge page remained mapped
> by pte: ptes in a non-huge-aligned vma of a gVisor process, at the end
> of a long unmapped range; and no page table had yet been allocated for
> the head of the huge page to be mapped into.
>
> Although designed to handle these odd misaligned huge-page-mapped-by-pte
> cases, page_vma_mapped_walk() falls short by returning false prematurely
> when !pmd_present or !pud_present or !p4d_present or !pgd_present: there
> are cases when a huge page may span the boundary, with ptes present in
> the next.
>
> Restructure page_vma_mapped_walk() as a loop to continue in these cases,
> while keeping its layout much as before. Add a step_forward() helper to
> advance pvmw->address across those boundaries: originally I tried to use
> mm's standard p?d_addr_end() macros, but hit the same crash 512 times
> less often: because of the way redundant levels are folded together,
> but folded differently in different configurations, it was just too
> difficult to use them correctly; and step_forward() is simpler anyway.
>
> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>

Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>

-- 
 Kirill A. Shutemov
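
For readers following the step_forward() description in the quoted commit message, here is a minimal userspace sketch of the address-stepping idea: advance an address to the start of the next size-aligned block, so that a walk can continue past a non-present entry instead of stopping there. The names walk_state and step_forward_sketch are hypothetical stand-ins, not the kernel's definitions, and clamping to ULONG_MAX on wraparound is an assumption made for this illustration.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the walk state; only the address matters here. */
struct walk_state {
    unsigned long address;
};

/*
 * Move w->address to the start of the next size-aligned block.
 * size is assumed to be a power of two (e.g. the span of one
 * page-table entry at some level).
 */
static void step_forward_sketch(struct walk_state *w, unsigned long size)
{
    /* Precondition: size is a non-zero power of two. */
    assert(size != 0 && (size & (size - 1)) == 0);

    /* Add one block, then round down to the block boundary. */
    w->address = (w->address + size) & ~(size - 1);

    /* Assumption: if the addition wrapped to 0, clamp to the top. */
    if (w->address == 0)
        w->address = ULONG_MAX;
}
```

With a 2 MiB block size (0x200000), an address of 0x1ff000 steps to 0x200000, and 0x200000 steps to 0x400000. Because the step is computed from a single explicit size, it does not depend on how page-table levels are folded in a given configuration.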