On Sun, Aug 02, 2020 at 12:15:24PM -0700, Hugh Dickins wrote:
> When retract_page_tables() removes a page table to make way for a huge
> pmd, it holds huge page lock, i_mmap_lock_write, mmap_write_trylock and
> pmd lock; but when collapse_pte_mapped_thp() does the same (to handle
> the case when the original mmap_write_trylock had failed), only
> mmap_write_trylock and pmd lock are held.
>
> That's not enough. One machine has twice crashed under load, with
> "BUG: spinlock bad magic" and GPF on 6b6b6b6b6b6b6b6b. Examining the
> second crash, page_vma_mapped_walk_done()'s spin_unlock of pvmw->ptl
> (serving page_referenced() on a file THP, that had found a page table
> at *pmd) discovers that the page table page and its lock have already
> been freed by the time it comes to unlock.
>
> Follow the example of retract_page_tables(), but we only need one of
> huge page lock or i_mmap_lock_write to secure against this: because it's
> the narrower lock, and because it simplifies collapse_pte_mapped_thp()
> to know the hpage earlier, choose to rely on huge page lock here.
>
> Fixes: 27e1f8273113 ("khugepaged: enable collapse pmd for pte-mapped THP")
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # v5.4+

We could avoid the page cache lookup by locking the page on the first
valid PTE and rechecking page->mapping, but this way is cleaner.

Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>

-- 
 Kirill A. Shutemov
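
[For readers following along: the lock ordering the commit message describes can be sketched in kernel-style pseudocode as below. This is a hedged illustration of the described fix, not the actual patch; the ellipsis stands for the validation steps the real function performs, and the helper usage is assumed from the commit message, not taken from the source.]

```
/* Sketch only: collapse_pte_mapped_thp() after the fix, as described
 * above.  hpage is the huge page, known before the page table is touched.
 */
lock_page(hpage);               /* huge page lock taken first: excludes
                                 * rmap walkers such as page_referenced()
                                 * that would otherwise find the page
                                 * table at *pmd and take its ptl */
...                             /* re-validate ptes against hpage */
ptl = pmd_lock(mm, pmd);        /* then the pmd lock */
/* clear *pmd, withdraw and free the page table page */
spin_unlock(ptl);
unlock_page(hpage);
```

With this ordering, a concurrent page_vma_mapped_walk() on the file THP blocks on (or never sees) the page table, so it can no longer unlock a ptl that lives in an already-freed page table page.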