On Fri, 18 Oct 2019 11:03:45 -0700 Song Liu <songliubraving@xxxxxx> wrote:

> In collapse_file(), after locking the page, it is necessary to recheck
> that the page is up-to-date. Add PageUptodate() check for both shmem THP
> and file THP.
>
> Current khugepaged should not try to collapse dirty file THP, because it
> is limited to read only text. Add a PageDirty check and warning for file
> THP. This is added after page_mapping() check, because if the page is
> truncated, it might be dirty.

When fixing a bug, please always fully describe the end-user visible
effects of that bug.  This is vital information for people who are
considering the fix for backporting.

I'm suspecting that you've found a race condition which can trigger a
VM_BUG_ON_PAGE(), which is rather serious.  But that was just a wild
guess.  Please don't make us wildly guess :(

The old code looked rather alarming:

                        } else if (!PageUptodate(page)) {
                                xas_unlock_irq(&xas);
                                wait_on_page_locked(page);
                                if (!trylock_page(page)) {
                                        result = SCAN_PAGE_LOCK;
                                        goto xa_unlocked;
                                }
                                get_page(page);

We don't have a ref on that page.  After we've released the xarray lock
we have no business playing with *page at all, correct?
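
To make the refcount concern concrete, here is a minimal userspace sketch
of the general "pin the object before dropping the lock" pattern.  It is
not kernel code and not the actual fix for collapse_file(): struct obj,
cache_lock, cache_slot, lookup_and_pin() and unpin() are all hypothetical
names, and a pthread mutex merely stands in for the xarray lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only a reference count. */
struct obj {
	int refcount;	/* protected by cache_lock in this model */
};

/* Hypothetical stand-ins for the xarray lock and one cached slot. */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static struct obj *cache_slot;

/*
 * Look up the cached object and take a reference while the lock is
 * still held.  Once the lock is dropped, only that reference keeps
 * the object from going away underneath the caller.
 */
static struct obj *lookup_and_pin(void)
{
	struct obj *o;

	pthread_mutex_lock(&cache_lock);
	o = cache_slot;
	if (o)
		o->refcount++;	/* pin before unlocking, not after */
	pthread_mutex_unlock(&cache_lock);

	return o;		/* NULL if the slot was empty */
}

/* Drop a reference previously taken by lookup_and_pin(). */
static void unpin(struct obj *o)
{
	pthread_mutex_lock(&cache_lock);
	assert(o->refcount > 0);
	o->refcount--;
	pthread_mutex_unlock(&cache_lock);
}
```

The quoted kernel snippet does these steps in the opposite order: it
unlocks first and only calls get_page() several operations later, which
is what prompts the question above.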