> On Oct 18, 2019, at 6:17 PM, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: > > On Fri, 18 Oct 2019 11:03:45 -0700 Song Liu <songliubraving@xxxxxx> wrote: > >> In collapse_file(), after locking the page, it is necessary to recheck >> that the page is up-to-date. Add PageUptodate() check for both shmem THP >> and file THP. >> >> Current khugepaged should not try to collapse dirty file THP, because it >> is limited to read only text. Add a PageDirty check and warning for file >> THP. This is added after page_mapping() check, because if the page is >> truncated, it might be dirty. > > When fixing a bug, please always fully describe the end-user visible > effects of that bug. This is vital information for people who are > considering the fix for backporting. The end user effect is, corruption in page cache. While grouping pages into a huge page, the page cache mistakenly includes some pages that are not uptodate, and considers the huge page is uptodate. I am attaching updated commit log to the end. > > I'm suspecting that you've found a race condition which can trigger a > VM_BUG_ON_PAGE(), which is rather serious. But that was just a wild > guess. Please don't make us wildly guess :( > > The old code looked rather alarming: > > } else if (!PageUptodate(page)) { > xas_unlock_irq(&xas); > wait_on_page_locked(page); > if (!trylock_page(page)) { > result = SCAN_PAGE_LOCK; > goto xa_unlocked; > } > get_page(page); > > We don't have a ref on that page. After we've released the xarray lock > we have no business playing with *page at all, correct? Yeah, this piece is not just redundant, but also buggy. I am also including some information about it. Updated commit log: ============================= 8< ============================= In collapse_file(), for !is_shmem case, current check cannot guarantee the locked page is up-to-date. Specifically, xas_unlock_irq() should not be called before lock_page() and get_page(); and it is necessary to recheck PageUptodate() after locking the page. With this bug and CONFIG_READ_ONLY_THP_FOR_FS=y, madvise(HUGE)'ed .text may contain corrupted data. This is because khugepaged mistakenly collapses some not up-to-date sub pages into a huge page, and assumes the huge page is up-to-date. This will NOT corrupt data in the disk, because the page is read-only and never written back. Fix this by properly checking PageUptodate() after locking the page. This check replaces "VM_BUG_ON_PAGE(!PageUptodate(page), page);". Also, move PageDirty() check after locking the page. Current khugepaged should not try to collapse dirty file THP, because it is limited to read-only .text. Add a warning with the PageDirty() check as it should not happen. This warning is added after page_mapping() check, because if the page is truncated, it might be dirty. Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS") Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Cc: Hugh Dickins <hughd@xxxxxxxxxx> Cc: William Kucharski <william.kucharski@xxxxxxxxxx> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> Signed-off-by: Song Liu <songliubraving@xxxxxx> ============================= 8< ============================= Thanks, Song