> On Oct 18, 2019, at 8:41 AM, Rik van Riel <riel@xxxxxx> wrote:
>
> On Fri, 2019-10-18 at 16:34 +0300, Kirill A. Shutemov wrote:
>> On Thu, Oct 17, 2019 at 10:08:32PM -0700, Song Liu wrote:
>>> In collapse_file(), after locking the page, it is necessary to recheck
>>> that the page is up-to-date, clean, and pointing to the proper mapping.
>>> If any check fails, abort the collapse.
>>>
>>> Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
>>> Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
>>> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
>>> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
>>> Cc: William Kucharski <william.kucharski@xxxxxxxxxx>
>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>>> Signed-off-by: Song Liu <songliubraving@xxxxxx>
>>> ---
>>>  mm/khugepaged.c | 8 ++++++++
>>>  1 file changed, 8 insertions(+)
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index 0a1b4b484ac5..7da49b643c4d 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -1619,6 +1619,14 @@ static void collapse_file(struct mm_struct *mm,
>>>  				result = SCAN_PAGE_LOCK;
>>>  				goto xa_locked;
>>>  			}
>>> +
>>> +			/* double check the page is correct and clean */
>>> +			if (unlikely(!PageUptodate(page)) ||
>>> +			    unlikely(PageDirty(page)) ||
>>> +			    unlikely(page->mapping != mapping)) {
>>> +				result = SCAN_FAIL;
>>> +				goto out_unlock;
>>> +			}
>>>  		}
>>>
>>>  		/*
>>
>> Hm. But why only for !is_shmem? Or I read it wrong?
>
> It looks like the shmem code path has its own way of bailing
> out when a page is !PageUptodate. Also, shmem can handle dirty
> pages fine.

It seems the PageUptodate check is still necessary for shmem.
shmem_getpage() makes sure the page is uptodate, but there is still
a small window in which the page could become !uptodate.

>
> However, I suppose the shmem code might want to check for truncated
> pages, which it does not currently appear to do.
> I guess doing
> the trylock_page under the xarray lock may protect against truncate,
> but that is subtle enough that at the very least it should be
> documented.

Johannes pointed out in our internal code review that there is already
a page_mapping() check later in the function.

The PageDirty check is only necessary for !is_shmem, and it should never
trigger, because we only support read-only text. I am adding a warning
for it.

Also, move the PageDirty check to after the page_mapping() check: if a
truncate has happened, a dirty page doesn't violate the read-only
assumption.

Overall, I guess we need something like:

============================= 8< =============================

diff --git c/mm/khugepaged.c w/mm/khugepaged.c
index 0a1b4b484ac5..40c549302d36 100644
--- c/mm/khugepaged.c
+++ w/mm/khugepaged.c
@@ -1626,7 +1626,12 @@ static void collapse_file(struct mm_struct *mm,
 		 * without racing with truncate.
 		 */
 		VM_BUG_ON_PAGE(!PageLocked(page), page);
-		VM_BUG_ON_PAGE(!PageUptodate(page), page);
+
+		/* double check the page is up to date */
+		if (unlikely(!PageUptodate(page))) {
+			result = SCAN_FAIL;
+			goto out_unlock;
+		}
 
 		/*
 		 * If file was truncated then extended, or hole-punched, before
@@ -1642,6 +1647,15 @@ static void collapse_file(struct mm_struct *mm,
 			goto out_unlock;
 		}
 
+		/*
+		 * khugepaged should not try to collapse dirty pages for
+		 * file THP. Show warning if this somehow happens.
+		 */
+		if (WARN_ON_ONCE(!is_shmem && PageDirty(page))) {
+			result = SCAN_FAIL;
+			goto out_unlock;
+		}
+
 		if (isolate_lru_page(page)) {
 			result = SCAN_DEL_PAGE_LRU;
 			goto out_unlock;

============================= 8< =============================

Thanks,
Song