From: David Stevens <stevensd@xxxxxxxxxxxx>

In collapse_file, mark the THP as up-to-date before inserting it into
the page cache. This fixes a race where folio_seek_hole_data would
mistake the THP for an fallocated but unwritten page. This race is
visible to userspace via data temporarily disappearing from
SEEK_DATA/SEEK_HOLE, which can cause data loss for applications that
use lseek to efficiently snapshot sparse shmem.

Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: David Stevens <stevensd@xxxxxxxxxxxx>
---
 mm/khugepaged.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 79be13133322..b648f1053d95 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1779,10 +1779,13 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	hpage->mapping = mapping;
 
 	/*
-	 * At this point the hpage is locked and not up-to-date.
-	 * It's safe to insert it into the page cache, because nobody would
-	 * be able to map it or use it in another way until we unlock it.
+	 * Mark hpage as up-to-date before inserting it into the page cache to
+	 * prevent it from being mistaken for an fallocated but unwritten page.
+	 * Inserting the unfinished hpage into the page cache is safe because
+	 * it is locked, so nobody can map it or use it in another way until we
+	 * unlock it.
 	 */
+	SetPageUptodate(hpage);
 
 	xas_set(&xas, start);
 	for (index = start; index < end; index++) {
-- 
2.39.1.581.gbfd45094c4-goog
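
For context on the userspace symptom described in the commit message, here is a minimal sketch of how a snapshot tool might walk the data extents of a sparse file with lseek. It is not part of the patch and is not a reproducer for the race. The helpers count_data_extents and make_sparse_file are hypothetical names, the /tmp path and the 4096-byte payload are arbitrary choices, and the fallback SEEK_DATA/SEEK_HOLE values assume Linux.

```c
#define _GNU_SOURCE             /* exposes SEEK_DATA / SEEK_HOLE in glibc */
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA               /* Linux values, in case the headers hide them */
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/*
 * Hypothetical helper: walk the data extents of fd and return how many
 * were found, or -1 on error. The summed extent length is stored in
 * *data_bytes.
 */
static int count_data_extents(int fd, off_t *data_bytes)
{
	off_t pos = 0;
	int extents = 0;

	*data_bytes = 0;
	for (;;) {
		off_t data = lseek(fd, pos, SEEK_DATA);
		if (data < 0)	/* ENXIO means there is no data at or after pos */
			return errno == ENXIO ? extents : -1;

		off_t hole = lseek(fd, data, SEEK_HOLE);
		if (hole < 0)
			return -1;

		/* A snapshot tool would copy the range [data, hole) here. */
		*data_bytes += hole - data;
		extents++;
		pos = hole;
	}
}

/*
 * Hypothetical helper: create an unlinked temporary file consisting of a
 * hole of hole_len bytes followed by one 4096-byte block of non-zero data.
 * Returns an open file descriptor, or -1 on error.
 */
static int make_sparse_file(off_t hole_len)
{
	char path[] = "/tmp/sparse-XXXXXX";
	char buf[4096];
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);			/* file goes away once fd is closed */
	memset(buf, 'x', sizeof(buf));
	if (pwrite(fd, buf, sizeof(buf), hole_len) != (ssize_t)sizeof(buf)) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A walker of this shape copies only what SEEK_DATA reports. If written shmem pages are transiently reported as a hole, as the commit message describes for kernels without this fix, the corresponding range would be skipped and missing from the snapshot.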