The patch titled
     Subject: mm: fix data corruption due to stale mmap reads
has been added to the -mm tree.  Its filename is
     mm-fix-data-corruption-due-to-stale-mmap-reads.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-fix-data-corruption-due-to-stale-mmap-reads.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-fix-data-corruption-due-to-stale-mmap-reads.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Jan Kara <jack@xxxxxxx>
Subject: mm: fix data corruption due to stale mmap reads

Currently, we don't invalidate page tables during
invalidate_inode_pages2() for DAX.  That can result in e.g. a 2MiB zero
page being mapped into page tables while underlying blocks are already
allocated, and thus data seen through mmap differs from data seen by
read(2).  The following sequence reproduces the problem:

 - open an mmap over a 2MiB hole

 - read from a 2MiB hole, faulting in a 2MiB zero page

 - write to the hole with write(3p).  The write succeeds but we
   incorrectly leave the 2MiB zero page mapping intact.

 - via the mmap, read the data that was just written.  Since the zero
   page mapping is still intact we read back zeroes instead of the new
   data.

Fix the problem by unconditionally calling
invalidate_inode_pages2_range() in dax_iomap_actor() for new block
allocations and by properly invalidating page tables in
invalidate_inode_pages2_range() for DAX mappings.
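
The four-step sequence in the changelog can be sketched as a small
userspace check.  This is a minimal sketch, not the reproducer used by
the patch author: the function name check_mmap_sees_write(), the
HOLE_SIZE macro and the "new data" payload are made up for
illustration, and pwrite() at offset 0 stands in for the write call.
Whether a 2MiB zero page is really faulted in depends on the file being
on a DAX mount and on mapping alignment, which the sketch does not
control; on a non-DAX filesystem the check is expected to simply pass.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical name: size of the hole, matching the 2MiB case above. */
#define HOLE_SIZE (2UL * 1024 * 1024)

/*
 * Returns 0 if data written through the file descriptor is visible
 * through an already established mapping, 1 if the mapping does not
 * show the new data, and -1 if the setup itself failed.
 */
int check_mmap_sees_write(const char *path)
{
	static const char msg[] = "new data";
	char *map;
	char before;
	int ret = -1;
	int fd;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* Step 1: create a 2MiB hole and map it. */
	if (ftruncate(fd, HOLE_SIZE) < 0)
		goto out_close;
	map = mmap(NULL, HOLE_SIZE, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	/* Step 2: read through the mapping so the hole is faulted in. */
	before = *(volatile char *)map;

	/* Step 3: write into the hole, which allocates blocks. */
	if (pwrite(fd, msg, sizeof(msg), 0) != (ssize_t)sizeof(msg))
		goto out_unmap;

	/* Step 4: re-read via the mapping; stale zeroes mean the bug. */
	ret = (before == 0 && memcmp(map, msg, sizeof(msg)) == 0) ? 0 : 1;

out_unmap:
	munmap(map, HOLE_SIZE);
out_close:
	close(fd);
	return ret;
}
```

Calling check_mmap_sees_write() on a path inside a DAX mount and
getting 1 back would correspond to the stale-zeroes symptom described
above; the caller is responsible for removing the test file afterwards.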

Fixes: c6dcf52c23d2d3fb5235cec42d7dd3f786b87d55
Link: http://lkml.kernel.org/r/20170510085419.27601-3-jack@xxxxxxx
Signed-off-by: Jan Kara <jack@xxxxxxx>
Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 fs/dax.c      |    2 +-
 mm/truncate.c |   12 +++++++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff -puN fs/dax.c~mm-fix-data-corruption-due-to-stale-mmap-reads fs/dax.c
--- a/fs/dax.c~mm-fix-data-corruption-due-to-stale-mmap-reads
+++ a/fs/dax.c
@@ -1015,7 +1015,7 @@ dax_iomap_actor(struct inode *inode, lof
 	 * into page tables. We have to tear down these mappings so that data
 	 * written by write(2) is visible in mmap.
 	 */
-	if ((iomap->flags & IOMAP_F_NEW) && inode->i_mapping->nrpages) {
+	if (iomap->flags & IOMAP_F_NEW) {
 		invalidate_inode_pages2_range(inode->i_mapping,
 					      pos >> PAGE_SHIFT,
 					      (end - 1) >> PAGE_SHIFT);
diff -puN mm/truncate.c~mm-fix-data-corruption-due-to-stale-mmap-reads mm/truncate.c
--- a/mm/truncate.c~mm-fix-data-corruption-due-to-stale-mmap-reads
+++ a/mm/truncate.c
@@ -686,7 +686,17 @@ int invalidate_inode_pages2_range(struct
 		cond_resched();
 		index++;
 	}
-
+	/*
+	 * For DAX we invalidate page tables after invalidating radix tree. We
+	 * could invalidate page tables while invalidating each entry however
+	 * that would be expensive. And doing range unmapping before doesn't
+	 * work as we have no cheap way to find whether radix tree entry didn't
+	 * get remapped later.
+	 */
+	if (dax_mapping(mapping)) {
+		unmap_mapping_range(mapping, (loff_t)start << PAGE_SHIFT,
+				    (loff_t)(end - start + 1) << PAGE_SHIFT, 0);
+	}
 out:
 	cleancache_invalidate_inode(mapping);
 	return ret;
_

Patches currently in -mm which might be from jack@xxxxxxx are

mm-fix-data-corruption-due-to-stale-mmap-reads.patch
ext4-return-back-to-starting-transaction-in-ext4_dax_huge_fault.patch
dax-fix-data-corruption-when-fault-races-with-write.patch
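
The two hunks above convert between byte offsets and page indices in
opposite directions: fs/dax.c turns the written byte range into an
inclusive page index range, and mm/truncate.c turns an inclusive page
index range back into a byte offset and length for
unmap_mapping_range().  The arithmetic can be sketched in userspace as
follows.  The struct and function names are hypothetical, and
PAGE_SHIFT is assumed to be 12 (4KiB pages); only the shift
expressions mirror the patch.

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4KiB pages */

/* Hypothetical helper types, not kernel structures. */
struct page_range { uint64_t start; uint64_t end; };	/* inclusive indices */
struct byte_range { int64_t offset; int64_t len; };

/*
 * Byte range [pos, end) to an inclusive page index range, as in the
 * fs/dax.c hunk.  end is exclusive, hence the (end - 1).
 */
struct page_range bytes_to_pages(int64_t pos, int64_t end)
{
	struct page_range r = {
		(uint64_t)pos >> PAGE_SHIFT,
		(uint64_t)(end - 1) >> PAGE_SHIFT,
	};
	return r;
}

/*
 * Inclusive page index range to a byte offset and length, as in the
 * mm/truncate.c hunk.  The value is widened to 64 bits before the
 * shift, which is what the (loff_t) casts in the patch do.
 */
struct byte_range pages_to_bytes(struct page_range r)
{
	struct byte_range b = {
		(int64_t)r.start << PAGE_SHIFT,
		(int64_t)(r.end - r.start + 1) << PAGE_SHIFT,
	};
	return b;
}
```

For example, under the 4KiB assumption a 100-byte write at offset 4000
straddles a page boundary and covers page indices 0 to 1, so the
matching unmap range is offset 0 with length 8192; a write covering a
whole 2MiB hole covers indices 0 to 511.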