From: Zhang Yi <yi.zhang@xxxxxxxxxx> When truncating a realtime file unaligned to a shorter size, xfs_setattr_size() only flush the EOF page before zeroing out, and xfs_truncate_page() also only zeros the EOF block. This could expose stale data since 943bc0882ceb ("iomap: don't increase i_size if it's not a write operation"). If the sb_rextsize is bigger than one block, and we have a realtime inode that contains a long enough written extent. If we unaligned truncate into the middle of this extent, xfs_itruncate_extents() could split the extent and align the it's tail to sb_rextsize, there maybe have more than one blocks more between the end of the file. Since xfs_truncate_page() only zeros the trailing portion of the i_blocksize() value, so it may leftover some blocks contains stale data that could be exposed if we append write it over a long enough distance later. xfs_truncate_page() should flush, zeros out the entire rtextsize range, and make sure the entire zeroed range have been flushed to disk before updating the inode size. Fixes: 943bc0882ceb ("iomap: don't increase i_size if it's not a write operation") Reported-by: Chandan Babu R <chandanbabu@xxxxxxxxxx> Link: https://lore.kernel.org/linux-xfs/0b92a215-9d9b-3788-4504-a520778953c2@xxxxxxxxxxxxxxx Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx> --- fs/xfs/xfs_iomap.c | 35 +++++++++++++++++++++++++++++++---- fs/xfs/xfs_iops.c | 10 ---------- 2 files changed, 31 insertions(+), 14 deletions(-) diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 4958cc3337bc..fc379450fe74 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -1466,12 +1466,39 @@ xfs_truncate_page( loff_t pos, bool *did_zero) { + struct xfs_mount *mp = ip->i_mount; struct inode *inode = VFS_I(ip); unsigned int blocksize = i_blocksize(inode); + int error; + + if (XFS_IS_REALTIME_INODE(ip)) + blocksize = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize); + + /* + * iomap won't detect a dirty page over an unwritten block (or a + * cow block over a hole) and subsequently skips zeroing the + * newly post-EOF portion of the page. Flush the new EOF to + * convert the block before the pagecache truncate. + */ + error = filemap_write_and_wait_range(inode->i_mapping, pos, + roundup_64(pos, blocksize)); + if (error) + return error; if (IS_DAX(inode)) - return dax_truncate_page(inode, pos, blocksize, did_zero, - &xfs_dax_write_iomap_ops); - return iomap_truncate_page(inode, pos, blocksize, did_zero, - &xfs_buffered_write_iomap_ops); + error = dax_truncate_page(inode, pos, blocksize, did_zero, + &xfs_dax_write_iomap_ops); + else + error = iomap_truncate_page(inode, pos, blocksize, did_zero, + &xfs_buffered_write_iomap_ops); + if (error) + return error; + + /* + * Write back path won't write dirty blocks post EOF folio, + * flush the entire zeroed range before updating the inode + * size. + */ + return filemap_write_and_wait_range(inode->i_mapping, pos, + roundup_64(pos, blocksize)); } diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 66f8c47642e8..baeeddf4a6bb 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -845,16 +845,6 @@ xfs_setattr_size( error = xfs_zero_range(ip, oldsize, newsize - oldsize, &did_zeroing); } else { - /* - * iomap won't detect a dirty page over an unwritten block (or a - * cow block over a hole) and subsequently skips zeroing the - * newly post-EOF portion of the page. Flush the new EOF to - * convert the block before the pagecache truncate. - */ - error = filemap_write_and_wait_range(inode->i_mapping, newsize, - newsize); - if (error) - return error; error = xfs_truncate_page(ip, newsize, &did_zeroing); } -- 2.39.2