There is no need to write back all data before punching a hole in data=ordered|writeback mode since it will be dropped soon after removing space, so just remove the filemap_write_and_wait_range() in these modes. However, in data=journal mode, we need to write dirty pages out before discarding page cache in case of crash before committing the freeing data transaction, which could expose old, stale data. Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx> --- fs/ext4/inode.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index f8796f7b0f94..94b923afcd9c 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3965,17 +3965,6 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length) trace_ext4_punch_hole(inode, offset, length, 0); - /* - * Write out all dirty pages to avoid race conditions - * Then release them. - */ - if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) { - ret = filemap_write_and_wait_range(mapping, offset, - offset + length - 1); - if (ret) - return ret; - } - inode_lock(inode); /* No need to punch hole beyond i_size */ @@ -4037,6 +4026,21 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length) ret = ext4_update_disksize_before_punch(inode, offset, length); if (ret) goto out_dio; + + /* + * For journalled data we need to write (and checkpoint) pages + * before discarding page cache to avoid inconsitent data on + * disk in case of crash before punching trans is committed. + */ + if (ext4_should_journal_data(inode)) { + ret = filemap_write_and_wait_range(mapping, + first_block_offset, last_block_offset); + if (ret) + goto out_dio; + } + + ext4_truncate_folios_range(inode, first_block_offset, + last_block_offset + 1); truncate_pagecache_range(inode, first_block_offset, last_block_offset); } -- 2.39.2