Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219306 v1: https://lore.kernel.org/lkml/20240926221103.24423-1-linmaxi@xxxxxxxxx/T/ Changes from last version: Moved vfs-level changes to be ext4-level, and improved the description of the patch. This patch enables skipping no-op 'ext4_truncate' calls. Analyzing the kernel with ftrace shows ext4_truncate is being sometimes called without making any impact, and sometimes userspace programs might call ext4_truncate in vein. By detecting these calls and skipping them, cpu time is saved. I'll fix this by skipping ext4_truncate call in 'ext4_setattr' when the file's size hasn't changed AND it hasn't been truncated since the last disk space preallocation. It is meant to consider the case when ext4_truncate is being called to truncate preallocated blocks too. Notice that so far, the condition to triggering ext4_truncate by the user was: if (attr->ia_size <= oldsize) which means it is being triggered when attr->ia_size == oldsize regardless of whether there are preallocated blocks or not - if there are none, then the call is redundant. Steps: 1.Add a new inode state flag: EXT4_STATE_TRUNCATED 2.Clear the flag when ext4_fallocate is being called with FALLOC_FL_KEEP_SIZE flag to enable using ext4_truncate again, to remove preallocated disk space that may have resulted from this call. 3.Set EXT4_STATE_TRUNCATED when ext4_truncated is called successfully. 4.Don't skip ext4_truncate in ext4_setattr when the size of the file has either been reduced OR stayed the same, but hasn't been truncated yet. This is in order to allow truncating of preallocated blocks. Signed-off-by: Max Brener <linmaxi@xxxxxxxxx> --- fs/ext4/ext4.h | 1 + fs/ext4/extents.c | 4 ++++ fs/ext4/inode.c | 6 +++++- 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 44b0d418143c..34128a88e88f 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1915,6 +1915,7 @@ enum { EXT4_STATE_VERITY_IN_PROGRESS, /* building fs-verity Merkle tree */ EXT4_STATE_FC_COMMITTING, /* Fast commit ongoing */ EXT4_STATE_ORPHAN_FILE, /* Inode orphaned in orphan file */ + EXT4_STATE_TRUNCATED, /* Inode is truncated */ }; #define EXT4_INODE_BIT_FNS(name, field, offset) \ diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 34e25eee6521..b480c29dfc65 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4782,6 +4782,10 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) ret = ext4_zero_range(file, offset, len, mode); goto exit; } + + if (mode & FALLOC_FL_KEEP_SIZE) + ext4_clear_inode_state(inode, EXT4_STATE_TRUNCATED); + trace_ext4_fallocate_enter(inode, offset, len, mode); lblk = offset >> blkbits; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 54bdd4884fe6..cbdad3253920 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4193,6 +4193,8 @@ int ext4_truncate(struct inode *inode) if (IS_SYNC(inode)) ext4_handle_sync(handle); + ext4_set_inode_state(inode, EXT4_STATE_TRUNCATED); + out_stop: /* * If this was a simple ftruncate() and the file will remain alive, @@ -5492,7 +5494,9 @@ int ext4_setattr(struct mnt_idmap *idmap, struct dentry *dentry, * Call ext4_truncate() even if i_size didn't change to * truncate possible preallocated blocks. */ - if (attr->ia_size <= oldsize) { + if (attr->ia_size < oldsize || + (attr->ia_size == oldsize && + !ext4_test_inode_state(inode, EXT4_STATE_TRUNCATED))) { rc = ext4_truncate(inode); if (rc) error = rc; -- 2.43.0