Hey Linus,

/* Summary */

This is the work to clean up and shrink struct file significantly. You
should've already seen most of the work in here.

Right now (focusing on x86), struct file is 232 bytes. After this series
struct file will be 184 bytes, which fits in 3 cachelines with a spare
8 bytes for future extensions at the end of the struct.

With struct file being as ubiquitous as it is, this should make a
difference for file-heavy workloads and allow further optimizations in
the future.

* struct fown_struct was embedded into struct file, letting it take up
  32 bytes in total, when really it shouldn't even be embedded in struct
  file in the first place. Instead, actual users of struct fown_struct
  now allocate the struct on demand. This frees up 24 bytes.

* Move struct file_ra_state into the union containing the cleanup hooks
  and move f_iocb_flags out of the union. This closes a 4 byte hole we
  created earlier and brings struct file to 192 bytes, which means
  struct file is 3 cachelines and we managed to shrink it by 40 bytes.

* Reorder struct file so that nothing crosses a cacheline.

  I suspect that in the future we will end up reordering some members to
  mitigate false sharing issues or just because someone does actually
  provide really good perf data.

* Shrinking struct file to 192 bytes is only part of the work.

  Files use a slab that is SLAB_TYPESAFE_BY_RCU, and when a kmem cache
  is created with SLAB_TYPESAFE_BY_RCU the free pointer must be located
  outside of the object because the cache doesn't know what part of the
  memory can safely be overwritten, as it may be needed to prevent
  object recycling.

  That has the consequence that SLAB_TYPESAFE_BY_RCU may end up adding a
  new cacheline.

  So this also contains work to add a new kmem_cache_create_rcu()
  function that allows the caller to specify an offset where the
  freelist pointer is supposed to be placed, thus avoiding the implicit
  addition of a fourth cacheline.

* And finally this removes the f_version member in struct file.
  The f_version member isn't particularly well-defined. It is mainly
  used as a cookie to detect concurrent seeks when iterating
  directories. But it is also abused by some subsystems for completely
  unrelated things.

  It is mostly a directory and filesystem specific thing that doesn't
  really need to live in struct file and with its wonky semantics it
  really lacks a specific function.

  For pipes, f_version is (ab)used to defer poll notifications until a
  write has happened. And struct pipe_inode_info is used by multiple
  struct files in their ->private_data so there's no chance of pushing
  that down into file->private_data without introducing another pointer
  indirection.

  But pipes don't rely on f_pos_lock so this adds a union into struct
  file encompassing f_pos_lock and a pipe specific f_pipe member that
  pipes can use. This union of course can be extended to other file
  types and is similar to what we do in struct inode already.

/* Testing */

gcc version 14.2.0 (Debian 14.2.0-3)
Debian clang version 16.0.6 (27+b1)

All patches are based on v6.11-rc4 and have been sitting in linux-next.
No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

(1) This will have a merge conflict with the vfs.misc pull request sent as:

    https://lore.kernel.org/r/20240913-vfs-misc-348ac639e66e@brauner

    Assuming you merge vfs.misc first the conflict resolution looks like this:

diff --cc fs/fcntl.c
index 22ec683ad8f8,0587a0e221a6..f6fde75a3bd5
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@@ -343,12 -414,30 +414,36 @@@ static long f_dupfd_query(int fd, struc
  	return f.file == filp;
  }

 +/* Let the caller figure out whether a given file was just created. */
 +static long f_created_query(const struct file *filp)
 +{
 +	return !!(filp->f_mode & FMODE_CREATED);
 +}
 +
+ static int f_owner_sig(struct file *filp, int signum, bool setsig)
+ {
+ 	int ret = 0;
+ 	struct fown_struct *f_owner;
+
+ 	might_sleep();
+
+ 	if (setsig) {
+ 		if (!valid_signal(signum))
+ 			return -EINVAL;
+
+ 		ret = file_f_owner_allocate(filp);
+ 		if (ret)
+ 			return ret;
+ 	}
+
+ 	f_owner = file_f_owner(filp);
+ 	if (setsig)
+ 		f_owner->signum = signum;
+ 	else if (f_owner)
+ 		ret = f_owner->signum;
+ 	return ret;
+ }
+
  static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
  		struct file *filp)
  {

(2) linux-next: manual merge of the security tree with the vfs-brauner tree

    https://lore.kernel.org/r/20240910132740.775b92e1@xxxxxxxxxxxxxxxx

The following changes since commit 47ac09b91befbb6a235ab620c32af719f8208399:

  Linux 6.11-rc4 (2024-08-18 13:17:27 -0700)

are available in the Git repository at:

  git@xxxxxxxxxxxxxxxxxxx:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.12.file

for you to fetch changes up to 24a988f75c8a5f16ef935c51039700e985767eb9:

  Merge patch series "file: remove f_version" (2024-09-12 11:58:46 +0200)

Please consider pulling these changes from the signed vfs-6.12.file tag.

Note that this work provides the base for the slab pull request this
cycle. So just to not mess with Vlastimil's pr I pushed two tags:

(1) vfs-6.12.file
(2) vfs-6.12.file.kmem

The second tag only contains what slab relies on and (1) contains
everything for this cycle. If you disagree with additional stuff in (1)
you may still consider pulling (2).

Thanks!
Christian

----------------------------------------------------------------
vfs-6.12.file

----------------------------------------------------------------
Christian Brauner (27):
      file: reclaim 24 bytes from f_owner
      fs: switch f_iocb_flags and f_ra
      fs: pack struct file
      mm: remove unused argument from create_cache()
      mm: add kmem_cache_create_rcu()
      fs: use kmem_cache_create_rcu()
      Merge patch series "fs,mm: add kmem_cache_create_rcu()"
      adi: remove unused f_version
      ceph: remove unused f_version
      s390: remove unused f_version
      fs: add vfs_setpos_cookie()
      fs: add must_set_pos()
      fs: use must_set_pos()
      fs: add generic_llseek_cookie()
      affs: store cookie in private data
      ext2: store cookie in private data
      ext4: store cookie in private data
      input: remove f_version abuse
      ocfs2: store cookie in private data
      proc: store cookie in private data
      udf: store cookie in private data
      ufs: store cookie in private data
      ubifs: store cookie in private data
      fs: add f_pipe
      pipe: use f_pipe
      fs: remove f_version
      Merge patch series "file: remove f_version"

R Sundar (1):
      mm: Removed @freeptr_offset to prevent doc warning

 drivers/char/adi.c             |   1 -
 drivers/input/input.c          |  47 ++++++-----
 drivers/net/tun.c              |   6 ++
 drivers/s390/char/hmcdrv_dev.c |   3 -
 drivers/tty/tty_io.c           |   6 ++
 fs/affs/dir.c                  |  44 +++++++++--
 fs/ceph/dir.c                  |   1 -
 fs/ext2/dir.c                  |  28 ++++++-
 fs/ext4/dir.c                  |  50 ++++++------
 fs/ext4/ext4.h                 |   2 +
 fs/ext4/inline.c               |   7 +-
 fs/fcntl.c                     | 166 +++++++++++++++++++++++++++++++--------
 fs/file_table.c                |  16 ++--
 fs/internal.h                  |   1 +
 fs/locks.c                     |   6 +-
 fs/notify/dnotify/dnotify.c    |   6 +-
 fs/ocfs2/dir.c                 |   3 +-
 fs/ocfs2/file.c                |  11 ++-
 fs/ocfs2/file.h                |   1 +
 fs/pipe.c                      |   8 +-
 fs/proc/base.c                 |  30 ++++++--
 fs/read_write.c                | 171 +++++++++++++++++++++++++++++++----------
 fs/ubifs/dir.c                 |  64 ++++++++++-----
 fs/udf/dir.c                   |  28 ++++++-
 fs/ufs/dir.c                   |  28 ++++++-
 include/linux/fs.h             | 106 +++++++++++++++----------
 include/linux/slab.h           |   9 +++
 mm/slab.h                      |   2 +
 mm/slab_common.c               | 138 +++++++++++++++++++++++----------
 mm/slub.c                      |  20 +++--
 net/core/sock.c                |   2 +-
 security/selinux/hooks.c       |   2 +-
 security/smack/smack_lsm.c     |   2 +-
 33 files changed, 744 insertions(+), 271 deletions(-)
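
/* Illustration */

For anyone who wants to double-check the cacheline arithmetic from the
summary, here is a small userspace sketch. It is not kernel code and not
how slab computes its layout. It assumes 64-byte cachelines and models
the out-of-object freelist pointer simply as one pointer-sized word
appended to the object. CACHELINE_SIZE, CACHELINES(), cachelines() and
with_external_freeptr() are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cachelines, as on typical x86-64 systems. */
#define CACHELINE_SIZE 64u

/* Cachelines needed to hold an object of `bytes` bytes (rounds up). */
#define CACHELINES(bytes) (((bytes) + CACHELINE_SIZE - 1) / CACHELINE_SIZE)

/* Sizes quoted in the summary, checked at compile time. */
_Static_assert(CACHELINES(232) == 4, "232 bytes spill into a 4th cacheline");
_Static_assert(CACHELINES(192) == 3, "192 bytes are exactly 3 cachelines");
_Static_assert(184 + 8 == 192, "184 bytes used plus 8 bytes spare");
_Static_assert(232 - 192 == 40, "40 bytes saved before f_version removal");

/* Runtime variant of CACHELINES() for sizes not known at compile time. */
static inline size_t cachelines(size_t bytes)
{
	return CACHELINES(bytes);
}

/*
 * Hypothetical model: footprint of an object when the freelist pointer
 * has to live outside of it, i.e. one extra pointer after the object.
 */
static inline size_t with_external_freeptr(size_t object_size)
{
	return object_size + sizeof(void *);
}
```

Under this model cachelines(with_external_freeptr(192)) is 4, which is
the implicit fourth cacheline mentioned in the summary. Placing the
freelist pointer inside the object at a caller-chosen offset keeps the
footprint at 192 bytes, i.e. 3 cachelines.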