On Sun, Nov 26, 2023 at 10:21:59AM +0100, Christian Brauner wrote: > On Sat, Nov 25, 2023 at 09:17:36PM -0800, Linus Torvalds wrote: > > On Sat, 25 Nov 2023 at 21:08, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: > > > > > > On Sat, Nov 25, 2023 at 08:59:54PM -0800, Linus Torvalds wrote: > > > > > > > > because I for some reason (probably looking > > > > at Mateusz' original patch too much) re-implemented file_free() as > > > > fput_immediate().. > > > > > > file_free() was with RCU delay at that time, IIRC. > > > > Ahh, indeed. So it was the SLAB_TYPESAFE_BY_RCU changes that basically > > Yes, special-casing this into file_free() wasn't looking very appealing. Right, if the SLAB_TYPESAFE_BY_RCU work was already there my v1 for dodging task_work would have been much simpler (but would still have fput_badopen). While I support deduping code which comes with this patch I'm not fond of special casing failed opens in fput. A minor remark is that in the spot which ends up calling here on stock kernel it is FMODE_OPENED which is the unlikely case, but with the patch it ends up being handled with a branch marked the other way around. I noted in my commit message failed opens are not some corner-case, they are much common. The thing which irks me on its principle is that special-casing for the case which is guaranteed to not be true for majority of fput users is avoidably rolled into the general routine. For my taste the code below (untested) would be nicer, but I don't think there are solid grounds for picking one approach over another. That is to say if you insist on Al's variant then we are done here. :) diff --git a/fs/file_table.c b/fs/file_table.c index de4a2915bfd4..5e5613d80631 100644 --- a/fs/file_table.c +++ b/fs/file_table.c @@ -75,18 +75,6 @@ static inline void file_free(struct file *f) } } -void release_empty_file(struct file *f) -{ - WARN_ON_ONCE(f->f_mode & (FMODE_BACKING | FMODE_OPENED)); - if (atomic_long_dec_and_test(&f->f_count)) { - security_file_free(f); - put_cred(f->f_cred); - if (likely(!(f->f_mode & FMODE_NOACCOUNT))) - percpu_counter_dec(&nr_files); - kmem_cache_free(filp_cachep, f); - } -} - /* * Return the total number of open files in the system */ @@ -461,6 +449,18 @@ void fput(struct file *file) } } +void fput_badopen(struct file *f) +{ + if (unlikely(f->f_mode & (FMODE_BACKING | FMODE_OPENED))) { + fput(f); + return; + } + + if (likely(atomic_long_dec_and_test(&f->f_count))) { + file_free(f); + } +} + /* * synchronous analog of fput(); for kernel threads that might be needed * in some umount() (and thus can't use flush_delayed_fput() without diff --git a/fs/internal.h b/fs/internal.h index 58e43341aebf..3afe774ff7c6 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -94,7 +94,7 @@ extern void chroot_fs_refs(const struct path *, const struct path *); struct file *alloc_empty_file(int flags, const struct cred *cred); struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred); struct file *alloc_empty_backing_file(int flags, const struct cred *cred); -void release_empty_file(struct file *f); +void fput_badopen(struct file *f); static inline void file_put_write_access(struct file *file) { diff --git a/fs/namei.c b/fs/namei.c index 71c13b2990b4..e42e2c237a4c 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -3785,10 +3785,7 @@ static struct file *path_openat(struct nameidata *nd, WARN_ON(1); error = -EINVAL; } - if (unlikely(file->f_mode & FMODE_OPENED)) - fput(file); - else - release_empty_file(file); + fput_badopen(file); if (error == -EOPENSTALE) { if (flags & LOOKUP_RCU) error = -ECHILD;