Re: [PATCH 3/3] fs/file.c: move sanity_check from alloc_fd() to put_unused_fd()

On Sat, 2024-06-15 at 07:07 +0200, Mateusz Guzik wrote:
> On Sat, Jun 15, 2024 at 06:41:45AM +0200, Mateusz Guzik wrote:
> > On Fri, Jun 14, 2024 at 12:34:16PM -0400, Yu Ma wrote:
> > > alloc_fd() has a sanity check inside to make sure the FILE object mapping to the
> > 
> > Total nitpick: FILE is the libc thing, I would refer to it as 'struct
> > file'. See below for the actual point.
> > 
> > > Combined with patch 1 and 2 in series, pts/blogbench-1.1.0 read improved by
> > > 32%, write improved by 15% on Intel ICX 160 cores configuration with v6.8-rc6.
> > > 
> > > Reviewed-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> > > Signed-off-by: Yu Ma <yu.ma@xxxxxxxxx>
> > > ---
> > >  fs/file.c | 14 ++++++--------
> > >  1 file changed, 6 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/fs/file.c b/fs/file.c
> > > index a0e94a178c0b..59d62909e2e3 100644
> > > --- a/fs/file.c
> > > +++ b/fs/file.c
> > > @@ -548,13 +548,6 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags)
> > >  	else
> > >  		__clear_close_on_exec(fd, fdt);
> > >  	error = fd;
> > > -#if 1
> > > -	/* Sanity check */
> > > -	if (rcu_access_pointer(fdt->fd[fd]) != NULL) {
> > > -		printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd);
> > > -		rcu_assign_pointer(fdt->fd[fd], NULL);
> > > -	}
> > > -#endif
> > >  
> > 
> > I was going to ask when was the last time anyone saw this fire and
> > suggest getting rid of it if enough time(tm) had passed. Turns out it
> > does show up sometimes; the latest result I found is 2017 vintage:
> > https://groups.google.com/g/syzkaller-bugs/c/jfQ7upCDf9s/m/RQjhDrZ7AQAJ
> > 
> > So you are moving this to another locked area, but one which does not
> > execute in the benchmark?
> > 
> > Patch 2/3 states 28% read and 14% write increase, this commit message
> > claims it goes up to 32% and 15% respectively -- pretty big. I presume
> > this has to do with bouncing a line containing the fd.
> > 
> > I would argue moving this check elsewhere is about as good as removing
> > it altogether, but that's for the vfs overlords to decide.
> > 
> > All that aside, looking at disasm of alloc_fd it is pretty clear there
> > is time to save, for example:
> > 
> > 	if (unlikely(nr >= fdt->max_fds)) {
> > 		if (fd >= end) {
> > 			error = -EMFILE;
> > 			goto out;
> > 		}
> > 		error = expand_files(fd, fd);
> > 		if (error < 0)
> > 			goto out;
> > 		if (error)
> > 			goto repeat;
> > 	}
> > 
> 
> Now that I wrote it I noticed the fd < end check has to be performed
> regardless of max_fds -- someone could have changed rlimit to a lower
> value after using a higher fd. But the main point stands: the call to
> expand_files and associated error handling don't have to be there.
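
For reference, the ordering described above can be modeled in userspace
roughly as follows. This is only a sketch under assumptions, not the
kernel code: struct fdtable_model, model_expand() and model_check_fd()
are hypothetical names, and the "grow" step just bumps a counter and a
capacity field instead of reallocating anything. It shows the limit check
running unconditionally while the grow call stays in the unlikely branch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace stand-in for the fd table; not the kernel struct. */
struct fdtable_model {
	unsigned int max_fds;    /* current capacity of the table */
	unsigned int grow_calls; /* how many times the slow path ran */
};

/* Models the slow path: make the table large enough to hold fd. */
static int model_expand(struct fdtable_model *fdt, unsigned int fd)
{
	fdt->grow_calls++;
	fdt->max_fds = fd + 1;
	return 0;
}

/*
 * The limit check (fd < end) runs on every call, because the limit may
 * have been lowered after a higher fd was used. Only the capacity check
 * is expected to be false in the common case.
 */
static int model_check_fd(struct fdtable_model *fdt, unsigned int fd,
			  unsigned int end)
{
	if (fd >= end)
		return -EMFILE;

	if (__builtin_expect(fd >= fdt->max_fds, 0)) {
		if (model_expand(fdt, fd) < 0)
			return -ENOMEM;
		assert(fd < fdt->max_fds);
	}
	return (int)fd;
}
```

With a table of capacity 4 and a limit of 8, asking for fd 3 touches
neither branch body, fd 9 fails with -EMFILE without growing anything,
and fd 5 takes the slow path exactly once.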

To really guard against someone changing the rlimit underneath us, we
should probably take task_lock so that do_prlimit() cannot race with this
function:

task_lock(current->group_leader);
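
To illustrate the idea outside the kernel, here is a minimal userspace
model. Everything in it is an assumption made for illustration: struct
task_model and the model_*() helpers are hypothetical, and a C11
atomic_flag spinlock stands in for task_lock. The only point it makes is
that the limit is read under the same lock the limit setter takes, so
the allocator never acts on a half-updated or stale-in-flight value.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <string.h>

#define MODEL_MAX_FDS 64

/* Hypothetical stand-in for the task; the flag models task_lock(). */
struct task_model {
	atomic_flag lock;
	unsigned int nofile;               /* models RLIMIT_NOFILE */
	unsigned char used[MODEL_MAX_FDS]; /* 1 if the slot is taken */
};

static void model_lock(struct task_model *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		; /* spin until the holder releases the lock */
}

static void model_unlock(struct task_model *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

static void model_init(struct task_model *t, unsigned int limit)
{
	memset(t->used, 0, sizeof(t->used));
	t->nofile = limit;
	atomic_flag_clear(&t->lock);
}

/* Models the limit setter: the limit only changes under the lock. */
static void model_set_limit(struct task_model *t, unsigned int limit)
{
	model_lock(t);
	t->nofile = limit;
	model_unlock(t);
}

/* Models the allocator: read the limit and pick a slot under the lock. */
static int model_alloc_fd(struct task_model *t)
{
	unsigned int fd, end;
	int ret = -EMFILE;

	model_lock(t);
	end = t->nofile < MODEL_MAX_FDS ? t->nofile : MODEL_MAX_FDS;
	assert(end <= MODEL_MAX_FDS);
	for (fd = 0; fd < end; fd++) {
		if (!t->used[fd]) {
			t->used[fd] = 1;
			ret = (int)fd;
			break;
		}
	}
	model_unlock(t);
	return ret;
}
```

Whether the extra lock is acceptable on this hot path in the real code is
a separate question that the model does not answer.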

Tim

> 
> > This elides 2 branches and a func call in the common case. Completely
> > untested, maybe has some brainfarts, feel free to take without credit
> > and further massage the routine.
> > 
> > Moreover my disasm shows that even looking for a bit results in
> > a func call(!) to _find_next_zero_bit -- someone(tm) should probably
> > massage it into another inline.
> > 
> > After the above massaging is done and if it turns out the check has to
> > stay, you can plausibly damage-control it with prefetch -- issue it
> > immediately after finding the fd number, before any other work.
> > 
> > All that said, by the above I'm confident there is still *some*
> > performance left on the table despite the lock.
> > 
> > >  out:
> > >  	spin_unlock(&files->file_lock);
> > > @@ -572,7 +565,7 @@ int get_unused_fd_flags(unsigned flags)
> > >  }
> > >  EXPORT_SYMBOL(get_unused_fd_flags);
> > >  
> > > -static void __put_unused_fd(struct files_struct *files, unsigned int fd)
> > > +static inline void __put_unused_fd(struct files_struct *files, unsigned int fd)
> > >  {
> > >  	struct fdtable *fdt = files_fdtable(files);
> > >  	__clear_open_fd(fd, fdt);
> > > @@ -583,7 +576,12 @@ static void __put_unused_fd(struct files_struct *files, unsigned int fd)
> > >  void put_unused_fd(unsigned int fd)
> > >  {
> > >  	struct files_struct *files = current->files;
> > > +	struct fdtable *fdt = files_fdtable(files);
> > >  	spin_lock(&files->file_lock);
> > > +	if (unlikely(rcu_access_pointer(fdt->fd[fd]))) {
> > > +		printk(KERN_WARNING "put_unused_fd: slot %d not NULL!\n", fd);
> > > +		rcu_assign_pointer(fdt->fd[fd], NULL);
> > > +	}
> > >  	__put_unused_fd(files, fd);
> > >  	spin_unlock(&files->file_lock);
> > >  }
> 