On Tue, Jun 25, 2024 at 3:09 PM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote: > > On Tue, Jun 25, 2024 at 2:08 PM Jan Kara <jack@xxxxxxx> wrote: > > > > On Sat 22-06-24 11:49:04, Yu Ma wrote: > > > alloc_fd() has a sanity check inside to make sure the struct file mapping to the > > > allocated fd is NULL. Remove this sanity check since it can be assured by > > > exisitng zero initilization and NULL set when recycling fd. > > ^^^ existing ^^^ initialization > > > > Well, since this is a sanity check, it is expected it never hits. Yet > > searching the web shows it has hit a few times in the past :). So would > > wrapping this with unlikely() give a similar performance gain while keeping > > debugability? If unlikely() does not help, I agree we can remove this since > > fd_install() actually has the same check: > > > > BUG_ON(fdt->fd[fd] != NULL); > > > > and there we need the cacheline anyway so performance impact is minimal. > > Now, this condition in alloc_fd() is nice that it does not take the kernel > > down so perhaps we could change the BUG_ON to WARN() dumping similar kind > > of info as alloc_fd()? > > > > Christian suggested just removing it. > > To my understanding the problem is not the branch per se, but the the > cacheline bounce of the fd array induced by reading the status. > > Note the thing also nullifies the pointer, kind of defeating the > BUG_ON in fd_install. > > I'm guessing it's not going to hurt to branch on it after releasing > the lock and forego nullifying, more or less: > diff --git a/fs/file.c b/fs/file.c > index a3b72aa64f11..d22b867db246 100644 > --- a/fs/file.c > +++ b/fs/file.c > @@ -524,11 +524,11 @@ static int alloc_fd(unsigned start, unsigned > end, unsigned flags) > */ > error = -EMFILE; > if (fd >= end) > - goto out; > + goto out_locked; > > error = expand_files(files, fd); > if (error < 0) > - goto out; > + goto out_locked; > > /* > * If we needed to expand the fs array we > @@ -546,15 +546,15 @@ static int alloc_fd(unsigned start, unsigned > end, unsigned flags) > else > __clear_close_on_exec(fd, fdt); > error = fd; > -#if 1 > - /* Sanity check */ > - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { > + spin_unlock(&files->file_lock); > + > + if (unlikely(rcu_access_pointer(fdt->fd[fd]) != NULL)) { > printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); > - rcu_assign_pointer(fdt->fd[fd], NULL); > } > -#endif Now that I sent it it is of course not safe to deref without protection from either rcu or the lock, so this would have to be wrapped with rcu_read_lock, which makes it even less appealing. Whacking the thing as in the submitted patch seems like the best way forward here. :) > > -out: > + return error; > + > +out_locked: > spin_unlock(&files->file_lock); > return error; > } > > > > > Honza > > > > > Combined with patch 1 and 2 in series, pts/blogbench-1.1.0 read improved by > > > 32%, write improved by 17% on Intel ICX 160 cores configuration with v6.10-rc4. > > > > > > Reviewed-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> > > > Signed-off-by: Yu Ma <yu.ma@xxxxxxxxx> > > > --- > > > fs/file.c | 7 ------- > > > 1 file changed, 7 deletions(-) > > > > > > diff --git a/fs/file.c b/fs/file.c > > > index b4d25f6d4c19..1153b0b7ba3d 100644 > > > --- a/fs/file.c > > > +++ b/fs/file.c > > > @@ -555,13 +555,6 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) > > > else > > > __clear_close_on_exec(fd, fdt); > > > error = fd; > > > -#if 1 > > > - /* Sanity check */ > > > - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { > > > - printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); > > > - rcu_assign_pointer(fdt->fd[fd], NULL); > > > - } > > > -#endif > > > > > > out: > > > spin_unlock(&files->file_lock); > > > -- > > > 2.43.0 > > > > > -- > > Jan Kara <jack@xxxxxxxx> > > SUSE Labs, CR > > > > -- > Mateusz Guzik <mjguzik gmail.com> -- Mateusz Guzik <mjguzik gmail.com>