On Tue, Jun 25, 2024 at 2:08 PM Jan Kara <jack@xxxxxxx> wrote: > > On Sat 22-06-24 11:49:04, Yu Ma wrote: > > alloc_fd() has a sanity check inside to make sure the struct file mapping to the > > allocated fd is NULL. Remove this sanity check since it can be assured by > > exisitng zero initilization and NULL set when recycling fd. > ^^^ existing ^^^ initialization > > Well, since this is a sanity check, it is expected it never hits. Yet > searching the web shows it has hit a few times in the past :). So would > wrapping this with unlikely() give a similar performance gain while keeping > debugability? If unlikely() does not help, I agree we can remove this since > fd_install() actually has the same check: > > BUG_ON(fdt->fd[fd] != NULL); > > and there we need the cacheline anyway so performance impact is minimal. > Now, this condition in alloc_fd() is nice that it does not take the kernel > down so perhaps we could change the BUG_ON to WARN() dumping similar kind > of info as alloc_fd()? > Christian suggested just removing it. To my understanding the problem is not the branch per se, but the the cacheline bounce of the fd array induced by reading the status. Note the thing also nullifies the pointer, kind of defeating the BUG_ON in fd_install. I'm guessing it's not going to hurt to branch on it after releasing the lock and forego nullifying, more or less: diff --git a/fs/file.c b/fs/file.c index a3b72aa64f11..d22b867db246 100644 --- a/fs/file.c +++ b/fs/file.c @@ -524,11 +524,11 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) */ error = -EMFILE; if (fd >= end) - goto out; + goto out_locked; error = expand_files(files, fd); if (error < 0) - goto out; + goto out_locked; /* * If we needed to expand the fs array we @@ -546,15 +546,15 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) else __clear_close_on_exec(fd, fdt); error = fd; -#if 1 - /* Sanity check */ - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { + spin_unlock(&files->file_lock); + + if (unlikely(rcu_access_pointer(fdt->fd[fd]) != NULL)) { printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); - rcu_assign_pointer(fdt->fd[fd], NULL); } -#endif -out: + return error; + +out_locked: spin_unlock(&files->file_lock); return error; } > Honza > > > Combined with patch 1 and 2 in series, pts/blogbench-1.1.0 read improved by > > 32%, write improved by 17% on Intel ICX 160 cores configuration with v6.10-rc4. > > > > Reviewed-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> > > Signed-off-by: Yu Ma <yu.ma@xxxxxxxxx> > > --- > > fs/file.c | 7 ------- > > 1 file changed, 7 deletions(-) > > > > diff --git a/fs/file.c b/fs/file.c > > index b4d25f6d4c19..1153b0b7ba3d 100644 > > --- a/fs/file.c > > +++ b/fs/file.c > > @@ -555,13 +555,6 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) > > else > > __clear_close_on_exec(fd, fdt); > > error = fd; > > -#if 1 > > - /* Sanity check */ > > - if (rcu_access_pointer(fdt->fd[fd]) != NULL) { > > - printk(KERN_WARNING "alloc_fd: slot %d not NULL!\n", fd); > > - rcu_assign_pointer(fdt->fd[fd], NULL); > > - } > > -#endif > > > > out: > > spin_unlock(&files->file_lock); > > -- > > 2.43.0 > > > -- > Jan Kara <jack@xxxxxxxx> > SUSE Labs, CR -- Mateusz Guzik <mjguzik gmail.com>