On Sat 22-06-24 11:49:02, Yu Ma wrote: > There is available fd in the lower 64 bits of open_fds bitmap for most cases > when we look for an available fd slot. Skip 2-levels searching via > find_next_zero_bit() for this common fast path. > > Look directly for an open bit in the lower 64 bits of open_fds bitmap when a > free slot is available there, as: > (1) The fd allocation algorithm would always allocate fd from small to large. > Lower bits in open_fds bitmap would be used much more frequently than higher > bits. > (2) After fdt is expanded (the bitmap size doubled for each time of expansion), > it would never be shrunk. The search size increases but there are few open fds > available here. > (3) find_next_zero_bit() itself has a fast path inside to speed up searching > when size<=64. > > Besides, "!start" is added to fast path condition to ensure the allocated fd is > greater than start (i.e. >=0), given alloc_fd() is only called in two scenarios: > (1) Allocating a new fd (the most common usage scenario) via > get_unused_fd_flags() to find fd start from bit 0 in fdt (i.e. start==0). > (2) Duplicating a fd (less common usage) via dup_fd() to find a fd start from > old_fd's index in fdt, which is only called by syscall fcntl. > > With the fast path added in alloc_fd(), pts/blogbench-1.1.0 read is improved > by 17% and write by 9% on Intel ICX 160 cores configuration with v6.10-rc4. > > Reviewed-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> > Signed-off-by: Yu Ma <yu.ma@xxxxxxxxx> > --- > fs/file.c | 35 +++++++++++++++++++++-------------- > 1 file changed, 21 insertions(+), 14 deletions(-) > > diff --git a/fs/file.c b/fs/file.c > index a3b72aa64f11..50e900a47107 100644 > --- a/fs/file.c > +++ b/fs/file.c > @@ -515,28 +515,35 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) > if (fd < files->next_fd) > fd = files->next_fd; > > - if (fd < fdt->max_fds) > + error = -EMFILE; > + if (likely(fd < fdt->max_fds)) { > + if (~fdt->open_fds[0] && !start) { > + fd = find_next_zero_bit(fdt->open_fds, BITS_PER_LONG, fd); So I don't think this is quite correct. If files->next_fd is set, we could end up calling find_next_zero_bit() starting from quite high offset causing a regression? Also because we don't expand in this case, we could cause access beyond end of fdtable? Finally, AFAIU this speeds up the lookup for cases where fd < 64 is available at the cost of cases where the first long is full (there we unnecessarily load open_fds[0] into cache). Did you check if the cost is visible (e.g. by making blogbench occupy first 64 fds before starting its load)? Honza > + goto fastreturn; > + } > fd = find_next_fd(fdt, fd); > + } > + > + if (unlikely(fd >= fdt->max_fds)) { > + error = expand_files(files, fd); > + if (error < 0) > + goto out; > + /* > + * If we needed to expand the fs array we > + * might have blocked - try again. > + */ > + if (error) > + goto repeat; > + } > > +fastreturn: > /* > * N.B. For clone tasks sharing a files structure, this test > * will limit the total number of files that can be opened. > */ > - error = -EMFILE; > - if (fd >= end) > + if (unlikely(fd >= end)) > goto out; > > - error = expand_files(files, fd); > - if (error < 0) > - goto out; > - > - /* > - * If we needed to expand the fs array we > - * might have blocked - try again. > - */ > - if (error) > - goto repeat; > - > if (start <= files->next_fd) > files->next_fd = fd + 1; > > -- > 2.43.0 > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR