On Tue 25-06-24 13:52:57, Jan Kara wrote: > On Sat 22-06-24 11:49:02, Yu Ma wrote: > > There is available fd in the lower 64 bits of open_fds bitmap for most cases > > when we look for an available fd slot. Skip 2-levels searching via > > find_next_zero_bit() for this common fast path. > > > > Look directly for an open bit in the lower 64 bits of open_fds bitmap when a > > free slot is available there, as: > > (1) The fd allocation algorithm would always allocate fd from small to large. > > Lower bits in open_fds bitmap would be used much more frequently than higher > > bits. > > (2) After fdt is expanded (the bitmap size doubled for each time of expansion), > > it would never be shrunk. The search size increases but there are few open fds > > available here. > > (3) find_next_zero_bit() itself has a fast path inside to speed up searching > > when size<=64. > > > > Besides, "!start" is added to fast path condition to ensure the allocated fd is > > greater than start (i.e. >=0), given alloc_fd() is only called in two scenarios: > > (1) Allocating a new fd (the most common usage scenario) via > > get_unused_fd_flags() to find fd start from bit 0 in fdt (i.e. start==0). > > (2) Duplicating a fd (less common usage) via dup_fd() to find a fd start from > > old_fd's index in fdt, which is only called by syscall fcntl. > > > > With the fast path added in alloc_fd(), pts/blogbench-1.1.0 read is improved > > by 17% and write by 9% on Intel ICX 160 cores configuration with v6.10-rc4. > > > > Reviewed-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> > > Signed-off-by: Yu Ma <yu.ma@xxxxxxxxx> > > --- > > fs/file.c | 35 +++++++++++++++++++++-------------- > > 1 file changed, 21 insertions(+), 14 deletions(-) > > > > diff --git a/fs/file.c b/fs/file.c > > index a3b72aa64f11..50e900a47107 100644 > > --- a/fs/file.c > > +++ b/fs/file.c > > @@ -515,28 +515,35 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags) > > if (fd < files->next_fd) > > fd = files->next_fd; > > > > - if (fd < fdt->max_fds) > > + error = -EMFILE; > > + if (likely(fd < fdt->max_fds)) { > > + if (~fdt->open_fds[0] && !start) { > > + fd = find_next_zero_bit(fdt->open_fds, BITS_PER_LONG, fd); > > So I don't think this is quite correct. If files->next_fd is set, we could > end up calling find_next_zero_bit() starting from quite high offset causing > a regression? Also because we don't expand in this case, we could cause access > beyond end of fdtable? OK, I've misunderstood the next_fd logic. next_fd is the lowest 0-bit in the open_fds bitmap so if next_fd is big, the ~fdt->open_fds[0] should be false. As such the above condition could be rewritten as: if (!start && files->next_fd < BITS_PER_LONG) to avoid loading the first bitmap long if we know it is full? Or we could maybe go as far as: if (!start && fd < BITS_PER_LONG && !test_bit(fd, fdt->open_fds)) goto fastreturn; because AFAIU this should work in exactly the same cases as your code? Honza > > + goto fastreturn; > > + } > > fd = find_next_fd(fdt, fd); > > + } > > + > > + if (unlikely(fd >= fdt->max_fds)) { > > + error = expand_files(files, fd); > > + if (error < 0) > > + goto out; > > + /* > > + * If we needed to expand the fs array we > > + * might have blocked - try again. > > + */ > > + if (error) > > + goto repeat; > > + } > > > > +fastreturn: > > /* > > * N.B. For clone tasks sharing a files structure, this test > > * will limit the total number of files that can be opened. > > */ > > - error = -EMFILE; > > - if (fd >= end) > > + if (unlikely(fd >= end)) > > goto out; > > > > - error = expand_files(files, fd); > > - if (error < 0) > > - goto out; > > - > > - /* > > - * If we needed to expand the fs array we > > - * might have blocked - try again. > > - */ > > - if (error) > > - goto repeat; > > - > > if (start <= files->next_fd) > > files->next_fd = fd + 1; > > > > -- > > 2.43.0 > > > -- > Jan Kara <jack@xxxxxxxx> > SUSE Labs, CR > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR