On Wed, Jul 3, 2024 at 4:07 PM Yu Ma <yu.ma@xxxxxxxxx> wrote:
>
> There is available fd in the lower 64 bits of open_fds bitmap for most cases
> when we look for an available fd slot. Skip 2-levels searching via
> find_next_zero_bit() for this common fast path.
>
> Look directly for an open bit in the lower 64 bits of open_fds bitmap when a
> free slot is available there, as:
> (1) The fd allocation algorithm would always allocate fd from small to large.
> Lower bits in open_fds bitmap would be used much more frequently than higher
> bits.
> (2) After fdt is expanded (the bitmap size doubled for each time of expansion),
> it would never be shrunk. The search size increases but there are few open fds
> available here.
> (3) There is fast path inside of find_next_zero_bit() when size<=64 to speed up
> searching.
>
> As suggested by Mateusz Guzik <mjguzik gmail.com> and Jan Kara <jack@xxxxxxx>,
> update the fast path from alloc_fd() to find_next_fd(). With which, on top of
> patch 1 and 2, pts/blogbench-1.1.0 read is improved by 13% and write by 7% on
> Intel ICX 160 cores configuration with v6.10-rc6.
>
> Reviewed-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> Signed-off-by: Yu Ma <yu.ma@xxxxxxxxx>
> ---
>  fs/file.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/fs/file.c b/fs/file.c
> index a15317db3119..f25eca311f51 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -488,6 +488,11 @@ struct files_struct init_files = {
>
>  static unsigned int find_next_fd(struct fdtable *fdt, unsigned int start)
>  {
> +	unsigned int bit;
> +	bit = find_next_zero_bit(fdt->open_fds, BITS_PER_LONG, start);
> +	if (bit < BITS_PER_LONG)
> +		return bit;
> +
>  	unsigned int maxfd = fdt->max_fds; /* always multiple of BITS_PER_LONG */
>  	unsigned int maxbit = maxfd / BITS_PER_LONG;
>  	unsigned int bitbit = start / BITS_PER_LONG;
> --
> 2.43.0
>

I had something like this in mind:

diff --git a/fs/file.c b/fs/file.c
index a3b72aa64f11..4d3307e39db7 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -489,6 +489,16 @@ static unsigned int find_next_fd(struct fdtable *fdt, unsigned int start)
 	unsigned int maxfd = fdt->max_fds; /* always multiple of BITS_PER_LONG */
 	unsigned int maxbit = maxfd / BITS_PER_LONG;
 	unsigned int bitbit = start / BITS_PER_LONG;
+	unsigned int bit;
+
+	/*
+	 * Try to avoid looking at the second level map.
+	 */
+	bit = find_next_zero_bit(&fdt->open_fds[bitbit], BITS_PER_LONG,
+				 start & (BITS_PER_LONG - 1));
+	if (bit < BITS_PER_LONG) {
+		return bit + bitbit * BITS_PER_LONG;
+	}

 	bitbit = find_next_zero_bit(fdt->full_fds_bits, maxbit, bitbit) * BITS_PER_LONG;
 	if (bitbit >= maxfd)

can you please test it out. I expect it to provide a tiny improvement over your patch.

--
Mateusz Guzik <mjguzik gmail.com>
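
For anyone who wants to poke at the idea outside the kernel, here is a rough userspace sketch of the lookup with the fast path from the second diff. It is only a model: struct fd_model, next_zero_bit(), model_find_next_fd() and WORD_BITS are made-up stand-ins for struct fdtable, find_next_zero_bit(), find_next_fd() and BITS_PER_LONG, and the part after the fast path is my reading of the existing two-level search, which the hunk above only shows in part. The assert() encodes the assumption that callers pass start < max_fds.

```c
#include <assert.h>
#include <limits.h>

#define WORD_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical userspace stand-in for the kernel's struct fdtable. */
struct fd_model {
	unsigned int max_fds;		/* assumed to be a multiple of WORD_BITS */
	unsigned long *open_fds;	/* one bit per fd, set when the fd is in use */
	unsigned long *full_fds_bits;	/* one bit per open_fds word, set when that word is full */
};

/* Naive stand-in for find_next_zero_bit(): first clear bit in [start, size), else size. */
static unsigned int next_zero_bit(const unsigned long *map, unsigned int size,
				  unsigned int start)
{
	for (unsigned int i = start; i < size; i++) {
		if (!(map[i / WORD_BITS] & (1UL << (i % WORD_BITS))))
			return i;
	}
	return size;
}

static unsigned int model_find_next_fd(const struct fd_model *t, unsigned int start)
{
	unsigned int maxfd = t->max_fds;
	unsigned int maxbit = maxfd / WORD_BITS;
	unsigned int bitbit = start / WORD_BITS;
	unsigned int bit;

	assert(start < maxfd);

	/* Fast path: search only the word that contains 'start'. */
	bit = next_zero_bit(&t->open_fds[bitbit], WORD_BITS, start % WORD_BITS);
	if (bit < WORD_BITS)
		return bit + bitbit * WORD_BITS;

	/* Slow path: use the second-level map to skip words that are full. */
	bitbit = next_zero_bit(t->full_fds_bits, maxbit, bitbit) * WORD_BITS;
	if (bitbit >= maxfd)
		return maxfd;
	if (bitbit > start)
		start = bitbit;
	return next_zero_bit(t->open_fds, maxfd, start);
}
```

Unlike the first patch, the fast path here indexes the word that holds start rather than always word 0, so it also helps when the search begins above the first BITS_PER_LONG descriptors.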