On 6/15/2024 2:31 PM, Mateusz Guzik wrote:
On Fri, Jun 14, 2024 at 12:34:14PM -0400, Yu Ma wrote:
There is available fd in the lower 64 bits of open_fds bitmap for most cases
when we look for an available fd slot. Skip 2-levels searching via
find_next_zero_bit() for this common fast path.
Look directly for an open bit in the lower 64 bits of open_fds bitmap when a
free slot is available there, as:
(1) The fd allocation algorithm would always allocate fd from small to large.
Lower bits in open_fds bitmap would be used much more frequently than higher
bits.
(2) After fdt is expanded (the bitmap size doubled for each time of expansion),
it would never be shrunk. The search size increases but there are few open fds
available here.
(3) There is fast path inside of find_next_zero_bit() when size<=64 to speed up
searching.
With the fast path added in alloc_fd() through one-time bitmap searching,
pts/blogbench-1.1.0 read is improved by 20% and write by 10% on Intel ICX 160
cores configuration with v6.8-rc6.
Reviewed-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
Signed-off-by: Yu Ma <yu.ma@xxxxxxxxx>
---
fs/file.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/file.c b/fs/file.c
index 3b683b9101d8..e8d2f9ef7fd1 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -510,8 +510,13 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags)
if (fd < files->next_fd)
fd = files->next_fd;
- if (fd < fdt->max_fds)
+ if (fd < fdt->max_fds) {
+ if (~fdt->open_fds[0]) {
+ fd = find_next_zero_bit(fdt->open_fds, BITS_PER_LONG, fd);
+ goto success;
+ }
fd = find_next_fd(fdt, fd);
+ }
/*
* N.B. For clone tasks sharing a files structure, this test
@@ -531,7 +536,7 @@ static int alloc_fd(unsigned start, unsigned end, unsigned flags)
*/
if (error)
goto repeat;
-
+success:
if (start <= files->next_fd)
files->next_fd = fd + 1;
As indicated in my other e-mail it may be a process can reach a certain
fd number and then lower its rlimit(NOFILE). In that case the max_fds
field can happen to be higher and the above patch will fail to check for
the (fd < end) case.
Thanks for the good catch, replied in that mail thread for details.
--
2.43.0