Re: [PATCH v3 3/3] fs/file.c: add fast path in find_next_fd()

On 7/5/2024 5:55 AM, Jan Kara wrote:
On Thu 04-07-24 19:44:10, Mateusz Guzik wrote:
On Wed, Jul 3, 2024 at 4:07 PM Yu Ma <yu.ma@xxxxxxxxx> wrote:
In most cases there is a free fd in the lower 64 bits of the open_fds bitmap
when we look for an available fd slot. Skip the 2-level search via
find_next_zero_bit() for this common fast path.

Look directly for a free bit in the lower 64 bits of the open_fds bitmap when a
free slot is available there, because:
(1) The fd allocation algorithm always allocates fds from small to large, so
lower bits in the open_fds bitmap are used much more frequently than higher
bits.
(2) After the fdt is expanded (the bitmap size doubles on each expansion), it
is never shrunk. The search size increases, but there are few open fds
available there.
(3) find_next_zero_bit() has a fast path for size<=64 that speeds up the
search.

As suggested by Mateusz Guzik <mjguzik gmail.com> and Jan Kara <jack@xxxxxxx>,
move the fast path from alloc_fd() to find_next_fd(). With this change, on top
of patches 1 and 2, pts/blogbench-1.1.0 read improves by 13% and write by 7% on
an Intel ICX 160-core configuration with v6.10-rc6.

Reviewed-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
Signed-off-by: Yu Ma <yu.ma@xxxxxxxxx>
---
  fs/file.c | 5 +++++
  1 file changed, 5 insertions(+)

diff --git a/fs/file.c b/fs/file.c
index a15317db3119..f25eca311f51 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -488,6 +488,11 @@ struct files_struct init_files = {

  static unsigned int find_next_fd(struct fdtable *fdt, unsigned int start)
  {
+       unsigned int bit;
+       bit = find_next_zero_bit(fdt->open_fds, BITS_PER_LONG, start);
+       if (bit < BITS_PER_LONG)
+               return bit;
+
         unsigned int maxfd = fdt->max_fds; /* always multiple of BITS_PER_LONG */
         unsigned int maxbit = maxfd / BITS_PER_LONG;
         unsigned int bitbit = start / BITS_PER_LONG;
--
2.43.0

I had something like this in mind:
diff --git a/fs/file.c b/fs/file.c
index a3b72aa64f11..4d3307e39db7 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -489,6 +489,16 @@ static unsigned int find_next_fd(struct fdtable *fdt, unsigned int start)
         unsigned int maxfd = fdt->max_fds; /* always multiple of BITS_PER_LONG */
         unsigned int maxbit = maxfd / BITS_PER_LONG;
         unsigned int bitbit = start / BITS_PER_LONG;
+       unsigned int bit;
+
+       /*
+        * Try to avoid looking at the second level map.
+        */
+       bit = find_next_zero_bit(&fdt->open_fds[bitbit], BITS_PER_LONG,
+                               start & (BITS_PER_LONG - 1));
+       if (bit < BITS_PER_LONG) {
+               return bit + bitbit * BITS_PER_LONG;
+       }
Drat, you're right. I missed that Ma did not add the proper offset to
open_fds. *This* is what I meant :)

								Honza
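
To make the difference between the two fast paths above concrete, here is a
minimal userspace sketch, not kernel code. It assumes a plain array of
unsigned long as the bitmap; word_next_zero_bit(), fast_path_first_word(),
fast_path_start_word() and the FAST_PATH_MISS sentinel are hypothetical names
introduced only for illustration. The first variant only ever inspects word 0,
while the second inspects the word that contains 'start' and adds the word
offset back to the result.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define FAST_PATH_MISS UINT_MAX	/* hypothetical "fall back to slow path" marker */

/*
 * Userspace stand-in for a single-word find_next_zero_bit(): index of the
 * first zero bit at or after 'start', or BITS_PER_LONG if there is none
 * (including when start >= BITS_PER_LONG).
 */
static unsigned int word_next_zero_bit(unsigned long word, unsigned int start)
{
	for (unsigned int bit = start; bit < BITS_PER_LONG; bit++)
		if (!(word & (1UL << bit)))
			return bit;
	return BITS_PER_LONG;
}

/* Variant in the spirit of the v3 patch: search word 0 only. */
static unsigned int fast_path_first_word(const unsigned long *open_fds,
					 unsigned int start)
{
	unsigned int bit = word_next_zero_bit(open_fds[0], start);

	return bit < BITS_PER_LONG ? bit : FAST_PATH_MISS;
}

/* Variant in the spirit of the second diff: search the word holding 'start'. */
static unsigned int fast_path_start_word(const unsigned long *open_fds,
					 unsigned int nwords,
					 unsigned int start)
{
	unsigned int bitbit = start / BITS_PER_LONG;
	unsigned int bit;

	assert(bitbit < nwords);	/* sketch assumes 'start' lies inside the bitmap */
	bit = word_next_zero_bit(open_fds[bitbit], start & (BITS_PER_LONG - 1));

	return bit < BITS_PER_LONG ? bit + bitbit * BITS_PER_LONG : FAST_PATH_MISS;
}
```

With word 0 completely full and a free bit in word 1, the first variant always
misses, whereas the second one hits as soon as 'start' points into word 1.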

I just tried this on v6.10-rc6: the improvement on top of patch 1 and patch 2 is 7% for read and 3% for write, which is less than with the version that only checks the first word.

As I understand it, this version would perform better only if there is a high probability of finding a free bit in the same word as next_fd, but next_fd only represents the lowest possible free bit. If fds are opened and closed frequently and randomly, that will not always be the case, and next_fd may be distributed randomly. For example, suppose fds 0-65 are occupied and fd 3 is released: next_fd is set to 3. The next time, when 3 is allocated, next_fd is set to 4, while the actual first free bit is 66. When 66 is allocated and fd 5 is then released, the same process is gone through again.

Yu
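
The scenario above can be replayed with a small userspace model. This is only
a sketch under simplifying assumptions: a fixed four-word bitmap, allocation
always starting from next_fd, and a release that lowers next_fd when the freed
fd is smaller. struct fd_model, model_alloc() and model_release() are
hypothetical names, not kernel interfaces. model_alloc() reports whether the
fd it found lies in the same word as next_fd, which is the condition under
which a word-local fast path would have succeeded.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define NWORDS 4u				/* hypothetical table size, in words */
#define MAX_FDS (NWORDS * BITS_PER_LONG)

struct fd_model {
	unsigned long open_fds[NWORDS];	/* bit set means fd is in use */
	unsigned int next_fd;		/* lowest fd that may be free */
};

static bool model_test(const struct fd_model *m, unsigned int fd)
{
	return m->open_fds[fd / BITS_PER_LONG] & (1UL << (fd % BITS_PER_LONG));
}

/*
 * Allocate the lowest free fd at or after next_fd. '*same_word' is set to
 * true when that fd is in the same bitmap word as next_fd.
 * Returns UINT_MAX if the table is full.
 */
static unsigned int model_alloc(struct fd_model *m, bool *same_word)
{
	unsigned int start = m->next_fd;

	*same_word = false;
	for (unsigned int fd = start; fd < MAX_FDS; fd++) {
		if (model_test(m, fd))
			continue;
		m->open_fds[fd / BITS_PER_LONG] |= 1UL << (fd % BITS_PER_LONG);
		m->next_fd = fd + 1;
		*same_word = (fd / BITS_PER_LONG) == (start / BITS_PER_LONG);
		return fd;
	}
	return UINT_MAX;
}

/* Release an fd and pull next_fd down if the freed slot is lower. */
static void model_release(struct fd_model *m, unsigned int fd)
{
	assert(fd < MAX_FDS);
	m->open_fds[fd / BITS_PER_LONG] &= ~(1UL << (fd % BITS_PER_LONG));
	if (fd < m->next_fd)
		m->next_fd = fd;
}
```

Filling fds 0-65, releasing 3 and allocating twice shows the pattern: the
first allocation finds fd 3 in the word of next_fd, and the second one starts
at next_fd 4 but has to go to another word to find fd 66.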




