During some IO tests, I found that waitqueues can be extremely
unbalanced, especially when tags are few.

For example, with nr_requests set to 64 and queue_depth set to 32:

test cmd:

[global]
filename=/dev/sdh
ioengine=libaio
direct=1
allow_mounted_write=0
group_reporting

[test]
rw=randwrite
bs=4k
numjobs=512
iodepth=2

With patch 1 applied, I observe the following status:

ws_active=484
ws={
	{.wait_cnt=8, .waiters_cnt=117},
	{.wait_cnt=8, .waiters_cnt=59},
	{.wait_cnt=8, .waiters_cnt=76},
	{.wait_cnt=8, .waiters_cnt=0},
	{.wait_cnt=5, .waiters_cnt=24},
	{.wait_cnt=8, .waiters_cnt=12},
	{.wait_cnt=8, .waiters_cnt=21},
	{.wait_cnt=8, .waiters_cnt=175},
}

'waiters_cnt' is the number of threads waiting for tags on each 'ws',
and such an extremely unbalanced status occurs very frequently.

After reading the sbitmap code, I found two situations that might cause
the problem:

1) blk_mq_get_tag() calls 'bt_wait_ptr()' before trying to get a tag,
   so a thread can advance the waitqueue pointer even if it then gets a
   tag successfully without ever going to sleep. - patch 2

2) After a 'ws' is woken up, a following blk_mq_put_tag() might wake up
   the same 'ws' again instead of the next one. - patch 3

I'm not sure whether the unbalanced status is really a *problem* that
needs to be fixed; this patchset just improves fairness and is not a
thorough fix.

Any comments and suggestions are welcome.

Yu Kuai (3):
  sbitmap: record the number of waiters for each waitqueue
  blk-mq: call 'bt_wait_ptr()' later in blk_mq_get_tag()
  sbitmap: improve the fairness of waitqueues' wake up

 block/blk-mq-tag.c      |  6 ++---
 include/linux/sbitmap.h |  5 ++++
 lib/sbitmap.c           | 57 ++++++++++++++++++++++-------------------
 3 files changed, 39 insertions(+), 29 deletions(-)

-- 
2.31.1