Hi Bart, Jens and Zhiguo,
On 14/03/24 03:12, Bart Van Assche wrote:
The code "max(1U, 3 * (1U << shift) / 4)" comes from the Kyber I/O
scheduler. The Kyber I/O scheduler maintains one internal queue per hwq
and hence derives its async_depth from the number of hwq tags. Using
this approach for the mq-deadline scheduler is wrong since the
mq-deadline scheduler maintains one internal queue for all hwqs
combined. Hence this revert.
Thanks a lot for helping with this performance regression[1].
Regards,
Harshit
[1] https://lore.kernel.org/all/5ce2ae5d-61e2-4ede-ad55-551112602401@xxxxxxxxxx/
Cc: stable@xxxxxxxxxxxxxxx
Cc: Damien Le Moal <dlemoal@xxxxxxxxxx>
Cc: Harshit Mogalapalli <harshit.m.mogalapalli@xxxxxxxxxx>
Cc: Zhiguo Niu <Zhiguo.Niu@xxxxxxxxxx>
Fixes: d47f9717e5cf ("block/mq-deadline: use correct way to throttling write requests")
Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
---
block/mq-deadline.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index f958e79277b8..02a916ba62ee 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -646,9 +646,8 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
 	struct request_queue *q = hctx->queue;
 	struct deadline_data *dd = q->elevator->elevator_data;
 	struct blk_mq_tags *tags = hctx->sched_tags;
-	unsigned int shift = tags->bitmap_tags.sb.shift;
 
-	dd->async_depth = max(1U, 3 * (1U << shift) / 4);
+	dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
 
 	sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
 }
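
To make the difference between the two formulas concrete, here is a small
user-space sketch (not kernel code). The helper names
async_depth_from_shift() and async_depth_from_nr_requests() are made up for
illustration, and the kernel's max() macro is replaced by a plain
conditional. The inputs used below (shift = 6, nr_requests = 256) are
assumed example values, not numbers taken from this thread.

```c
/* User-space sketch only; helper names are hypothetical. */

/*
 * Formula removed by the patch: 75% of (1U << shift), where shift is the
 * sbitmap word shift, so the result scales with the bits per sbitmap word.
 */
static unsigned int async_depth_from_shift(unsigned int shift)
{
	unsigned int depth = 3 * (1U << shift) / 4;

	return depth > 1U ? depth : 1U;	/* stands in for max(1U, ...) */
}

/*
 * Formula restored by the patch: 75% of the request queue's nr_requests.
 */
static unsigned long async_depth_from_nr_requests(unsigned long nr_requests)
{
	unsigned long depth = 3 * nr_requests / 4;

	return depth > 1UL ? depth : 1UL;	/* stands in for max(1UL, ...) */
}
```

With the assumed inputs, async_depth_from_shift(6) gives 48 while
async_depth_from_nr_requests(256) gives 192, so under these assumptions the
shift-based formula allows far fewer async requests for the same queue.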