On 20/07/2020 08:23, Kashyap Desai wrote:
Hi Kashyap,
> John - I did more testing on the v8 wip branch. CPU hotplug is working
> as expected,
Good to hear.
> but I still see a performance issue on logical volumes. I created an
> 8-drive RAID-0 VD on the MR controller, and below is the performance
> impact of this RFC. It looks like the contention is on a single <sdev>.
>
> I used this command: "numactl -N 1 fio 1vd.fio --iodepth=128 --bs=4k
> --rw=randread --cpus_allowed_policy=split --ioscheduler=none
> --group_reporting --runtime=200 --numjobs=1"
>
> IOPS without RFC = 300K; IOPS with RFC = 230K.
> Perf top (shared host tags, IOPS = 230K):
>
>     13.98%  [kernel]  [k] sbitmap_any_bit_set
I guess that this comes from the blk_mq_hctx_has_pending() ->
sbitmap_any_bit_set(&hctx->ctx_map) call. The
list_empty_careful(&hctx->dispatch) and blk_mq_sched_has_work(hctx)
[when scheduler=none] calls look pretty lightweight.
>     6.43%   [kernel]  [k] blk_mq_run_hw_queue
>
> The blk_mq_run_hw_queue function takes more CPU; it is called from
> scsi_end_request().
> It looks like blk_mq_hctx_has_pending() only matters for the elevator
> (scheduler) case. If the queue has ioscheduler=none, we can skip it:
> with scheduler=none, IO is pushed straight to the hardware queue and
> bypasses the software queue.
> Based on the above understanding, I added the patch below, and I can
> see performance scale back to expectations.
>
> Ming mentioned that we cannot remove blk_mq_run_hw_queues() from the
> IO completion path, otherwise we may see IO hangs. So I have just
> modified the completion path, assuming it is only required for the IO
> scheduler case.
>
> https://www.spinics.net/lists/linux-block/msg55049.html
>
> Please review and let me know if this is good, or whether we have to
> address it with a proper fix.
So what you're doing looks reasonable, but I would be concerned about
missing the blk_mq_run_hw_queue() -> blk_mq_hctx_has_pending() ->
list_empty_careful(&hctx->dispatch) check - if it's not really needed
there, then why not remove it altogether?
Hi Ming, any opinion on this?
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 1be7ac5a4040..b6a5b41b7fc2 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1559,6 +1559,9 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async)
>  	struct blk_mq_hw_ctx *hctx;
>  	int i;
>  
> +	if (!q->elevator)
> +		return;
> +
>  	queue_for_each_hw_ctx(q, hctx, i) {
>  		if (blk_mq_hctx_stopped(hctx))
>  			continue;
Thanks,
John