Re: [PATCH V4 0/5] blk-mq: improvement on handling IO during CPU hotplug

John Garry <john.garry@xxxxxxxxxx> · Fri, 25 Oct 2019 17:33:35 +0100

There might be two reasons:

1) You are still testing a multiple reply-queue device?

As before, I am testing by exposing multiple queues to the SCSI 
midlayer. I had to make this change locally, as on mainline we still 
only expose a single queue and use the internal reply queue when 
enabling managed interrupts.

As I mentioned last time, it is hard to map reply-queue into blk-mq
hctx correctly.

Here's my branch, if you want to check:

https://github.com/hisilicon/kernel-dev/commits/private-topic-sas-5.4-mq-v4

It's a bit messy (sorry), but you can see that the reply-queue in the 
LLDD is removed in commit 087b95af374.

I am now thinking of actually making this change to the LLDD in mainline 
to avoid any doubt in future.


2) concurrent dispatch to device, which can be observed by the
following patch.

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 06081966549f..3590f6f947eb 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -679,6 +679,8 @@ void blk_mq_start_request(struct request *rq)
 {
        struct request_queue *q = rq->q;

+       WARN_ON_ONCE(test_bit(BLK_MQ_S_INTERNAL_STOPPED, &rq->mq_hctx->state));
+
        trace_block_rq_issue(q, rq);

        if (test_bit(QUEUE_FLAG_STATS, &q->queue_flags)) {

However, I think it is unlikely to be 2), since the current CPU is the last
CPU in hctx->cpu_mask.


I'll try it.


Hi Ming,

I am looking at this issue again.

I am using 
https://lore.kernel.org/linux-scsi/1571926881-75524-1-git-send-email-john.garry@xxxxxxxxxx/T/#t 
with expose_mq_experimental set. I guess you're going to say that this 
series is wrong, but I think it's ok for this purpose.

Forgetting that for a moment, maybe I have found an issue.

For the SCSI commands which time out, I notice that 
scsi_set_blocked(reason=SCSI_MLQUEUE_EH_RETRY) was called 30 seconds 
earlier.

 scsi_set_blocked+0x20/0xb8
 __scsi_queue_insert+0x40/0x90
 scsi_softirq_done+0x164/0x1c8
 __blk_mq_complete_request_remote+0x18/0x20
 flush_smp_call_function_queue+0xa8/0x150
 generic_smp_call_function_single_interrupt+0x10/0x18
 handle_IPI+0xec/0x1a8
 arch_cpu_idle+0x10/0x18
 do_idle+0x1d0/0x2b0
 cpu_startup_entry+0x24/0x40
 secondary_start_kernel+0x1b4/0x208

I also notice that the __scsi_queue_insert() call, above, seems to try 
to requeue the request on a dead request queue, via 
__scsi_queue_insert()->blk_mq_requeue_request()->__blk_mq_requeue_request() 
(marked *** below):

[ 1185.235243] psci: CPU1 killed.
[ 1185.238610] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace24f60 (id=19)
[ 1185.246530] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace23f80 (id=17)
[ 1185.254443] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace22fa0 (id=15)
[ 1185.262356] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace21fc0 (id=13)***
[ 1185.270271] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace20fe0 (id=11)
[ 1185.939451] scsi_softirq_done NEEDS_RETRY rq=0xffff0023b7416000
[ 1185.945359] scsi_set_blocked reason=0x1057
[ 1185.949444] __blk_mq_requeue_request request_queue=0xffff0023ace21fc0 id=13 rq=0xffff0023b7416000***

[...]

[ 1214.903455] scsi_timeout req=0xffff0023add29000 reserved=0
[ 1214.908946] scsi_timeout req=0xffff0023add29300 reserved=0
[ 1214.914424] scsi_timeout req=0xffff0023add29600 reserved=0
[ 1214.919909] scsi_timeout req=0xffff0023add29900 reserved=0

I guess that we're retrying as the SCSI command failed in the LLDD for some reason.

So could this be the problem - we're attempting to requeue on a dead 
request queue?
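
The correlation read off the log above can be sketched in plain userspace C. 
This is not kernel code and not part of any patch in this thread: it only 
records the request_queue addresses printed by the blk_mq_hctx_notify_dead 
debug lines and checks whether a later __blk_mq_requeue_request line names 
one of them. The names dead_set, record_dead(), requeue_hits_dead() and 
log_dead_set() are hypothetical, made up for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEAD 16

/* Hypothetical bookkeeping: queues that logged blk_mq_hctx_notify_dead. */
struct dead_set {
	uint64_t queues[MAX_DEAD];
	size_t count;
};

/* Record one request_queue address; returns false if the set is full. */
static bool record_dead(struct dead_set *s, uint64_t queue)
{
	if (s->count >= MAX_DEAD)
		return false;
	s->queues[s->count++] = queue;
	return true;
}

/* True if a requeue targets a queue that was recorded earlier. */
static bool requeue_hits_dead(const struct dead_set *s, uint64_t queue)
{
	for (size_t i = 0; i < s->count; i++)
		if (s->queues[i] == queue)
			return true;
	return false;
}

/* Build the set from the five notify_dead lines in the log excerpt. */
static struct dead_set log_dead_set(void)
{
	static const uint64_t from_log[] = {
		0xffff0023ace24f60ULL, /* id=19 */
		0xffff0023ace23f80ULL, /* id=17 */
		0xffff0023ace22fa0ULL, /* id=15 */
		0xffff0023ace21fc0ULL, /* id=13, the *** line */
		0xffff0023ace20fe0ULL, /* id=11 */
	};
	struct dead_set s = { .count = 0 };

	for (size_t i = 0; i < sizeof(from_log) / sizeof(from_log[0]); i++)
		record_dead(&s, from_log[i]);
	return s;
}
```

Calling requeue_hits_dead() on the set from log_dead_set() with 
0xffff0023ace21fc0 (the queue in the __blk_mq_requeue_request line) finds a 
match. That only shows the requeue names a queue which had already seen the 
CPU1 dead notification; it does not by itself show that the hctx is unusable.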

Thanks,
John
