Re: [PATCH V4 0/5] blk-mq: improvement on handling IO during CPU hotplug

John Garry <john.garry@xxxxxxxxxx> · Mon, 28 Oct 2019 11:55:42 +0000

For the SCSI commands which time out, I notice that
scsi_set_blocked(reason=SCSI_MLQUEUE_EH_RETRY) was called 30 seconds
earlier.

  scsi_set_blocked+0x20/0xb8
  __scsi_queue_insert+0x40/0x90
  scsi_softirq_done+0x164/0x1c8
  __blk_mq_complete_request_remote+0x18/0x20
  flush_smp_call_function_queue+0xa8/0x150
  generic_smp_call_function_single_interrupt+0x10/0x18
  handle_IPI+0xec/0x1a8
  arch_cpu_idle+0x10/0x18
  do_idle+0x1d0/0x2b0
  cpu_startup_entry+0x24/0x40
  secondary_start_kernel+0x1b4/0x208

Could you investigate a bit why the timeout is triggered?

Yeah, it does seem a strange coincidence that the SCSI command even 
failed and we have to retry, since these should be uncommon events. I'll 
check on this LLDD error.

Especially since we are supposed to drain all in-flight requests before the
last CPU of this hctx goes offline, it shouldn't be caused by
the hctx becoming dead, so I still need you to confirm that all
in-flight requests are really drained in your test.

ok

Or is it still
possible to dispatch to the LLDD after BLK_MQ_S_INTERNAL_STOPPED is set?

It shouldn't be. However it would seem that this IO had been dispatched 
to the LLDD, the hctx dies, and then we attempt to requeue on that hctx.

In theory, it shouldn't be possible, given we drain in-flight requests
on the last CPU of this hctx.

Or blk_mq_hctx_next_cpu() may still take the WORK_CPU_UNBOUND path after
all CPUs are offline; could you add a debug message in that branch?

ok

I also notice that the __scsi_queue_insert() call, above, seems to try to
requeue the request on a dead request queue in calling
__scsi_queue_insert()->blk_mq_requeue_request()->__blk_mq_requeue_request(),
***:

[ 1185.235243] psci: CPU1 killed.
[ 1185.238610] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace24f60 (id=19)
[ 1185.246530] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace23f80 (id=17)
[ 1185.254443] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace22fa0 (id=15)
[ 1185.262356] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace21fc0 (id=13)***
[ 1185.270271] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace20fe0 (id=11)
[ 1185.939451] scsi_softirq_done NEEDS_RETRY rq=0xffff0023b7416000
[ 1185.945359] scsi_set_blocked reason=0x1057
[ 1185.949444] __blk_mq_requeue_request request_queue=0xffff0023ace21fc0 id=13 rq=0xffff0023b7416000***

[...]

[ 1214.903455] scsi_timeout req=0xffff0023add29000 reserved=0
[ 1214.908946] scsi_timeout req=0xffff0023add29300 reserved=0
[ 1214.914424] scsi_timeout req=0xffff0023add29600 reserved=0
[ 1214.919909] scsi_timeout req=0xffff0023add29900 reserved=0
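
To sanity-check the "30 seconds earlier" observation against the timestamps, something like the sketch below can be run over the captured console log. It is only an illustration: the helper names (first_timestamp, gap_seconds) are made up for this mail, the event names are those of the debug prints in the excerpt, and because part of the log is elided it measures only the gap between the events shown, not a per-request latency.

```python
import re

# Lines copied from the excerpt above (only the two event types of interest).
LOG = """\
[ 1185.945359] scsi_set_blocked reason=0x1057
[ 1214.903455] scsi_timeout req=0xffff0023add29000 reserved=0
[ 1214.908946] scsi_timeout req=0xffff0023add29300 reserved=0
"""

# "[ <seconds>.<usecs>] <event> ..." as printed by the kernel log.
TS_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(\w+)")


def first_timestamp(log, event):
    """Return the timestamp of the first line whose first word is `event`."""
    for line in log.splitlines():
        m = TS_RE.match(line)
        if m and m.group(2) == event:
            return float(m.group(1))
    return None


def gap_seconds(log, start_event, end_event):
    """Seconds between the first `start_event` and the first `end_event`."""
    start = first_timestamp(log, start_event)
    end = first_timestamp(log, end_event)
    if start is None or end is None:
        return None
    return end - start


if __name__ == "__main__":
    gap = gap_seconds(LOG, "scsi_set_blocked", "scsi_timeout")
    print("gap between scsi_set_blocked and first scsi_timeout: %.3f s" % gap)
```

A gap close to the SCSI command timeout would be consistent with the commands sitting idle after the requeue rather than failing again in the LLDD.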

I guess that we're retrying as the SCSI command failed in the LLDD for some reason.

So could this be the problem - we're attempting to requeue on a dead request
queue?
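
To check this across a whole log rather than by eye, a rough sketch like the following could flag every requeue that lands on a request queue already reported by blk_mq_hctx_notify_dead. It assumes the exact format of the debug prints shown above (they are local debug messages, not mainline output), with the mail-wrapped lines rejoined and the *** markers removed; the function name find_requeues_after_dead is made up for illustration.

```python
import re

# Excerpt from the trace above, one log record per line.
LOG = """\
[ 1185.235243] psci: CPU1 killed.
[ 1185.238610] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace24f60 (id=19)
[ 1185.246530] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace23f80 (id=17)
[ 1185.254443] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace22fa0 (id=15)
[ 1185.262356] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace21fc0 (id=13)
[ 1185.270271] blk_mq_hctx_notify_dead cpu1 dead request_queue=0xffff0023ace20fe0 (id=11)
[ 1185.939451] scsi_softirq_done NEEDS_RETRY rq=0xffff0023b7416000
[ 1185.945359] scsi_set_blocked reason=0x1057
[ 1185.949444] __blk_mq_requeue_request request_queue=0xffff0023ace21fc0 id=13 rq=0xffff0023b7416000
"""

DEAD_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] blk_mq_hctx_notify_dead cpu\d+ dead "
    r"request_queue=(0x[0-9a-f]+)"
)
REQUEUE_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] __blk_mq_requeue_request "
    r"request_queue=(0x[0-9a-f]+) id=\d+ rq=(0x[0-9a-f]+)"
)


def find_requeues_after_dead(log):
    """Return (rq, request_queue, seconds_after_dead) for each requeue that
    targets a request queue already seen in a notify_dead message."""
    dead_at = {}  # request_queue address -> time of first notify_dead
    hits = []
    for line in log.splitlines():
        m = DEAD_RE.match(line)
        if m:
            dead_at.setdefault(m.group(2), float(m.group(1)))
            continue
        m = REQUEUE_RE.match(line)
        if m and m.group(2) in dead_at:
            delay = float(m.group(1)) - dead_at[m.group(2)]
            hits.append((m.group(3), m.group(2), delay))
    return hits


if __name__ == "__main__":
    for rq, queue, delay in find_requeues_after_dead(LOG):
        print("rq %s requeued on %s %.3f s after notify_dead" % (rq, queue, delay))
```

On the excerpt this flags only the request marked with *** above; running it on the full log would show whether other requests hit the same path.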

If there are any in-flight requests originating from an hctx which is going
to become dead, they should have been drained before the CPU goes offline.

Sure, but we seem to hit a corner case here...

Thanks,
John

Thanks,
Ming
