Re: [PATCH V4 0/5] blk-mq: improvement on handling IO during CPU hotplug

On 29/10/2019 01:50, Ming Lei wrote:
On Mon, Oct 28, 2019 at 11:55:42AM +0000, John Garry wrote:

For the SCSI commands which time out, I notice that
scsi_set_blocked(reason=SCSI_MLQUEUE_EH_RETRY) was called 30 seconds
earlier.

   scsi_set_blocked+0x20/0xb8
   __scsi_queue_insert+0x40/0x90
   scsi_softirq_done+0x164/0x1c8
   __blk_mq_complete_request_remote+0x18/0x20
   flush_smp_call_function_queue+0xa8/0x150
   generic_smp_call_function_single_interrupt+0x10/0x18
   handle_IPI+0xec/0x1a8
   arch_cpu_idle+0x10/0x18
   do_idle+0x1d0/0x2b0
   cpu_startup_entry+0x24/0x40
   secondary_start_kernel+0x1b4/0x208

Could you investigate a bit why the timeout is triggered?

Yeah, it does seem a strange coincidence that the SCSI command even failed
and we have to retry, since these should be uncommon events. I'll check on
this LLDD error.


Especially since we are supposed to drain all in-flight requests before
the last CPU of this hctx goes offline, the timeout shouldn't be caused by
the hctx becoming dead, so I still need you to confirm that all
in-flight requests are really drained in your test.

ok

Or is it still
possible to dispatch to the LLDD after BLK_MQ_S_INTERNAL_STOPPED is set?

It shouldn't be. However it would seem that this IO had been dispatched to
the LLDD, the hctx dies, and then we attempt to requeue on that hctx.

But this patch does wait for completion of in-flight requests before
shutting down the last CPU of this hctx.


Hi Ming,

It may actually be a request from a hctx which is not shut down which errors and causes the timeout. I'm still checking.

BTW, can you let me know exactly where you want the debug message for "Or blk_mq_hctx_next_cpu() may still run WORK_CPU_UNBOUND schedule after
all CPUs are offline, could you add debug message in that branch?"

Thanks,
John


Thanks,
Ming
