Hi Thomas,

When I run the test script [1] in a KVM guest [2] whose disk is virtio-scsi,
an IO hang can be triggered easily. Most of the time it is reproduced by
running './cpuhotplug_io 400 /dev/sda' once; sometimes it needs one more run.

After checking the blk-mq debugfs log, I found that the hung requests have
been queued to the virtio-scsi hardware, but no interrupts are generated for
them.

The issue was first found when John and I were testing the patchset [3][4]
that drains IO in the CPU hotplug handler before the CPU and its managed IRQ
are shut down. IOs are found not to complete even though the CPU responsible
for handling this hw queue is still online, but about to be shut down.

git-bisect shows that the issue is introduced by the following commit:

	60dcaad5736f ("x86/hotplug: Silence APIC and NMI when CPU is dead")

The issue can no longer be triggered after applying the following change:

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 69881b2d446c..c5e9f005fbb2 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1596,7 +1596,7 @@ int native_cpu_disable(void)
 	 * it. It still responds normally to INIT, NMI, SMI, and SIPI
 	 * messages.
 	 */
-	apic_soft_disable();
+	clear_local_APIC();
 	cpu_disable_common();
 
 	return 0;

[1] test script
http://people.redhat.com/minlei/tests/tools/cpuhotplug_io

[2] virtio-scsi is made multi-queue by passing 'num_queues=3' on the qemu
virtio-scsi command line, and the guest CPU count is set to 8, so that one
queue can be covered by more than one CPU

[3] https://lore.kernel.org/linux-block/20200407092901.314228-5-ming.lei@xxxxxxxxxx/

[4] latest patches for stopping & draining IO before shutting down irq/cpu
https://github.com/ming1/linux/commits/v5.6-blk-mq-improve-cpu-hotplug

Thanks,
Ming