During CPU offline, in blk_mq_hctx_notify_offline(),
blk_mq_hctx_has_online_cpu() returns true even though the last CPU in
hctx 0 is offline, because isolated CPUs join hctx 0 unexpectedly, so
IOs in hctx 0 won't be drained.

However, the managed irq core code still shuts down the hw queue's irq
because all CPUs in this hctx are offline now.

Then an IO hang is triggered, isn't it?

The current blk-mq takes a static & global queue/CPU mapping, in which
all CPUs are covered. This patchset removes isolated CPUs from the
mapping, and that is a big change from the viewpoint of blk-mq queue
mapping.

> > that means one random hctx(or even NULL) may be used for submitting
> > IO from isolated CPUs, then there can be io hang risk during cpu
> > hotplug, or kernel panic when submitting bio.
>
> Can you elaborate a bit more? I must miss something important here.
>
> Anyway, my understanding is that when the last CPU of a hctx goes
> offline the affinity is broken and assigned to an online HK CPU. And we
> ensure all flight IO have finished and also ensure we don't submit any
> new IO to a CPU which goes offline.
>
> FWIW, I tried really hard to get an IO hang with cpu hotplug.

Please see above.

Thanks,
Ming
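
For illustration only, here is a minimal stand-alone sketch (not the
kernel implementation) of the mask mismatch described above, under one
reading of the scenario: hctx 0's cpumask still contains an online
isolated CPU, while the managed irq affinity only covers the
housekeeping CPUs. The 4-CPU layout, the mask arrays and the
mask_has_online_cpu() helper are invented for this sketch.

/*
 * Stand-alone simulation (NOT kernel code) of the mask mismatch
 * described above.  The layout is made up: CPUs 0-1 are housekeeping
 * CPUs mapped to hctx 0, CPU 2 is an isolated CPU that ended up in
 * hctx 0's cpumask, and the managed irq affinity only covers the
 * housekeeping CPUs.
 */
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4

static bool cpu_online[NR_CPUS]    = { true, true, true,  true  };
static bool hctx0_cpumask[NR_CPUS] = { true, true, true,  false }; /* includes isolated CPU 2 */
static bool irq_affinity[NR_CPUS]  = { true, true, false, false }; /* housekeeping CPUs only */

/* mimics a blk_mq_hctx_has_online_cpu()-style walk over a cpumask */
static bool mask_has_online_cpu(const bool *mask)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask[cpu] && cpu_online[cpu])
			return true;
	return false;
}

int main(void)
{
	/* offline the housekeeping CPUs that hctx 0 was mapped to */
	cpu_online[0] = cpu_online[1] = false;

	/*
	 * Hotplug side: the online isolated CPU 2 is still in hctx 0's
	 * cpumask, so the "has online CPU" check passes and draining is
	 * skipped.
	 */
	printf("drain hctx 0: %s\n",
	       mask_has_online_cpu(hctx0_cpumask) ? "skipped" : "done");

	/*
	 * Managed irq side: no online CPU is left in the irq affinity
	 * mask, so the hw queue's irq is shut down while IO can still be
	 * issued from the isolated CPU -> potential IO hang.
	 */
	printf("hw queue irq: %s\n",
	       mask_has_online_cpu(irq_affinity) ? "kept" : "shut down");

	return 0;
}

This only makes the mismatch concrete; whether the isolated CPU really
stays in hctx 0's cpumask in the patched kernel is exactly the point
under discussion in this thread.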