On Sat, Apr 18, 2020 at 11:09:19AM +0800, Ming Lei wrote:
> Most blk-mq drivers depend on managed IRQs' auto-affinity to set up
> their queue mapping. Thomas mentioned the following point[1]:
>
> "
> That was the constraint of managed interrupts from the very beginning:
>
>  The driver/subsystem has to quiesce the interrupt line and the
>  associated queue _before_ it gets shut down in CPU unplug and not
>  fiddle with it until it's restarted by the core when the CPU is
>  plugged in again.
> "
>
> However, the current blk-mq implementation doesn't quiesce the hw queue
> before the last CPU in the hctx is shut down. Even worse,
> CPUHP_BLK_MQ_DEAD is a cpuhp state handled after the CPU is down, so
> there isn't any chance to quiesce the hctx for blk-mq wrt. CPU hotplug.
>
> Add a new cpuhp state, CPUHP_AP_BLK_MQ_ONLINE, for blk-mq to stop
> queues and wait for completion of in-flight requests.
>
> We will stop the hw queue and wait for completion of in-flight requests
> when one hctx is becoming dead in the following patch. This may cause a
> deadlock for some stacking blk-mq drivers, such as dm-rq and loop.
>
> Add the blk-mq flag BLK_MQ_F_NO_MANAGED_IRQ and mark it for dm-rq and
> loop, so we need not wait for completion of in-flight requests from
> dm-rq & loop, and the potential deadlock can be avoided.

The code here looks fine, but splitting it off from the patches that
actually use it, instead of just adding stubs first, seems odd.
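
For anyone following along, the shape of what this adds is roughly the
sketch below. It only illustrates the cpuhp multi-instance pattern the
description implies, reusing the names from the commit message; the
callback names, the cpuhp_online list member, and the drain helper
blk_mq_hctx_drain_inflight() are placeholders, not the actual patch:

	#include <linux/blk-mq.h>
	#include <linux/cpuhotplug.h>
	#include <linux/cpumask.h>

	/* Startup callback: would undo any "inactive" marking when a
	 * mapped CPU comes back online (placeholder, no-op here). */
	static int blk_mq_hctx_notify_online(unsigned int cpu,
					     struct hlist_node *node)
	{
		return 0;
	}

	/* Teardown callback, invoked on the CPU that is about to go away. */
	static int blk_mq_hctx_notify_offline(unsigned int cpu,
					      struct hlist_node *node)
	{
		struct blk_mq_hw_ctx *hctx =
			hlist_entry_safe(node, struct blk_mq_hw_ctx,
					 cpuhp_online);

		/* Stacking drivers (dm-rq, loop) opt out to avoid the
		 * deadlock described above. */
		if (hctx->flags & BLK_MQ_F_NO_MANAGED_IRQ)
			return 0;

		/*
		 * Only act when the outgoing CPU is the last online CPU
		 * mapped to this hctx.  The AP teardown callback runs
		 * while the CPU is still set in cpu_online_mask.
		 */
		if (!cpumask_test_cpu(cpu, hctx->cpumask) ||
		    cpumask_first_and(hctx->cpumask, cpu_online_mask) != cpu ||
		    cpumask_next_and(cpu, hctx->cpumask, cpu_online_mask) <
				nr_cpu_ids)
			return 0;

		/* Placeholder: stop the hw queue and drain in-flight
		 * requests; the real logic lands in a following patch. */
		blk_mq_hctx_drain_inflight(hctx);
		return 0;
	}

	/*
	 * Registered once at init time.  CPUHP_AP_BLK_MQ_ONLINE sits in
	 * the online (AP) section of the hotplug state machine, so the
	 * teardown callback runs on the outgoing CPU _before_ it is taken
	 * down -- unlike CPUHP_BLK_MQ_DEAD, whose callback only runs
	 * after the CPU is already dead.
	 */
	static int __init blk_mq_hotplug_init(void)
	{
		return cpuhp_setup_state_multi(CPUHP_AP_BLK_MQ_ONLINE,
					       "block/mq:online",
					       blk_mq_hctx_notify_online,
					       blk_mq_hctx_notify_offline);
	}

Each hctx instance would then be attached with
cpuhp_state_add_instance_nocalls(CPUHP_AP_BLK_MQ_ONLINE,
&hctx->cpuhp_online) when the hctx is initialized, so every hw queue
gets its own offline notification.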