On Tue, Jun 14, 2022 at 10:34:53AM +0200, Christoph Hellwig wrote: > On Tue, Jun 14, 2022 at 04:23:36PM +0800, Ming Lei wrote: > > > blk_sync_queue(q); > > > blk_flush_integrity(); > > > + blk_mq_cancel_work_sync(q); > > > + > > > + blk_mq_quiesce_queue(q); > > > > quiesce queue adds a bit long delay in del_gendisk, not sure if this way may > > cause regression in big machines with lots of disks. > > It does. But at least we remove a freeze in the queue teardown path. > But either way I'd really like to get things correct first before > looking into optimizations. The removed one works at atomic mode and it is super fast. > > > > > > + if (q->elevator) { > > > + mutex_lock(&q->sysfs_lock); > > > + elevator_exit(q); > > > + mutex_unlock(&q->sysfs_lock); > > > + } > > > + rq_qos_exit(q); > > > + blk_mq_unquiesce_queue(q); > > > > Also tearing down elevator here has to be carefully, that means any > > elevator reference has to hold rcu read lock or .q_usage_counter, > > meantime it has to be checked, otherwise use-after-free may be caused. > > This is not a new pattern. We have the same locking here as a > sysfs-induced change of the elevator to none which also clears > q->elevator under a queue that is frozen and quiesced. Then looks this pattern has problem in dealing with the examples I mentioned. And the elevator usage in __blk_mq_update_nr_hw_queues() looks one old problem, but easy to fix by protecting it via sysfs_lock. And fixing blk_mq_has_sqsched() should be easy too. I will send patches later. Thanks, Ming