On Mon, 2017-11-06 at 11:44 +0800, Ming Lei wrote: > On Sun, Nov 05, 2017 at 03:38:49PM +0000, Bart Van Assche wrote: > > On Sun, 2017-11-05 at 20:10 +0800, Ming Lei wrote: > > > diff --git a/block/blk-core.c b/block/blk-core.c > > > index 048be4aa6024..0b121f29e3b1 100644 > > > --- a/block/blk-core.c > > > +++ b/block/blk-core.c > > > @@ -658,6 +658,10 @@ void blk_cleanup_queue(struct request_queue *q) > > > queue_flag_set(QUEUE_FLAG_DEAD, q); > > > spin_unlock_irq(lock); > > > > > > + /* respect queue DEAD via quiesce for blk-mq */ > > > + if (q->mq_ops) > > > + blk_mq_quiesce_queue(q); > > > + > > > /* for synchronous bio-based driver finish in-flight integrity i/o */ > > > blk_flush_integrity(); > > > > Have you considered to change the blk_freeze_queue_start() call in > > blk_set_queue_dying() into a blk_freeze_queue() call? That approach has the > > advantage that no new if (q->mq_ops) test has to be introduced. > > That approach isn't nothing to do with this issue, and can't fix this issue > too. Sorry but I disagree. My opinion is that the change I proposed is a more elegant way to avoid that blk_mq_run_hw_queue() tries to execute a request after blk_cleanup_queue() has started. > Not mention we hold q->sysfs_lock before calling blk_set_queue_dying(), > there may be risk to cause deadlock. Sorry but I disagree again. q->sysfs_lock is not obtained from inside any code that runs the queue so there is no deadlock risk. Additionally, I doubt that it is useful to obtain q->sysfs_lock from inside blk_cleanup_queue(). The queue attributes that are exported through sysfs can be modified safely until these sysfs attributes are removed. > The issue is that there isn't any request in queue(queue is frozen), but > dispatch still may happen, let me explain it a bit: > > 1) there are several IO submit paths in-progress > 2) requests from all these paths are inserted to queue, but may dispatch to > LLD in only one of these paths, but other paths may still move on to dispatch > even all these requests are completed(that means blk_mq_freeze_queue_wait() > returns at that time) > 3) the dispatch after queue dead happens and causes the use-after-free, > because we never respect queue dead for blk-mq. After a queue has been marked "dying" and after blk_freeze_queue() has returned it is guaranteed that all requests have finished and that all future blk_get_request() calls will fail. Hence .queue_rq() won't be called anymore. Any code that runs a queue (the __blk_mq_run_hw_queue() / blk_mq_make_request() callers) must hold a reference on that queue. Running a queue even after blk_cleanup_queue() finished is safe as long as the caller holds a reference on that queue. So the scenario described above won't lead to a crash. Bart.