Re: [PATCH] blk-mq: respect queue dead via blk_mq_quiesce_queue

Bart Van Assche <Bart.VanAssche@xxxxxxx> · Mon, 6 Nov 2017 16:34:21 +0000

On Mon, 2017-11-06 at 11:44 +0800, Ming Lei wrote:
> On Sun, Nov 05, 2017 at 03:38:49PM +0000, Bart Van Assche wrote:
> > On Sun, 2017-11-05 at 20:10 +0800, Ming Lei wrote:
> > > diff --git a/block/blk-core.c b/block/blk-core.c
> > > index 048be4aa6024..0b121f29e3b1 100644
> > > --- a/block/blk-core.c
> > > +++ b/block/blk-core.c
> > > @@ -658,6 +658,10 @@ void blk_cleanup_queue(struct request_queue *q)
> > >  	queue_flag_set(QUEUE_FLAG_DEAD, q);
> > >  	spin_unlock_irq(lock);
> > >  
> > > +	/* respect queue DEAD via quiesce for blk-mq */
> > > +	if (q->mq_ops)
> > > +		blk_mq_quiesce_queue(q);
> > > +
> > >  	/* for synchronous bio-based driver finish in-flight integrity i/o */
> > >  	blk_flush_integrity();
> > 
> > Have you considered to change the blk_freeze_queue_start() call in
> > blk_set_queue_dying() into a blk_freeze_queue() call? That approach has the
> > advantage that no new if (q->mq_ops) test has to be introduced.
> 
> That approach isn't nothing to do with this issue, and can't fix this issue
> too.

Sorry but I disagree. My opinion is that the change I proposed is a more
elegant way to avoid that blk_mq_run_hw_queue() tries to execute a request
after blk_cleanup_queue() has started.

> Not mention we hold q->sysfs_lock before calling blk_set_queue_dying(),
> there may be risk to cause deadlock.

Sorry but I disagree again. q->sysfs_lock is not obtained from inside any
code that runs the queue so there is no deadlock risk. Additionally, I doubt
that it is useful to obtain q->sysfs_lock from inside blk_cleanup_queue().
The queue attributes that are exported through sysfs can be modified safely
until these sysfs attributes are removed.

> The issue is that there isn't any request in queue(queue is frozen), but
> dispatch still may happen, let me explain it a bit:
> 
> 1) there are several IO submit paths in-progress
> 2) requests from all these paths are inserted to queue, but may dispatch to
> LLD in only one of these paths, but other paths may still move on to dispatch
> even all these requests are completed(that means blk_mq_freeze_queue_wait()
> returns at that time)
> 3) the dispatch after queue dead happens and causes the use-after-free,
> because we never respect queue dead for blk-mq.

After a queue has been marked "dying" and after blk_freeze_queue() has
returned it is guaranteed that all requests have finished and that all future
blk_get_request() calls will fail. Hence .queue_rq() won't be called anymore.
Any code that runs a queue (the __blk_mq_run_hw_queue() / blk_mq_make_request()
callers) must hold a reference on that queue. Running a queue even after
blk_cleanup_queue() finished is safe as long as the caller holds a reference
on that queue. So the scenario described above won't lead to a crash.

Bart.