On Tue, Sep 25, 2018 at 08:14:45AM -0700, Bart Van Assche wrote: > On Tue, 2018-09-25 at 08:39 -0600, Keith Busch wrote: > > On Tue, Sep 25, 2018 at 10:39:46AM +0800, jianchao.wang wrote: > > > But the issue is the left part of blk_mq_timeout_work is moved out of protection of q refcount. > > > > I'm not sure what you mean by "left part". The only part that isn't > > outside the reference with this patch is the part Bart pointed out. > > > > This looks like it may be fixed by either moving the refcount back up a > > level to all the callers of blk_mq_queue_tag_busy_iter, or add > > cancel_work_sync(&q->timeout_work) to __blk_mq_update_nr_hw_queues after > > the freeze. > > Hi Keith, > > How about applying the following (untested) patch on top of your patch? Yep, this works. I can fold in a v2. > diff --git a/block/blk-mq.c b/block/blk-mq.c > index 019f9b169887..099e203b5213 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -851,6 +851,9 @@ static void blk_mq_timeout_work(struct work_struct *work) > if (!blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &next)) > return; > > + if (!percpu_ref_tryget(&q->q_usage_counter)) > + return; > + > if (next != 0) { > mod_timer(&q->timeout, next); > } else { > @@ -866,6 +869,7 @@ static void blk_mq_timeout_work(struct work_struct *work) > blk_mq_tag_idle(hctx); > } > } > + blk_queue_exit(q); > } > > Thanks, > > Bart.