On Thu, Sep 20, 2018 at 08:23:10AM -0700, Bart Van Assche wrote: > On Thu, 2018-09-20 at 11:48 +0800, Ming Lei wrote: > > On Wed, Sep 19, 2018 at 03:45:29PM -0700, Bart Van Assche wrote: > > > + ret = -EBUSY; > > > + if (blk_requests_in_flight(q) == 0) { > > > + blk_freeze_queue_start(q); > > > + /* > > > + * Freezing a queue starts a transition of the > > > queue > > > + * usage counter to atomic mode. Wait until atomic > > > + * mode has been reached. This involves calling > > > + * call_rcu(). That call guarantees that later > > > + * blk_queue_enter() calls see the pm-only state. > > > See > > > + * also http://lwn.net/Articles/573497/. > > > + */ > > > + percpu_ref_switch_to_atomic_sync(&q- > > > >q_usage_counter); > > > + if (percpu_ref_is_zero(&q->q_usage_counter)) > > > + ret = 0; > > > + blk_mq_unfreeze_queue(q); > > > > Tejun doesn't agree on this kind of usage yet, so the ref has to be > > dropped before calling blk_mq_unfreeze_queue(). > > I read all Tejuns' recent e-mails but I have not found any e-mail from > Tejun in which he wrote that he disagrees with the above pattern. Up to now, blk_mq_unfreeze_queue() should only be called when percpu_ref_is_zero(&q->q_usage_counter) is true. I am trying to relax this limit in the following patch, but Tejun objects this approach. https://marc.info/?l=linux-block&m=153726600813090&w=2 BTW, the race with .release() may be left to user to handle by removing grabbing the ref of atomic part in this patch, and it won't be a issue for blk-mq. > > > Also, this way still can't address the race in the following link: > > > > https://marc.info/?l=linux-block&m=153732992701093&w=2 > > I think that the following patch is sufficient to fix that race: > > diff --git a/block/blk-core.c b/block/blk-core.c > index ae092ca121d5..16dd3a989753 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -942,8 +942,6 @@ int blk_queue_enter(struct request_queue *q, > blk_mq_req_flags_t flags) > if (success) > return 0; > > - blk_pm_request_resume(q); > - > if (flags & BLK_MQ_REQ_NOWAIT) > return -EBUSY; > > @@ -958,7 +956,8 @@ int blk_queue_enter(struct request_queue *q, > blk_mq_req_flags_t flags) > > wait_event(q->mq_freeze_wq, > (atomic_read(&q->mq_freeze_depth) == 0 && > - (pm || !blk_queue_pm_only(q))) || > + (pm || (blk_pm_request_resume(q), > + !blk_queue_pm_only(q)))) || > blk_queue_dying(q)); > if (blk_queue_dying(q)) > return -ENODEV; This fix looks clever. Thanks, Ming