On Mon, 2017-09-11 at 19:10 +0800, Ming Lei wrote: > @@ -787,6 +787,35 @@ int blk_queue_enter(struct request_queue *q, unsigned flags) > if (percpu_ref_tryget_live(&q->q_usage_counter)) > return 0; > > + /* > + * If queue is preempt frozen and caller need to allocate > + * request for RQF_PREEMPT, we grab the .q_usage_counter > + * unconditionally and return successfully. > + * > + * There isn't race with queue cleanup because: > + * > + * 1) it is guaranteed that preempt freeze can't be > + * started after queue is set as dying > + * > + * 2) normal freeze runs exclusively with preempt > + * freeze, so even after queue is set as dying > + * afterwards, blk_queue_cleanup() won't move on > + * until preempt freeze is done > + * > + * 3) blk_queue_dying() needn't to be checked here > + * - for legacy path, it will be checked in > + * __get_request() For the legacy block layer core, what do you think will happen if the "dying" state is set by another thread after __get_request() has passed the blk_queue_dying() check? > + * - blk-mq depends on driver to handle dying well > + * because it is normal for queue to be set as dying > + * just between blk_queue_enter() and allocating new > + * request. The above comment is not correct. The block layer core handles the "dying" state. Block drivers other than dm-mpath should not have to query this state directly. > + */ > + if ((flags & BLK_REQ_PREEMPT) && > + blk_queue_is_preempt_frozen(q)) { > + blk_queue_enter_live(q); > + return 0; > + } > + Sorry but to me it looks like the above code introduces a race condition between blk_queue_cleanup() and blk_get_request() for at least blk-mq. Consider e.g. the following scenario: * A first thread preempt-freezes a request queue. * A second thread calls blk_get_request() with BLK_REQ_PREEMPT set. That results in a call of blk_queue_is_preempt_frozen(). * A context switch occurs to the first thread. * The first thread preempt-unfreezes the same request queue and calls blk_queue_cleanup(). That last function changes the request queue state into DYING and waits until all pending requests have finished. * The second thread continues and calls blk_queue_enter_live(), allocates a request and submits it. In other words, a request gets submitted against a dying queue. This must not happen. See also my explanation of queue shutdown from a few days ago (https://marc.info/?l=linux-block&m=150449845831789). Bart.