On 11/2/21 9:20 AM, Jens Axboe wrote:
> On 11/2/21 9:08 AM, Ming Lei wrote:
>> On Tue, Nov 02, 2021 at 08:57:41AM -0600, Jens Axboe wrote:
>>> On 11/2/21 7:57 AM, Ming Lei wrote:
>>>> On Tue, Nov 02, 2021 at 07:47:44AM -0600, Jens Axboe wrote:
>>>>> On 11/2/21 7:35 AM, Ming Lei wrote:
>>>>>> In case of shared tags and the none I/O scheduler, batched
>>>>>> completion can still be reached, and hctx->nr_active is accounted
>>>>>> when getting a driver tag, so it has to be updated in
>>>>>> blk_mq_end_request_batch().
>>>>>>
>>>>>> Otherwise, hctx->nr_active may become equal to the queue depth,
>>>>>> after which hctx_may_queue() always returns false and an I/O hang
>>>>>> results.
>>>>>>
>>>>>> Fix the issue by updating the counter in a batched way.
>>>>>>
>>>>>> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
>>>>>> Fixes: f794f3351f26 ("block: add support for blk_mq_end_request_batch()")
>>>>>> Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
>>>>>> ---
>>>>>>  block/blk-mq.c | 15 +++++++++++++--
>>>>>>  block/blk-mq.h | 12 +++++++++---
>>>>>>  2 files changed, 22 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>>>> index 07eb1412760b..0dbe75034f61 100644
>>>>>> --- a/block/blk-mq.c
>>>>>> +++ b/block/blk-mq.c
>>>>>> @@ -825,6 +825,7 @@ void blk_mq_end_request_batch(struct io_comp_batch *iob)
>>>>>>  	struct blk_mq_hw_ctx *cur_hctx = NULL;
>>>>>>  	struct request *rq;
>>>>>>  	u64 now = 0;
>>>>>> +	int active = 0;
>>>>>>  
>>>>>>  	if (iob->need_ts)
>>>>>>  		now = ktime_get_ns();
>>>>>> @@ -846,16 +847,26 @@ void blk_mq_end_request_batch(struct io_comp_batch *iob)
>>>>>>  		rq_qos_done(rq->q, rq);
>>>>>>  
>>>>>>  		if (nr_tags == TAG_COMP_BATCH || cur_hctx != rq->mq_hctx) {
>>>>>> -			if (cur_hctx)
>>>>>> +			if (cur_hctx) {
>>>>>> +				if (active)
>>>>>> +					__blk_mq_sub_active_requests(cur_hctx,
>>>>>> +							active);
>>>>>>  				blk_mq_flush_tag_batch(cur_hctx, tags, nr_tags);
>>>>>> +			}
>>>>>>  			nr_tags = 0;
>>>>>> +			active = 0;
>>>>>>  			cur_hctx = rq->mq_hctx;
>>>>>>  		}
>>>>>>  		tags[nr_tags++] = rq->tag;
>>>>>> +		if (rq->rq_flags & RQF_MQ_INFLIGHT)
>>>>>> +			active++;
>>>>>
>>>>> Are there any cases where either none or all of the requests have
>>>>> the flag set, and hence active == nr_tags?
>>>>
>>>> With none and BLK_MQ_F_TAG_QUEUE_SHARED, and Shinichiro only
>>>> observed the issue on two NSs.
>>>
>>> Maybe I wasn't clear enough. What I'm saying is that either all of the
>>> requests will have RQF_MQ_INFLIGHT set, or none of them. Hence active
>>> should be either 0, or == nr_tags.
>>
>> Yeah, that is right, since BLK_MQ_F_TAG_QUEUE_SHARED is updated after
>> the queue is frozen. Meanwhile, blk_mq_end_request_batch() is only
>> called to end successfully completed requests.
>>
>> Will do that in V2.
>
> Thanks, then it just becomes a single check in blk_mq_flush_tag_batch(),
> which is a lot better than per-request. Something like this, untested.

FWIW, I did apply 1-2 from this series, so just do a v2 of 3/3 and that
should do it.
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 02f70dc06ced..18dee9af4487 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -817,6 +817,8 @@ static inline void blk_mq_flush_tag_batch(struct blk_mq_hw_ctx *hctx,
 	struct request_queue *q = hctx->queue;
 
 	blk_mq_put_tags(hctx->tags, tag_array, nr_tags);
+	if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
+		__blk_mq_sub_active_requests(hctx, nr_tags);
 	percpu_ref_put_many(&q->q_usage_counter, nr_tags);
 }
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 28859fc5faee..cb0b5482ca5e 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -225,12 +225,18 @@ static inline void __blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx)
 		atomic_inc(&hctx->nr_active);
 }
 
-static inline void __blk_mq_dec_active_requests(struct blk_mq_hw_ctx *hctx)
+static inline void __blk_mq_sub_active_requests(struct blk_mq_hw_ctx *hctx,
+						int val)
 {
 	if (blk_mq_is_shared_tags(hctx->flags))
-		atomic_dec(&hctx->queue->nr_active_requests_shared_tags);
+		atomic_sub(val, &hctx->queue->nr_active_requests_shared_tags);
 	else
-		atomic_dec(&hctx->nr_active);
+		atomic_sub(val, &hctx->nr_active);
+}
+
+static inline void __blk_mq_dec_active_requests(struct blk_mq_hw_ctx *hctx)
+{
+	__blk_mq_sub_active_requests(hctx, 1);
 }
 
 static inline int __blk_mq_active_requests(struct blk_mq_hw_ctx *hctx)

-- 
Jens Axboe
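As an aside, the invariant the final fix relies on can be modeled
outside the kernel: BLK_MQ_F_TAG_QUEUE_SHARED only changes while the
queue is frozen, so every request in a successfully completed batch is
either accounted in hctx->nr_active or none of them is, and one atomic
subtraction per batch can replace one atomic decrement per request.
Below is a minimal userspace sketch of that accounting pattern, using
C11 atomics; the names and numbers are illustrative, not the kernel's.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for hctx->nr_active: a counter shared by users of the tag set. */
static atomic_int nr_active;

/* Pre-fix pattern: one atomic op per completed request. */
static void complete_per_request(int nr_rqs, bool shared)
{
	for (int i = 0; i < nr_rqs; i++)
		if (shared)
			atomic_fetch_sub(&nr_active, 1);
}

/* Post-fix pattern: the batch is homogeneous (all accounted or none),
 * so a single subtraction of the batch size is equivalent. */
static void complete_batched(int nr_rqs, bool shared)
{
	if (shared)
		atomic_fetch_sub(&nr_active, nr_rqs);
}

int main(void)
{
	atomic_store(&nr_active, 64);	/* 64 requests hold driver tags */

	complete_per_request(32, true);
	complete_batched(32, true);

	/* Both paths balance the accounting; if completion skipped it,
	 * nr_active would stay pinned at queue depth and a
	 * hctx_may_queue()-style limiter would refuse new tags forever. */
	printf("nr_active = %d\n", atomic_load(&nr_active));	/* prints 0 */
	return 0;
}

The practical win of the batched form is simply fewer atomic operations
on a contended counter in the completion hot path.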