On 11/26/19 12:05 PM, Ming Lei wrote:
> On Tue, Nov 26, 2019 at 10:14:12AM +0100, Hannes Reinecke wrote:
[ .. ]
>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
>> index ca22afd47b3d..f3589f42b96d 100644
>> --- a/block/blk-mq-sched.c
>> +++ b/block/blk-mq-sched.c
>> @@ -452,7 +452,7 @@ static void blk_mq_sched_free_tags(struct blk_mq_tag_set *set,
>>  {
>>  	if (hctx->sched_tags) {
>>  		blk_mq_free_rqs(set, hctx->sched_tags, hctx_idx);
>> -		blk_mq_free_rq_map(hctx->sched_tags);
>> +		blk_mq_free_rq_map(hctx->sched_tags, false);
>>  		hctx->sched_tags = NULL;
>>  	}
>>  }
>> @@ -462,10 +462,14 @@ static int blk_mq_sched_alloc_tags(struct request_queue *q,
>>  				   unsigned int hctx_idx)
>>  {
>>  	struct blk_mq_tag_set *set = q->tag_set;
>> +	int flags = set->flags;
>>  	int ret;
>>
>> +	/* Scheduler tags are never shared */
>> +	set->flags &= ~BLK_MQ_F_TAG_HCTX_SHARED;
>>  	hctx->sched_tags = blk_mq_alloc_rq_map(set, hctx_idx, q->nr_requests,
>>  					       set->reserved_tags);
>> +	set->flags = flags;
>
> This way is very fragile, race is made against other uses of
> blk_mq_is_sbitmap_shared().
>
We are allocating tags here; I don't think anything else is even able to
modify it at this point.

> From performance viewpoint, all hctx belonging to this request queue should
> share one scheduler tagset in case of BLK_MQ_F_TAG_HCTX_SHARED, cause
> driver tag queue depth isn't changed.
>
Hmm. Now you've got me confused.
In an earlier mail you said:

> This kind of sharing is wrong, sched tags should be request
> queue wide instead of tagset wide, and each request queue has
> its own & independent scheduler queue.

as in v2 we _had_ shared scheduler tags, too.
Did I misread your comment above?

>>  	if (!hctx->sched_tags)
>>  		return -ENOMEM;
>>
[ .. ]
>> @@ -2456,7 +2459,8 @@ static void blk_mq_free_map_and_requests(struct blk_mq_tag_set *set,
>>  {
>>  	if (set->tags && set->tags[hctx_idx]) {
>>  		blk_mq_free_rqs(set, set->tags[hctx_idx], hctx_idx);
>> -		blk_mq_free_rq_map(set->tags[hctx_idx]);
>> +		blk_mq_free_rq_map(set->tags[hctx_idx],
>> +				   blk_mq_is_sbitmap_shared(set));
>>  		set->tags[hctx_idx] = NULL;
>>  	}
>
> Who will free the shared tags finally in case of blk_mq_is_sbitmap_shared()?
>
Hmm. Indeed, that bit is missing; I will be adding it.
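Probably something along these lines (rough, untested sketch only; the
helper name is made up, and I'm assuming it would be called from
blk_mq_free_tag_set() once blk_mq_free_map_and_requests() has run for
every hctx):

/*
 * Sketch: final teardown of the set-wide bitmaps. The shared sbitmaps
 * are owned by the tag set, not by any hctx, so they can only be freed
 * once all per-hctx rq maps are gone.
 */
static void blk_mq_exit_shared_sbitmap(struct blk_mq_tag_set *set)
{
	if (!blk_mq_is_sbitmap_shared(set))
		return;

	sbitmap_queue_free(&set->shared_bitmap_tags);
	sbitmap_queue_free(&set->shared_breserved_tags);
}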
>>  }
[ .. ]
>> @@ -3168,8 +3186,17 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
>>  			q->elevator->type->ops.depth_updated(hctx);
>>  	}
>>
>> -	if (!ret)
>> +	if (!ret) {
>> +		if (blk_mq_is_sbitmap_shared(set)) {
>> +			sbitmap_queue_resize(&set->shared_bitmap_tags, nr);
>> +			sbitmap_queue_resize(&set->shared_breserved_tags, nr);
>> +		}
>
> The above change is wrong in case of hctx->sched_tags.
>
Yes, we need to add a marker here indicating whether these are 'normal'
or 'scheduler' tags. That will also help with allocation, as then we
won't need the flag twiddling you complained about above.

>>  		q->nr_requests = nr;
>> +	}
>> +	/*
>> +	 * if ret != 0, q->nr_requests would not be updated, yet the depth
>> +	 * for some hctx may have changed - is that right?
>> +	 */
>>
>>  	blk_mq_unquiesce_queue(q);
>>  	blk_mq_unfreeze_queue(q);
[ .. ]
>> diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
>> index 147185394a25..670e9a949d32 100644
>> --- a/include/linux/blk-mq.h
>> +++ b/include/linux/blk-mq.h
>> @@ -109,6 +109,12 @@ struct blk_mq_tag_set {
>>  	unsigned int		flags;		/* BLK_MQ_F_* */
>>  	void			*driver_data;
>>
>> +	struct sbitmap_queue	shared_bitmap_tags;
>> +	struct sbitmap_queue	shared_breserved_tags;
>> +
>> +	struct sbitmap_queue	sched_shared_bitmap_tags;
>> +	struct sbitmap_queue	sched_shared_breserved_tags;
>> +
>
> The above two fields aren't used in this patch.
>
Left-overs from merging; they will be removed.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@xxxxxxx                              +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer