On 15/11/2019 17:57, Bart Van Assche wrote:
On 11/15/19 2:24 AM, John Garry wrote:
Bart Van Assche wrote:
> How about sharing tag sets across hardware
> queues, e.g. like in the (totally untested) patch below?
So this is similar in principle what Ming Lei came up with here:
https://lore.kernel.org/linux-block/20190531022801.10003-1-ming.lei@xxxxxxxxxx/
However your implementation looks neater, which is good.
My concern with this approach is that we can't differentiate which
tags are allocated for which hctx, and sometimes we need to know that.
Hi Bart,
An example here was blk_mq_queue_tag_busy_iter(), which iterates the
bits for each hctx. This would just be broken by that change, unless
we record which bits are associated with each hctx.
I disagree. In bt_iter() I added " && rq->mq_hctx == hctx" such that
blk_mq_queue_tag_busy_iter() only calls the callback function for
matching (hctx, rq) pairs.
OK, I see. I assumed that rq->mq_hctx was statically set when we
initially allocate the static requests per hctx; but that doesn’t appear
so - it's set in blk_mq_get_request()->blk_mq_rq_ctx_init().
Another example was __blk_mq_tag_idle(), which looks problematic.
Please elaborate.
Again, this was for the same reason being that I thought we could not
differentiate which rqs were associated with which hctx.
For debugfs, when we examine
/sys/kernel/debug/block/.../hctxX/tags_bitmap, wouldn't that be the
tags for all hctx (hctx0)?
For debugging reasons, I would say we want to know which tags are
allocated for a specific hctx, as this is tightly related to the
requests for that hctx.
That is an open issue in the patch I posted and something that needs to
be addressed. One way to address this is to change the
sbitmap_bitmap_show() calls into calls to a function that only shows
those bits for which rq->mq_hctx == hctx.
Yeah, understood.
@@ -341,8 +341,11 @@ void blk_mq_tagset_busy_iter(struct
blk_mq_tag_set *tagset,
int i;
for (i = 0; i < tagset->nr_hw_queues; i++) {
- if (tagset->tags && tagset->tags[i])
+ if (tagset->tags && tagset->tags[i]) {
blk_mq_all_tag_busy_iter(tagset->tags[i], fn, priv);
As I mentioned earlier, wouldn't this iterate over all tags for all
hctx's, when we just want the tags for hctx[i]?
Thanks,
John
[Not trimming reply for future reference]
+ if (tagset->share_tags)
+ break;
+ }
}
}
EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
Since blk_mq_tagset_busy_iter() loops over all hardware queues all what
is changed is the order in which requests are examined. I am not aware
of any block driver that calls blk_mq_tagset_busy_iter() and that
depends on the order of the requests passed to the callback function.
OK, fine.
So, to me, this approach also seems viable then.
I am however not so happy with how we use blk_mq_tag_set.tags[0] for the
shared tags; I would like to use blk_mq_tag_set.shared_tags and make
blk_mq_tag_set.tags[] point at blk_mq_tag_set.shared_tags or maybe not
blk_mq_tag_set.tags[] at all. However maybe that change may be more
intrusive.
And another more real concern is that we miss a check somewhere for
rq->mq_hctx == hctx when examining the bits on the shared tags.
Thanks,
John