On 27/04/2021 10:11, Ming Lei wrote:
On Tue, Apr 27, 2021 at 08:52:53AM +0100, John Garry wrote:
On 27/04/2021 00:59, Ming Lei wrote:
Anyway, I'll look at adding code for per-request-queue sched tags to see
if it helps. But I plan to continue to use a per-hctx sched request pool.
Why not switch to a per-hctx sched request pool?
I don't understand. The current code uses a per-hctx sched request pool, and
I said that I don't plan to change that.
I forget why you didn't do that, because for hostwide tags the request
is always 1:1 with the tag, for either sched tags (a real I/O scheduler)
or driver tags (none).
Maybe you want to keep requests local to the hctx, but I have never seen
performance data supporting that point, and the sbitmap queue allocator
is already intelligent enough to allocate a tag freed from the native CPU.
Then you just waste lots of memory; I remember the SCSI request payload
is a bit big.
It's true that we waste a lot of memory on regular static requests when
using hostwide tags today.
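
Just to put rough numbers on that, here's a purely illustrative, standalone
sketch (the queue count, depth, and payload size are assumptions, not
measurements from any real HBA):

#include <stdio.h>

/* Back-of-the-envelope comparison of static request memory for
 * per-hctx pools vs a single hostwide pool when tags are hostwide.
 * All numbers are made up for illustration.
 */
int main(void)
{
	unsigned int nr_hw_queues = 16;		/* hypothetical HBA */
	unsigned int queue_depth = 1024;	/* hostwide can_queue */
	size_t rq_payload = 384 + 512;		/* assumed request struct + SCSI cmd_size */

	/* Per-hctx pools: every hctx holds a full queue_depth of static
	 * requests, although only queue_depth tags exist host-wide.
	 */
	size_t per_hctx = (size_t)nr_hw_queues * queue_depth * rq_payload;

	/* Hostwide pool: a single queue_depth worth of static requests. */
	size_t hostwide = (size_t)queue_depth * rq_payload;

	printf("per-hctx pools: %zu KiB\n", per_hctx / 1024);
	printf("hostwide pool:  %zu KiB\n", hostwide / 1024);
	return 0;
}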
One problem in trying to use a single set of "hostwide" static requests
is that we call blk_mq_init_request(..., hctx_idx, ...) ->
set->ops->init_request(..., hctx_idx, ...) for each static rq, and that
per-hctx_idx init would not work for a single hostwide set of requests.
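
For reference, the path I mean is roughly this (a simplified paraphrase of
the blk-mq static request init code, with page allocation and error
unwinding stripped out, so not the exact upstream functions):

/* hctx_idx is baked into every static request via ->init_request(),
 * which is what a single "hostwide" set of static requests would
 * have to cope with.
 */
static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
			       unsigned int hctx_idx, int node)
{
	if (set->ops->init_request)
		return set->ops->init_request(set, rq, hctx_idx, node);
	return 0;
}

int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
		     unsigned int hctx_idx, unsigned int depth)
{
	unsigned int i;

	for (i = 0; i < depth; i++) {
		struct request *rq = tags->static_rqs[i];

		/* Per-hctx init: no single hctx_idx would be correct
		 * for a hostwide set of static requests.
		 */
		if (blk_mq_init_request(set, rq, hctx_idx, set->numa_node))
			return -ENOMEM;
	}
	return 0;
}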
And I see a similar problem for "request queue-wide" sched static
requests.
Maybe we can improve this in future.
BTW, for the performance issue which Yanhui witnessed with megaraid sas,
do you think it may be because of the IO sched tags issue, i.e. the total
sched tag depth growing versus the driver tags? Are there lots of LUNs?
I can imagine that megaraid sas has a much larger can_queue than scsi_debug :)
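
To put made-up numbers on the question (a standalone sketch; the LUN count,
per-LUN sched depth, and can_queue below are assumptions, not the real
megaraid sas setup):

#include <stdio.h>

/* Illustration of how total sched tag depth summed over LUNs can
 * outgrow the shared driver tag space. All values are hypothetical.
 */
int main(void)
{
	unsigned int nr_luns = 64;		/* hypothetical */
	unsigned int sched_depth_per_lun = 256;	/* e.g. a 2 * BLKDEV_MAX_RQ style default */
	unsigned int can_queue = 5000;		/* hypothetical host-wide driver tag count */

	unsigned int total_sched = nr_luns * sched_depth_per_lun;

	printf("total sched tags across LUNs: %u\n", total_sched);	/* 16384 */
	printf("driver tags (can_queue):      %u\n", can_queue);	/* 5000 */

	/* When total_sched far exceeds can_queue, many requests can sit
	 * holding sched tags while contending for driver tags.
	 */
	return 0;
}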
Thanks,
John