On 23/04/2021 09:43, John Garry wrote:
> 1) randread test on ibm-x3850x6[*] with deadline
>
>              | IOPS | FIO CPU util
> -------------+------+------------------------
> hosttags     |  94k | usr=1.13%, sys=14.75%
> -------------+------+------------------------
> non hosttags | 124k | usr=1.12%, sys=10.65%
Getting these results for mq-deadline:

             | IOPS | FIO CPU util
-------------+------+------------------------
hosttags     | 100K | usr=1.52%, sys=4.47%
-------------+------+------------------------
non-hosttags | 109K | usr=1.74%, sys=5.49%
So I still don't see the same CPU usage increase for hosttags.
But throughput is down, so at least I can check on that...
> 2) randread test on ibm-x3850x6[*] with none
>
>              | IOPS | FIO CPU util
> -------------+------+------------------------
> hosttags     | 120k | usr=0.89%, sys=6.55%
> -------------+------+------------------------
> non hosttags | 121k | usr=1.07%, sys=7.35%
Here I get:

             | IOPS | FIO CPU util
-------------+------+------------------------
hosttags     | 113K | usr=2.04%, sys=5.83%
-------------+------+------------------------
non-hosttags | 108K | usr=1.71%, sys=5.05%
Hi Ming,
One thing I noticed for the non-hosttags scenario is that I am often
hitting the IO scheduler tag exhaustion path in blk_mq_get_tag();
here's some perf output:
|--15.88%--blk_mq_submit_bio
| |
| |--11.27%--__blk_mq_alloc_request
| | |
| | --11.19%--blk_mq_get_tag
| | |
| | |--6.00%--__blk_mq_delay_run_hw_queue
| | | |
...
| | |
| | |--3.29%--io_schedule
| | | |
....
| | | |
| | | --1.32%--io_schedule_prepare
| | |
...
| | |
| | |--0.60%--sbitmap_finish_wait
| | |
| | --0.56%--sbitmap_get
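
If it helps, this is roughly the path I mean - paraphrased from
blk_mq_get_tag() in block/blk-mq-tag.c (I'm going from ~v5.12 code and
have trimmed the ctx-migration and wait-queue bookkeeping, so treat it
as a sketch rather than the exact source). It lines up with the
io_schedule/sbitmap symbols in the profile above:

unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
{
	struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
	struct sbitmap_queue *bt = tags->bitmap_tags;
	struct sbq_wait_state *ws;
	DEFINE_SBQ_WAIT(wait);
	int tag;

	/* Fast path: grab a free bit from the sched tags sbitmap. */
	tag = __blk_mq_get_tag(data, bt);
	if (tag != BLK_MQ_NO_TAG)
		goto found_tag;

	if (data->flags & BLK_MQ_REQ_NOWAIT)
		return BLK_MQ_NO_TAG;

	/* Exhaustion path: all sched tags in use, wait for a free one. */
	ws = bt_wait_ptr(bt, data->hctx);
	do {
		sbitmap_prepare_to_wait(bt, ws, &wait, TASK_UNINTERRUPTIBLE);

		tag = __blk_mq_get_tag(data, bt);
		if (tag != BLK_MQ_NO_TAG)
			break;

		/*
		 * Kick the hw queue in case dispatching is what frees up
		 * tags - this is where __blk_mq_delay_run_hw_queue() shows
		 * up in the profile above.
		 */
		blk_mq_run_hw_queue(data->hctx, false);

		tag = __blk_mq_get_tag(data, bt);
		if (tag != BLK_MQ_NO_TAG)
			break;

		io_schedule();

		/* (ctx migration and wait-queue re-selection omitted) */
	} while (1);

	sbitmap_finish_wait(bt, ws, &wait);

found_tag:
	return tag + tags->nr_reserved_tags;
}

So once the sched tags are exhausted, every allocation takes the slow
path of kicking the hw queue and sleeping in io_schedule() until a
request completes and frees a tag.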
I don't see this for hostwide tags - this may be because we have
multiple hctx, and the IO sched tags are per hctx, so there is less
chance of exhaustion. But this is not about hostwide tags specifically;
it would apply to multiple HW queues in general. As I understood, sched
tags were meant to be per request queue, right? Am I reading this
correctly?
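
For reference, this is my reading of where the sched tag depth comes
from - a trimmed-down paraphrase of blk_mq_init_sched() in
block/blk-mq-sched.c (~v5.12), not a verbatim copy:

int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
{
	struct blk_mq_hw_ctx *hctx;
	unsigned int i;
	int ret;

	/*
	 * Sched tag depth is derived once per request queue: double the
	 * smaller of the hw queue depth and BLKDEV_MAX_RQ (128)...
	 */
	q->nr_requests = 2 * min_t(unsigned int, q->tag_set->queue_depth,
				   BLKDEV_MAX_RQ);

	/*
	 * ...but a separate sched_tags map of that depth is then
	 * allocated for each hctx, so the total sched tag space scales
	 * with nr_hw_queues.
	 */
	queue_for_each_hw_ctx(q, hctx, i) {
		ret = blk_mq_sched_alloc_tags(q, hctx, i);
		if (ret)
			goto err;
	}

	/* (elevator init and error handling omitted) */
}

So the depth in q->nr_requests is computed per request queue, but a
sched_tags map of that size exists per hctx. If I'm reading it right,
that would fit with hitting the exhaustion path only in the
fewer-hctx (non-hosttags) case.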
I can vaguely remember some debate on this, but could not find the
thread. Hannes did have a patch related to this topic, but it was dropped:
https://lore.kernel.org/linux-scsi/20191202153914.84722-7-hare@xxxxxxx/#t
Thanks,
John