And we should be careful to handle the multiple reply queue case, given the queue
shouldn't be stopped or quiesced because other reply queues are still active.
The new CPUHP state for blk-mq should be invoked after the to-be-offline
CPU is quiesced and before it becomes offline.
Hi John,
Hi Ming,
Thinking about this issue further, one doable solution so far is to
expose reply queues as blk-mq hw queues, as done by the following patchset:
https://lore.kernel.org/linux-block/20180205152035.15016-1-ming.lei@xxxxxxxxxx/
I thought that this patchset had fundamental issues, in terms of working
for all types of hosts. FYI, I did the backport of latest hisi_sas_v3 to
v4.15 with this patchset (as you may have noticed in my git send
mistake), but we have not got to test it yet.
On a related topic, we did test exposing reply queues as blk-mq hw
queues and generating the host-wide tag internally in the LLDD with
sbitmap, and unfortunately we were experiencing a significant
performance hit, roughly 2300K -> 1800K IOPS for 4K reads.
We need to test this further. I don't understand why we get such a big hit.
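
For readers following along, generating a host-wide tag inside the LLDD
can be pictured roughly as below. This is only a user-space sketch under
stated assumptions, not the hisi_sas code and not the kernel sbitmap API:
the names host_tags, host_tag_get() and host_tag_put() are hypothetical,
and the depth of 64 is chosen only so that one atomic word holds the map.

```c
#include <stdatomic.h>
#include <stdint.h>

#define HOST_TAGS 64	/* hypothetical host-wide depth: one 64-bit word */

/* One bitmap shared by every hw queue; bit n set means tag n is in flight. */
struct host_tags {
	_Atomic uint64_t map;
};

/* Claim the lowest free host-wide tag, or return -1 if all are busy. */
static int host_tag_get(struct host_tags *ht)
{
	uint64_t old = atomic_load(&ht->map);

	while (old != UINT64_MAX) {
		int tag = 0;

		/* Find the lowest clear bit in our snapshot of the map. */
		while (old & (1ULL << tag))
			tag++;

		/* On failure the CAS refreshes 'old' and we retry. */
		if (atomic_compare_exchange_weak(&ht->map, &old,
						 old | (1ULL << tag)))
			return tag;
	}
	return -1;
}

/* Release a tag on completion, from whichever reply queue saw it. */
static void host_tag_put(struct host_tags *ht, int tag)
{
	atomic_fetch_and(&ht->map, ~(1ULL << tag));
}
```

Every submitting CPU hits the same word here, which makes it a contention
point. The kernel's sbitmap spreads its bits across cache lines to reduce
exactly that, so this sketch shows the idea only and says nothing about
where the measured hit comes from.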
In that patchset, global host-wide tags are shared by all blk-mq hw queues.
We can also remove all the reply_map stuff in drivers, and then solve the
problem of draining in-flight requests during CPU unplug in a generic way.
So you're saying that removing this reply queue stuff can make the
solution to the problem more generic, but do you have an idea of the
overall solution?
Last time, it was reported that the patchset causes a performance regression,
which is actually caused by duplicated io accounting in
blk_mq_queue_tag_busy_iter(), and which should be easy to fix.
What do you think of this approach?
It would still be good to have a forward port of this patchset for
testing, if we're serious about it. Or at least this bug you mention fixed.
thanks again,
John
Thanks,
Ming Lei