Re: Regarding patch "block/blk-mq: Don't complete locally if capacities are different"

On 8/5/2024 11:22 PM, Bart Van Assche wrote:
> On 8/5/24 10:35 AM, MANISH PANDEY wrote:
>> In our SoCs we manage power/perf balancing by dynamically changing
>> the IRQ affinity based on load. If the load is high, we affine the
>> UFS IRQs to large-cluster CPUs; if the load is low, we affine the
>> IRQs to small-cluster CPUs.
>
> I don't think that this is compatible with the command completion code
> in the block layer core. The blk-mq code is based on the assumption that
> the association of a completion interrupt with a CPU core does not
> change. See also the blk_mq_map_queues() function and its callers.

The IRQ <-> CPU association is established before the start of operation, and we make sure that the CPU receiving the completion interrupt doesn't change while requests are in flight.
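
For reference, the static mapping being referred to is built roughly along these lines (a simplified sketch, not the exact code; recent kernels implement blk_mq_map_queues() on top of group_cpus_evenly() in block/blk-mq-cpumap.c):

/*
 * Simplified sketch: every possible CPU gets assigned to a hardware
 * queue once, at initialization time, and the mapping never changes
 * afterwards. The real code groups CPUs more carefully, but the
 * CPU -> hctx association is equally static.
 */
static void sketch_map_queues(unsigned int *mq_map, unsigned int nr_queues)
{
	unsigned int cpu;

	for_each_possible_cpu(cpu)
		mq_map[cpu] = cpu % nr_queues;
}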

> Is this mechanism even useful? If completion interrupts are always sent
> to the CPU core that submitted the I/O, no interrupts will be sent to
> the large cluster if no code that submits I/O is running on that
> cluster. Sending e.g. all completion interrupts to the large cluster can
> be achieved by migrating all processes and threads to the large cluster.

> migrating all completion interrupts to the large cluster can
> be achieved by migrating all processes and threads to the large
> cluster.

Agree, this can be achieved, but then all the processes and threads would have to be migrated to the large cluster, and that has a power impact. To balance power and perf, this is not the preferred approach for vendors.

This issue mostly affects UFS MCQ devices, which use ESI/MSI IRQs and distribute the ESI IRQs across the completion queues (CQs). We mostly use large-cluster CPUs for binding the IRQs and CQs, and hence complete most requests on the large cluster, which won't be a same-capacity CPU when the request was submitted from the small/mid clusters.
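
To illustrate the binding (a hypothetical sketch, not code from any in-tree driver; the names ufs_sketch_affine_esi and big_cluster_mask are made up for illustration), a driver could spread its per-CQ ESI vectors over the big cluster like this:

/*
 * Hypothetical sketch: pin each per-CQ ESI vector to a CPU in the
 * "big" cluster at initialization time. Illustrative only; real
 * drivers (e.g. ufs-qcom) have their own ESI setup paths.
 */
static void ufs_sketch_affine_esi(const int *esi_irqs, int nr_cqs,
				  const struct cpumask *big_cluster_mask)
{
	int i, cpu = cpumask_first(big_cluster_mask);

	for (i = 0; i < nr_cqs; i++) {
		irq_set_affinity_hint(esi_irqs[i], cpumask_of(cpu));
		cpu = cpumask_next(cpu, big_cluster_mask);
		if (cpu >= nr_cpu_ids)
			cpu = cpumask_first(big_cluster_mask);
	}
}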

> Please use an approach that is supported by the block layer. I don't
> think that dynamically changing the IRQ affinity is compatible with the
> block layer.

For UFS with MCQ, the ESI IRQs are bound at initialization time.
So basically I would like to use the high-performance cluster CPUs to take over a few completions from the mid clusters and take advantage of the high-capacity CPUs. The new change takes this opportunity away from the driver.
So we should be able to use the high-performance CPUs like below:

diff --git a/block/blk-mq.c b/block/blk-mq.c
index e3c3c0c21b55..a4a2500c4ef6 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1164,7 +1164,7 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq)
        if (cpu == rq->mq_ctx->cpu ||
            (!test_bit(QUEUE_FLAG_SAME_FORCE, &rq->q->queue_flags) &&
             cpus_share_cache(cpu, rq->mq_ctx->cpu) &&
-            cpus_equal_capacity(cpu, rq->mq_ctx->cpu)))
+            arch_scale_cpu_capacity(cpu) >= arch_scale_cpu_capacity(rq->mq_ctx->cpu)))
                return false;

This way the driver can use the best possible CPUs for its use case.
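
For comparison, cpus_equal_capacity() is, as far as I remember, essentially the following in kernel/sched/core.c (paraphrased from memory, so details may differ); the strict equality there is exactly what the >= above relaxes:

/*
 * Paraphrased from memory, may differ from the in-tree version:
 * if asymmetric-capacity scheduling is not active, all CPUs are
 * treated as equal; otherwise the scaled capacities must match
 * exactly for the completion to stay local.
 */
bool cpus_equal_capacity(int cpu1, int cpu2)
{
	if (!sched_asym_cpucap_active())
		return true;

	if (cpu1 == cpu2)
		return true;

	return arch_scale_cpu_capacity(cpu1) == arch_scale_cpu_capacity(cpu2);
}

With the >= variant, a request submitted on a mid-cluster CPU could still be completed locally on a big-cluster CPU, while a completion landing on a lower-capacity CPU than the submitter would still take the IPI path back to the submitting CPU.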

> Thanks,
>
> Bart.




