On 08/08/24 11:35, MANISH PANDEY wrote:
>
>
> On 8/5/2024 11:22 PM, Bart Van Assche wrote:
> > On 8/5/24 10:35 AM, MANISH PANDEY wrote:
> > > In our SoCs we manage power and perf balancing by dynamically
> > > changing the IRQ affinity based on the load. Say if we have more
> > > load, we assign the UFS IRQs to large-cluster CPUs, and if we have
> > > less load, we affine the IRQs to small-cluster CPUs.
> >
> > I don't think that this is compatible with the command completion code
> > in the block layer core. The blk-mq code is based on the assumption
> > that the association of a completion interrupt with a CPU core does
> > not change. See also the blk_mq_map_queues() function and its callers.
> >
> The IRQ <-> CPU binding is set up before the start of the operation, and
> that makes sure the completion interrupt CPU doesn't change.
>
> > Is this mechanism even useful? If completion interrupts are always
> > sent to the CPU core that submitted the I/O, no interrupts will be
> > sent to the large cluster if no code that submits I/O is running on
> > that cluster. Sending e.g. all completion interrupts to the large
> > cluster can be achieved by migrating all processes and threads to the
> > large cluster.
> >
> >> Sending e.g. all completion interrupts to the large cluster can
> >> be achieved by migrating all processes and threads to the large
> >> cluster.
>
> Agreed, this can be achieved, but then all the processes and threads have
> to be migrated to the large cluster, and that has power impacts. Hence,
> to balance power and perf, this is not the preferred way for vendors.

I don't get why rq_affinity=1 is compatible with this case. Isn't this
custom setup a fully managed system by you, which means you want
rq_affinity=0?

What do you lose if you move to rq_affinity=0?

>
> > > This issue mostly affects UFS MCQ devices, which use ESI/MSI IRQs
> > > and have the ESI IRQs distributed across the CQs.
> > > Mostly we bind the IRQs and CQs to large-cluster CPUs and hence
> > > complete more requests on the large cluster, which won't be a CPU of
> > > the same capacity as the submitter, since requests may come from the
> > > small/mid clusters.
> >
> > Please use an approach that is supported by the block layer. I don't
> > think that dynamically changing the IRQ affinity is compatible with
> > the block layer.
>
> For UFS with MCQ, the ESI IRQs are bound at initialization time.
> So basically I would like to use the high-performance cluster CPUs to
> migrate a few completions from the mid clusters and take advantage of
> the high-capacity CPUs. The new change takes this opportunity away from
> the driver.

It doesn't. From what I read, you want to fully customize where your
completions run without any interference from the block layer. Disable
rq_affinity and do what you want? Your description says you don't want the
block layer to interfere with your affinity setup.

> So basically we should be able to use the high-performance CPUs like
> below:
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index e3c3c0c21b55..a4a2500c4ef6 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1164,7 +1164,7 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq)
>  	if (cpu == rq->mq_ctx->cpu ||
>  	    (!test_bit(QUEUE_FLAG_SAME_FORCE, &rq->q->queue_flags) &&
>  	     cpus_share_cache(cpu, rq->mq_ctx->cpu) &&
> -	     cpus_equal_capacity(cpu, rq->mq_ctx->cpu)))
> +	     arch_scale_cpu_capacity(cpu) >= arch_scale_cpu_capacity(rq->mq_ctx->cpu)))
>  		return false;
>
> This way the driver can use the best possible CPUs for its use case.
>
> >
> > Thanks,
> >
> > Bart.
> >
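
To make the effect of the proposed hunk concrete, here is a small userspace
sketch (not kernel code, purely an illustration): it compares two CPUs the
way blk_mq_complete_need_ipi() would, using the
/sys/devices/system/cpu/cpuN/cpu_capacity attribute (exposed on arm64 and
other users of the generic arch topology code) as a stand-in for
arch_scale_cpu_capacity(). The "==" test mirrors what cpus_equal_capacity()
effectively checks today; the ">=" test mirrors the relaxation in the diff
above. The file name capcmp.c and the program itself are hypothetical and
only for demonstration.

/*
 * capcmp.c - userspace illustration only, not kernel code.
 *
 * Reads the relative capacity of two CPUs from
 * /sys/devices/system/cpu/cpuN/cpu_capacity and applies both predicates:
 * the "==" check that cpus_equal_capacity() effectively performs today,
 * and the ">=" check proposed in the diff above.
 *
 * Build: cc -o capcmp capcmp.c
 * Usage: ./capcmp <completion-cpu> <submission-cpu>
 */
#include <stdio.h>
#include <stdlib.h>

static long cpu_capacity(int cpu)
{
	char path[64];
	FILE *f;
	long cap = -1;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/cpu%d/cpu_capacity", cpu);
	f = fopen(path, "r");
	if (!f)
		return -1;		/* attribute not exposed on this system */
	if (fscanf(f, "%ld", &cap) != 1)
		cap = -1;
	fclose(f);
	return cap;
}

int main(int argc, char **argv)
{
	long completion_cap, submission_cap;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <completion-cpu> <submission-cpu>\n",
			argv[0]);
		return 1;
	}

	completion_cap = cpu_capacity(atoi(argv[1]));
	submission_cap = cpu_capacity(atoi(argv[2]));
	if (completion_cap < 0 || submission_cap < 0) {
		fprintf(stderr, "cpu_capacity not available\n");
		return 1;
	}

	/* Current behaviour: complete locally only on an equal-capacity CPU. */
	printf("equal capacity (current): %s\n",
	       completion_cap == submission_cap ?
	       "complete locally" : "IPI back to submitter");

	/* Proposed behaviour: also allow a higher-capacity completion CPU. */
	printf("capacity >= (proposed):   %s\n",
	       completion_cap >= submission_cap ?
	       "complete locally" : "IPI back to submitter");

	return 0;
}

Note that this only models the capacity part of the check; in the kernel
the same-cache and QUEUE_FLAG_SAME_FORCE conditions still apply first.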