Thanks for the CC Bart. Manish, if you're going to send a patch to address an issue from another merged patch, the etiquet is to keep the CC list of the original patch the same and include the author of that patch in the loop. On 08/28/24 08:13, Bart Van Assche wrote: > On 8/28/24 7:49 AM, Manish Pandey wrote: > > 'Commit af550e4c9682 ("block/blk-mq: Don't complete locally if > > capacities are different")' enforces to complete the request locally > > only if the submission and completion CPUs have same capacity. > > > > To have optimal IO load balancing or to avoid contention b/w submission > > path and completion path, user may need to complete IO request of large > > capacity CPU(s) on Small Capacity CPU(s) or vice versa. > > > > Hence introduce a QUEUE_FLAG_ALLOW_DIFF_CAPACITY blk queue flag to let > > user decide if it wants to complete the request locally or need an IPI I answered you here https://lore.kernel.org/lkml/20240901171317.bm5z3vplqgdwp4bc@airbuntu/ This approach is not acceptable. I think you need to better explain why rq_affinity=0 is not usable instead of confusing rq_affinity=1 needs to be hacked in this manner. The right extension would be to teach the system how to detect cases where it is better not to keep them on the same LLC/capacity because of scenario XYZ that is known (genericaly and scalably) to break and requires an exception. rq_affinity=0 would give you what you want AFAICT and don't see a reason for this hack. > > even if the capacity of the requesting and completion queue is different. > > This gives flexibility to user to choose best CPU for their completion > > to give best performance for their system. > > I think that the following is missing from the above description: > - Mentioning that this is for an unusual interrupt routing technology > (SoC sends the interrupt to another CPU core than what has been > specified in the smp_affinity mask). > - An explanation why the desired effect cannot be achieved by changing > rq_affinity into 0. It fails to mention a lot of things from the discussion from the previous thread sadly... Including the fact that there's a strange argument about regression on a platform that is easily fixed by using rq_affinity=0, but the argument of not using this is because some other platforms don't need to use rq_affinity=0. I'm not sure if rq_affinity=1 is supposed to work for all cases especially with the specific and custom setup Manish has. Anyway. The submission has a broken CC list that omits a lot of folks from the discussion. > > > block/blk-mq-debugfs.c | 1 + > > block/blk-mq.c | 3 ++- > > block/blk-sysfs.c | 12 ++++++++++-- > > include/linux/blkdev.h | 1 + > > 4 files changed, 14 insertions(+), 3 deletions(-) > > Since the semantics of a sysfs attribute are modified, > Documentation/ABI/stable/sysfs-block should be updated. > > Thanks, > > Bart.