[RFC PATCH] irqchip/gic-v3: Try to distribute irq affinity to the less distributed CPU

marc.zyngier@xxxxxxx (Marc Zyngier) · Sat, 19 May 2018 11:04:06 +0100

On Sat, 19 May 2018 07:35:55 +0800
Shawn Lin <shawn.lin at rock-chips.com> wrote:

> Hi Marc,
> 
> On 2018/5/18 18:05, Marc Zyngier wrote:
> > Hi Shawn,
> > 
> > On 18/05/18 10:47, Shawn Lin wrote:  
> >> gic-v3 seems to only support distributing a hwirq to one CPU, regardless of
> >> the mask set via /proc/irq/*/smp_affinity.
> >>
> >> My RK3399 platform has 6 CPUs and I was trying to bind the emmc
> >> irq, whose hwirq is 43 and virq is 30, to all cores:
> >>
> >> echo 3f > /proc/irq/30/smp_affinity
> >>
> >> but the I/O test still shows the irq was fired to CPU0. For real
> >> use cases, we may want to distribute different hwirqs to different cores,
> >> ideally picking the core with the fewest irqs bound to it.
> >> Otherwise, with the current implementation, gic-v3 always routes the irq
> >> to the first CPU in the mask, which is what cpumask_any_and actually does
> >> in practice on my platform.
> > 
> > That's because GICv3 cannot broadcast the interrupt to all CPUs, and has
> > to pick one.
> >   
> 
> Yep, that's what I got from the GICv3 TRM. Btw, IIRC, the GIC-400 (a
> GICv2 implementation) on the RK3288 platform can broadcast interrupts to
> all CPUs, so I was a bit surprised to learn that GICv3 cannot do it, as v3
> sounds like it should be more powerful than v2. :)))

The GICv2 1:N feature is really nasty, actually. It places an overhead
on all CPUs (they will all take an interrupt and only one will
actually service it, while the others may only see a spurious
interrupt). So in practice, you don't really gain anything, unless your
CPUs are completely idle.
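
As a back-of-the-envelope illustration (a hypothetical cost model, not
measured data), here is a small C sketch that counts interrupt entries
under 1:N delivery versus a single target. The struct and function names
are made up for this example, and the "wasted" figure is only an upper
bound, since the other CPUs may or may not actually see a spurious
interrupt.

```c
#include <stdint.h>

/* Hypothetical cost model, for illustration only. */
struct irq_cost {
	uint64_t taken;    /* interrupt entries summed over all CPUs */
	uint64_t serviced; /* entries that actually handle the device */
	uint64_t wasted;   /* upper bound on entries that do no useful work */
};

/* 1:N delivery: every targeted CPU is interrupted, one services it. */
static struct irq_cost cost_1_to_n(uint64_t nr_irqs, unsigned int nr_cpus)
{
	struct irq_cost c = { 0, 0, 0 };

	if (nr_cpus == 0)
		return c; /* nothing can be delivered */

	c.taken = nr_irqs * nr_cpus;
	c.serviced = nr_irqs;
	c.wasted = c.taken - c.serviced;
	return c;
}

/* Single target: exactly one CPU is interrupted per interrupt. */
static struct irq_cost cost_single_target(uint64_t nr_irqs)
{
	struct irq_cost c = { nr_irqs, nr_irqs, 0 };

	return c;
}
```

With 1000 interrupts on a hypothetical 4-CPU system, cost_1_to_n(1000, 4)
gives 4000 entries for 1000 serviced interrupts, against 1000 entries
with a single target.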

On a busy system, you see an actual performance reduction of your
overall throughput, for a very small benefit in latency. That's why I
refuse to support this feature in Linux. This may be useful on
latency-sensitive systems where the software is too primitive to do
effective balancing, but Linux is a bit better than that. Thankfully,
GICv3 got rid of this misfeature, and I'm making sure it won't come
back.
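
To make the "pick one" step concrete, here is a rough user-space sketch
(not the gic-v3 driver code) of choosing a single target CPU from an
affinity mask. The function names, the NR_CPUS value and the per-CPU
"assigned" counters are assumptions for illustration only. pick_first()
mimics the "first CPU in the mask" behaviour reported in this thread, and
pick_least_loaded() is the kind of count-based choice the patch is after.

```c
#include <stdint.h>

#define NR_CPUS 6 /* assumption: six cores, as on the RK3399 in this thread */

/* Return the lowest-numbered CPU set in the mask, or -1 if it is empty. */
static int pick_first(uint32_t mask)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (UINT32_C(1) << cpu))
			return cpu;
	return -1;
}

/*
 * Return the allowed CPU with the fewest interrupts already assigned.
 * Ties go to the lowest CPU number; an empty mask returns -1.
 */
static int pick_least_loaded(uint32_t mask,
			     const unsigned int assigned[NR_CPUS])
{
	int best = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(mask & (UINT32_C(1) << cpu)))
			continue; /* CPU not allowed by the affinity mask */
		if (best < 0 || assigned[cpu] < assigned[best])
			best = cpu;
	}
	return best;
}
```

With the mask 0x3f and assigned counts {5, 3, 0, 2, 0, 1}, pick_first()
returns CPU0 while pick_least_loaded() returns CPU2. Note that this is
only the selection step: a static count says nothing about how busy each
CPU actually is.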

<rant>

Overall, interrupt affinity is too critical to be left to the hardware,
which has no knowledge of how busy the CPUs are. It is a bit like
implementing the scheduler in HW. It works very well if your SW is
minimal enough. Grow a bigger system, and HW scheduling becomes a
total pain. That's why you only see such a feature on small
microcontrollers, and not on larger CPUs.

</rant>

> 
> >>
> >> So I was thinking of recording how many hwirqs are distributed to each
> >> core and trying to pick the least used one.
> >>
> >> This patch is rather rough and only lightly tested on my board. I'm
> >> just asking for your advice. :)
> > 
> > My advice is not to do this in the kernel. Why don't you run something
> > like irqbalance? It will watch the interrupt usage and move
> > interrupts around.
> 
> I will take a look at how irqbalance works in practice. Thanks
> for your advice.

Let me know how it goes.

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.