On 1/31/2019 4:41 PM, Logan Gunthorpe wrote:
On 2019-01-31 3:46 p.m., Dave Jiang wrote:
I believe irqbalance writes to the file /proc/irq/N/smp_affinity. So
maybe take a look at the code that starts from there and see if it would
have any impact on your stuff.
Ok, well on my system I can write to smp_affinity all day and the
MSI interrupts still work fine.
Maybe your code is ok then. If the stats show up in /proc/interrupts
then you can see it moving to different cores.
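
A minimal sketch of the check described above, for anyone who wants to
script it. It assumes the usual /proc/interrupts layout (a header row of
CPU names, then one "IRQ: count count ..." row per interrupt) and the
hex bitmask format of /proc/irq/N/smp_affinity for CPUs 0 to 31. The
helper names and the sample IRQ number are made up for illustration;
nothing here comes from the NTB patches themselves.

```python
def cpus_to_affinity_mask(cpus):
    """Hex bitmask string for a list of CPU numbers (bit N = CPU N).

    Assumes CPUs 0..31; larger systems use comma-separated 32-bit groups.
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def irq_counts(interrupts_text, irq):
    """Per-CPU counts for `irq` parsed from the text of /proc/interrupts.

    Returns None if the IRQ is not listed.
    """
    lines = interrupts_text.splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if label.strip() == str(irq):
            return [int(x) for x in rest.split()[:ncpus]]
    return None


def moved_cpus(before, after):
    """CPUs whose count increased between two snapshots."""
    return [cpu for cpu, (b, a) in enumerate(zip(before, after)) if a > b]
```

To use it, snapshot /proc/interrupts, write cpus_to_affinity_mask([1])
to /proc/irq/N/smp_affinity as root, generate some traffic, snapshot
again, and compare the two with moved_cpus(). If the interrupts keep
arriving and the counts start climbing on the new CPU, the affinity
change took effect without breaking delivery.
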
The MSI code is a bit difficult to trace and audit with all the
different chips and the parent chips which I don't have a good
understanding of. But I can definitely see that it could be possible for
some chips to change the address, as a write to smp_affinity will
sometimes end up calling msi_domain_set_affinity(), which does seem to
recompose the message and write it back to the chip.
So, I could relatively easily add a callback to msi_desc to catch this
and resend the MSI address/data. However, I'm not sure how this is ever
done atomically. It seems like there would be a race while the device
updates its address where old interrupts could be triggered. This race
would be much longer for us when sending this information over the NTB
link. Though, I guess if the only change is that it encodes CPU
information in the address then that would not be an issue. However, I'm
not sure I can say that for certain without a comprehensive
understanding of all the IRQ chips.
Any thoughts on this?
Yeah, I'm not sure what to do about it either, as I'm not very familiar
with that area. Just making note of what I encountered. And you
are right, the updated info has to go over NTB for the other side to
write to the updated place. So there's a lot of latency involved.
Logan