On 1/31/2019 3:39 PM, Logan Gunthorpe wrote:
On 2019-01-31 1:58 p.m., Dave Jiang wrote:
On 1/31/2019 1:48 PM, Logan Gunthorpe wrote:
On 2019-01-31 1:20 p.m., Dave Jiang wrote:
Does this work when the system moves the MSI vector either via software
(irqbalance) or BIOS APIC programming (some modes cause round robin
behavior)?
I don't know how irqbalance works, and I'm not sure what you are
referring to by BIOS APIC programming, however I would expect these
things would not be a problem.
The MSI code I'm presenting here doesn't do anything crazy with the
interrupts, it allocates and uses them just as any PCI driver would. The
only real difference here is that instead of a piece of hardware sending
the IRQ TLP, it will be sent through the memory window (which, from the
OS's perspective, is just coming from an NTB hardware proxy alias).
Logan
Right. I did that as a hack a while back for some silicon errata
workaround. When the vector moves, the address for the LAPIC changes. So
unless it gets updated, you end up writing to the old location and lose
all the new interrupts. irqbalance is a user daemon that rotates the
system interrupts around to ensure that not all interrupts are pinned on
a single core.
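
To make the failure mode above concrete: on x86 the MSI message address carries the destination APIC ID, so when the interrupt is retargeted to another CPU, the address a write has to hit changes with it. Below is a minimal sketch, assuming the standard x86 MSI address layout (0xFEE in bits 31:20, destination ID in bits 19:12, redirection hint in bit 3, destination mode in bit 2) and 8-bit xAPIC IDs. The function name is hypothetical and is not a kernel API.

```python
# Assumption: standard x86 MSI address layout with 8-bit xAPIC destination IDs.
MSI_ADDRESS_BASE = 0xFEE00000


def msi_address(dest_apic_id, redirection_hint=0, logical_mode=0):
    """Compose the 32-bit MSI message address for a destination APIC ID.

    Hypothetical helper for illustration only.
    """
    if not 0 <= dest_apic_id <= 0xFF:
        raise ValueError("xAPIC destination ID must fit in 8 bits")
    if redirection_hint not in (0, 1) or logical_mode not in (0, 1):
        raise ValueError("redirection_hint and logical_mode are single bits")
    return (
        MSI_ADDRESS_BASE
        | (dest_apic_id << 12)      # bits 19:12: destination ID
        | (redirection_hint << 3)   # bit 3: redirection hint
        | (logical_mode << 2)       # bit 2: destination mode
    )


if __name__ == "__main__":
    before = msi_address(0)  # interrupt targeted at APIC ID 0
    after = msi_address(2)   # same interrupt after being moved to APIC ID 2
    print(hex(before), hex(after))
```

A peer that cached the first address keeps writing there after the move, which is the lost-interrupt case described above unless the cached address is refreshed.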
Yes, that would be a problem if something changes the MSI vectors out
from under us. Seems like that would be a bit difficult to do even with
regular hardware. So far I haven't seen anything that would do that. If
you know of where in the kernel this happens I'd be interested in
getting a pointer to the flow in the code. If that is the case this MSI
stuff will need to get much more complicated...
I believe irqbalance writes to the file /proc/irq/N/smp_affinity. So
maybe take a look at the code that starts from there and see if it would
have any impact on your stuff.
I think it's enabled by default on several distros.
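
For checking whether an interrupt's affinity is being changed while traffic runs, the mask in /proc/irq/N/smp_affinity can be decoded from user space. This is a rough sketch, assuming the usual format of comma-separated 32-bit hex groups with the most significant group first. The function names and the `proc_root` parameter are made up for illustration.

```python
from pathlib import Path


def parse_affinity_mask(text):
    """Turn an smp_affinity string such as 'ff' or '00000001,00000000'
    into a sorted list of CPU numbers."""
    mask = 0
    # Groups are 32 bits each, most significant group first.
    for group in text.strip().split(","):
        mask = (mask << 32) | int(group, 16)
    cpus = []
    cpu = 0
    while mask:
        if mask & 1:
            cpus.append(cpu)
        mask >>= 1
        cpu += 1
    return cpus


def read_irq_affinity(irq, proc_root="/proc"):
    """Read and decode the affinity mask of one IRQ (hypothetical helper)."""
    path = Path(proc_root) / "irq" / str(irq) / "smp_affinity"
    return parse_affinity_mask(path.read_text())
```

Polling `read_irq_affinity()` for the relevant IRQ number during a test run would show whether irqbalance has rewritten the mask; it does not by itself show whether the MSI address and data seen by the device changed.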
Although MSI-X has nothing to do with the IOAPIC, the mode in which the
APIC is programmed can influence how the interrupts are delivered.
There are certain Intel platforms (I don't know if AMD does anything
like that) that put the IOAPIC in a configuration that causes the
interrupts to be moved in a round robin fashion. I think it's physical
flat mode? I don't quite recall. Normally on the low end Xeons. It's
probably worth doing a test run with the irqbalance daemon running to
make sure your traffic stream doesn't all of a sudden stop.
I've tested with irqbalance running and haven't found any noticeable
difference.
Logan