Hi,
I'm not subscribed to the list so please CC me on replies.
I've got a Linux system running the RT kernel with threaded irqs. On
startup we affine the various irq threads to the housekeeping CPUs, but
I recently hit a scenario where, after some days of uptime, we ended up
with a number of NVMe irq threads affined to application cores instead
(not good when we're trying to run low-latency applications).
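
For context, our startup pinning amounts to something like the sketch
below (simplified; the CPU list is a placeholder for our real
housekeeping set, and the PID comes from scanning /proc for irq/*
threads). This works fine for ordinary threaded irqs, which is why the
later re-affining surprised us:

    /* Simplified sketch of our startup pinning: move one threaded irq
     * handler, identified by PID, onto the housekeeping CPUs. The CPU
     * list below is a placeholder for our real housekeeping set. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>

    static const int housekeeping_cpus[] = { 0, 1 };   /* placeholder */

    int main(int argc, char **argv)
    {
            cpu_set_t set;
            unsigned int i;

            if (argc != 2) {
                    fprintf(stderr, "usage: %s <irq-thread-pid>\n", argv[0]);
                    return 1;
            }

            CPU_ZERO(&set);
            for (i = 0; i < sizeof(housekeeping_cpus) / sizeof(housekeeping_cpus[0]); i++)
                    CPU_SET(housekeeping_cpus[i], &set);

            if (sched_setaffinity((pid_t)atoi(argv[1]), sizeof(set), &set)) {
                    perror("sched_setaffinity");
                    return 1;
            }
            return 0;
    }
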
Looking at the code, it appears that the NVMe driver can in some
scenarios (a controller reset, for example) end up calling
pci_alloc_irq_vectors_affinity() after initial system startup, and that
function appears to determine CPU affinity without any regard for
things like "isolcpus" or "cset shield".
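
For illustration, the call looks roughly like the following (paraphrased
from nvme_setup_irqs() in drivers/nvme/host/pci.c; details vary by
kernel version). The resulting masks are computed by the IRQ core's
spreading code from the set of possible CPUs, and as far as I can tell
nothing in that path consults the isolcpus/housekeeping masks:

    /* Paraphrased from drivers/nvme/host/pci.c (varies by kernel
     * version). Vector 0 stays with the admin queue; the IRQ core
     * spreads the remaining vectors across CPUs itself, with no
     * reference to isolcpus or the housekeeping mask. */
    struct irq_affinity affd = {
            .pre_vectors = 1,       /* admin queue vector is not spread */
    };

    result = pci_alloc_irq_vectors_affinity(pdev, 1, nr_io_queues + 1,
                            PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
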
There seem to be other reports of similar issues:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1831566
It looks like some SCSI drivers and virtio_pci_common.c also call
pci_alloc_irq_vectors_affinity(), though I'm not sure whether they
would ever do so after system startup.
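
For reference, the virtio call site has the same shape (paraphrased
from vp_request_msix_vectors() in virtio_pci_common.c; again
version-dependent), so it would go through the same spreading code:

    /* Paraphrased from drivers/virtio/virtio_pci_common.c: the
     * irq_affinity descriptor is passed straight through, so the IRQ
     * core again chooses the CPUs. */
    err = pci_alloc_irq_vectors_affinity(vp_dev->pci_dev, nvectors,
                            nvectors, PCI_IRQ_MSIX, desc);
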
How does it make sense for the PCI subsystem to affine interrupts to
CPUs which have explicitly been designated as "isolated"?
Thanks,
Chris