[+cc Christoph, Thomas, Nitesh]

On Mon, Oct 12, 2020 at 09:49:37AM -0600, Chris Friesen wrote:
> I've got a linux system running the RT kernel with threaded irqs.  On
> startup we affine the various irq threads to the housekeeping CPUs, but I
> recently hit a scenario where after some days of uptime we ended up with a
> number of NVME irq threads affined to application cores instead (not good
> when we're trying to run low-latency applications).

pci_alloc_irq_vectors_affinity() basically just passes affinity
information through to kernel/irq/affinity.c, and the PCI core doesn't
change affinity after that.

> Looking at the code, it appears that the NVME driver can in some
> scenarios end up calling pci_alloc_irq_vectors_affinity() after initial
> system startup, which seems to determine CPU affinity without any regard
> for things like "isolcpus" or "cset shield".
>
> There seem to be other reports of similar issues:
>
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1831566
>
> It looks like some SCSI drivers and virtio_pci_common.c will also call
> pci_alloc_irq_vectors_affinity(), though I'm not sure if they would ever
> do it after system startup.
>
> How does it make sense for the PCI subsystem to affine interrupts to
> CPUs which have explicitly been designated as "isolated"?

This recent thread may be useful:

  https://lore.kernel.org/linux-pci/20200928183529.471328-1-nitesh@xxxxxxxxxx/

It contains a patch to "Limit pci_alloc_irq_vectors() to housekeeping
CPUs".  I'm not sure that patch summary is 100% accurate because IIUC
that particular patch only reduces the *number* of vectors allocated
and does not actually *limit* them to housekeeping CPUs.

Bjorn
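
For illustration only, here is a rough sketch of the distinction above, as
seen from a driver: clamping the requested vector count to the number of
housekeeping CPUs (roughly the idea in the patch referenced in that thread)
changes how *many* vectors get allocated, but the actual spreading across
CPUs is still done by kernel/irq/affinity.c and can still land vectors on
isolated CPUs.  This is not the actual patch; foo_setup_irqs() is a made-up
driver helper, and it assumes the HK_FLAG_MANAGED_IRQ housekeeping mask
(isolcpus=managed_irq,...) is what we want to honor.

	#include <linux/pci.h>
	#include <linux/cpumask.h>
	#include <linux/sched/isolation.h>

	/* Hypothetical driver helper, for illustration only */
	static int foo_setup_irqs(struct pci_dev *pdev, unsigned int want)
	{
		unsigned int hk_cpus;

		/* CPUs not isolated for managed-IRQ purposes */
		hk_cpus = cpumask_weight(housekeeping_cpumask(HK_FLAG_MANAGED_IRQ));

		/*
		 * This only reduces how many vectors we ask for; the
		 * spreading in kernel/irq/affinity.c may still place
		 * some of them on isolated CPUs.
		 */
		if (hk_cpus && hk_cpus < want)
			want = hk_cpus;

		return pci_alloc_irq_vectors(pdev, 1, want,
					     PCI_IRQ_MSIX | PCI_IRQ_AFFINITY);
	}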