Re: PCI, isolcpus, and irq affinity

On 12 Oct 2020, at 9:58, Bjorn Helgaas wrote:

[+cc Christoph, Thomas, Nitesh]

On Mon, Oct 12, 2020 at 09:49:37AM -0600, Chris Friesen wrote:
I've got a Linux system running the RT kernel with threaded IRQs.  On startup we affine the various IRQ threads to the housekeeping CPUs, but I recently hit a scenario where, after some days of uptime, we ended up with a number of NVMe IRQ threads affined to application cores instead (not good when we're trying to run low-latency applications).

pci_alloc_irq_vectors_affinity() basically just passes affinity
information through to kernel/irq/affinity.c, and the PCI core doesn't
change affinity after that.

Looking at the code, it appears that the NVMe driver can in some scenarios
end up calling pci_alloc_irq_vectors_affinity() after initial system
startup, which seems to determine CPU affinity without any regard for things
like "isolcpus" or "cset shield".
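
For anyone who wants to check whether a system has ended up in this state, here is a minimal Python sketch (not part of the original message). It assumes the isolated set is exposed at /sys/devices/system/cpu/isolated and that per-IRQ affinity can be read from /proc/irq/<N>/smp_affinity_list. As far as I know, the sysfs file reflects the "isolcpus" boot parameter and not cpusets created with "cset shield", so in that case the isolated set would have to be passed in by hand. All function names are illustrative only.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0-3,8,10-11" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_cpu_list(path):
    """Read a CPU list file; return an empty set if it is missing or unreadable."""
    try:
        with open(path) as f:
            return parse_cpu_list(f.read())
    except (OSError, ValueError):
        return set()


def collect_irq_affinity(proc_irq="/proc/irq"):
    """Map each IRQ number to the set of CPUs it is currently allowed on."""
    result = {}
    pattern = os.path.join(proc_irq, "[0-9]*", "smp_affinity_list")
    for path in glob.glob(pattern):
        try:
            irq = int(os.path.basename(os.path.dirname(path)))
        except ValueError:
            continue
        result[irq] = read_cpu_list(path)
    return result


def irqs_on_isolated(irq_affinity, isolated):
    """Return {irq: [isolated CPUs it may run on]} for every offending IRQ."""
    return {
        irq: sorted(cpus & isolated)
        for irq, cpus in sorted(irq_affinity.items())
        if cpus & isolated
    }


if __name__ == "__main__":
    # Assumption: this file lists the CPUs isolated via the boot parameter.
    isolated = read_cpu_list("/sys/devices/system/cpu/isolated")
    for irq, cpus in irqs_on_isolated(collect_irq_affinity(), isolated).items():
        print(f"IRQ {irq} may run on isolated CPUs {cpus}")
```

Running this periodically (for example from cron) and comparing the output over time would show when an IRQ picks up an affinity mask that overlaps the application cores.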

There seem to be other reports of similar issues:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1831566

It looks like some SCSI drivers and virtio_pci_common.c will also call
pci_alloc_irq_vectors_affinity(), though I'm not sure if they would ever do
it after system startup.

How does it make sense for the PCI subsystem to affine interrupts to CPUs
which have explicitly been designated as "isolated"?

This recent thread may be useful:

  https://lore.kernel.org/linux-pci/20200928183529.471328-1-nitesh@xxxxxxxxxx/

It contains a patch to "Limit pci_alloc_irq_vectors() to housekeeping
CPUs".  I'm not sure that patch summary is 100% accurate because IIUC
that particular patch only reduces the *number* of vectors allocated
and does not actually *limit* them to housekeeping CPUs.

Bjorn


Chris,

Are you attempting a tickless run? I’ve seen the NO_HZ_FULL (full dynticks) feature behave somewhat inconsistently when PREEMPT_RT is enabled. Timer tick suppression can at times appear not to be working. I’m curious how you are isolating the cores.
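
One quick way to answer the isolation question is to look at the boot parameters. The following is a small sketch, not something from the original thread: it pulls the isolation-related options out of a kernel command line. The parameter names (isolcpus, nohz_full, rcu_nocbs, irqaffinity) are real kernel options, while the function name and the sample command line in the comments are hypothetical.

```python
# Kernel boot parameters that commonly matter for CPU isolation.
ISOLATION_PARAMS = ("isolcpus", "nohz_full", "rcu_nocbs", "irqaffinity")


def isolation_params(cmdline):
    """Return {parameter: raw value} for isolation-related boot options."""
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in ISOLATION_PARAMS:
            found[key] = value
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            params = isolation_params(f.read())
    except OSError:
        params = {}
    # Hypothetical example of what a command line might contain:
    #   isolcpus=managed_irq,domain,2-7 nohz_full=2-7 rcu_nocbs=2-7
    for name in ISOLATION_PARAMS:
        print(f"{name}: {params.get(name, '(not set)')}")
```

The values are returned unparsed on purpose, since isolcpus can carry flags in front of the CPU list and each option has its own syntax.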

Thanks,

Sean




