On Mon, Oct 12 2020 at 12:07, Keith Busch wrote:
> On Mon, Oct 12, 2020 at 12:58:41PM -0600, Chris Friesen wrote:
>> On 10/12/2020 11:50 AM, Thomas Gleixner wrote:
>> > On Mon, Oct 12 2020 at 11:58, Bjorn Helgaas wrote:
>> > > On Mon, Oct 12, 2020 at 09:49:37AM -0600, Chris Friesen wrote:
>> > > > I've got a Linux system running the RT kernel with threaded irqs. On
>> > > > startup we affine the various irq threads to the housekeeping CPUs, but I
>> > > > recently hit a scenario where after some days of uptime we ended up with a
>> > > > number of NVMe irq threads affined to application cores instead (not good
>> > > > when we're trying to run low-latency applications).
>> >
>> > These threads and the associated interrupt vectors are completely
>> > harmless and fully idle as long as there is nothing on those isolated
>> > CPUs which does disk I/O.
>>
>> Some of the irq threads are affined (by the kernel, presumably) to multiple
>> CPUs (nvme1q2 and nvme0q2 were both affined 0x38000038, a couple of other
>> queues were affined 0x1c00001c0).
>
> That means you have more CPUs than your controller has queues. When that
> happens, some sharing of the queue resources among CPUs is required.
>
>> In this case could disk I/O submitted by one of those CPUs end up
>> interrupting another one?
>
> If you dispatch I/O from any CPU in the mask, then the completion side
> wakes the thread to run on one of the CPUs in the affinity mask.

Pre 4.17, yes.

From 4.17 onwards the irq thread follows the effective affinity of the
hardware interrupt, which is a single CPU target. Since 5.6 the
effective affinity is steered to a housekeeping CPU if the cpumask of a
queue spans multiple CPUs.

Thanks,

        tglx
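
P.S. You can observe this directly on a live system: both masks are
exposed in procfs when CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK is enabled
(it is on x86). Below is a minimal userspace sketch (an illustration
for this thread, not kernel code) which reads the standard procfs files
and prints the configured mask next to the effective single-CPU target
for a given interrupt number:

#include <stdio.h>

int main(int argc, char **argv)
{
	/* Configured mask vs. the effective target which the core code
	 * actually programmed into the hardware. */
	const char *files[] = { "smp_affinity_list", "effective_affinity_list" };
	char path[64], buf[256];
	unsigned int i;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <irq-number>\n", argv[0]);
		return 1;
	}

	for (i = 0; i < 2; i++) {
		FILE *f;

		snprintf(path, sizeof(path), "/proc/irq/%s/%s", argv[1], files[i]);
		f = fopen(path, "r");
		if (!f) {
			perror(path);
			return 1;
		}
		/* The file content already ends with a newline. */
		if (fgets(buf, sizeof(buf), f))
			printf("%-24s %s", files[i], buf);
		fclose(f);
	}
	return 0;
}

On a 4.17 or later kernel you should see smp_affinity_list covering all
CPUs of a shared queue mask while effective_affinity_list names exactly
one of them, and on 5.6 or later booted with isolcpus=managed_irq that
one CPU is a housekeeping CPU.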