On Wed, 4 Apr 2018, Ming Lei wrote:
> On Tue, Apr 03, 2018 at 03:32:21PM +0200, Thomas Gleixner wrote:
> > On Thu, 8 Mar 2018, Ming Lei wrote:
> > > 1) before 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> > > irq 39, cpu list 0
> > > irq 40, cpu list 1
> > > irq 41, cpu list 2
> > > irq 42, cpu list 3
> > >
> > > 2) after 84676c1f21 ("genirq/affinity: assign vectors to all possible CPUs")
> > > irq 39, cpu list 0-2
> > > irq 40, cpu list 3-4,6
> > > irq 41, cpu list 5
> > > irq 42, cpu list 7
> > >
> > > 3) after applying this patch against V4.15+:
> > > irq 39, cpu list 0,4
> > > irq 40, cpu list 1,6
> > > irq 41, cpu list 2,5
> > > irq 42, cpu list 3,7
> >
> > That's more or less window dressing. If the device is already in use
> > when the offline CPUs get hotplugged, then the interrupts still stay
> > on CPUs 0-3, because the effective affinity of interrupts on x86 (and
> > other architectures) is always a single CPU.
> >
> > So this might only move interrupts to the hotplugged CPUs when the
> > device is initialized after CPU hotplug, and only if the actual vector
> > allocation moves an interrupt out to the higher numbered CPUs because
> > they have fewer vectors allocated than the lower numbered ones.
>
> It works for blk-mq devices, such as NVMe.
>
> The NVMe driver now creates num_possible_cpus() hw queues, and each
> hw queue is assigned one MSI-X irq vector.
>
> Storage follows a client/server model, which means an interrupt is only
> delivered to a CPU after an IO request has been submitted to a hw queue
> and completed by that hw queue.
>
> When CPUs are hotplugged, IO will be submitted from those CPUs;
> eventually those IOs complete, the hw queues raise irq events, and the
> submitting CPUs are notified by IRQ.

I'm aware how that hw-queue stuff works. But that only works if the
spreading algorithm makes the interrupts affine to offline/not-present
CPUs when the block device is initialized.

In the example above:

> > > irq 39, cpu list 0,4
> > > irq 40, cpu list 1,6
> > > irq 41, cpu list 2,5
> > > irq 42, cpu list 3,7

and assuming that at driver init time only CPUs 0-3 are online, the
hotplug of CPUs 4-7 will not result in any interrupt being delivered to
CPUs 4-7.

So the extra assignment to CPUs 4-7 in the affinity mask has no effect
whatsoever, and even if the spreading result is 'perfect' it only looks
perfect, as it makes no difference versus the original result:

> > > irq 39, cpu list 0
> > > irq 40, cpu list 1
> > > irq 41, cpu list 2
> > > irq 42, cpu list 3

Thanks,

	tglx
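
For reference, below is a minimal and untested sketch of the driver side
this thread assumes. The foo_probe_irqs() helper and its pre_vectors
value are made up for illustration; pci_alloc_irq_vectors_affinity() and
PCI_IRQ_AFFINITY are the existing interfaces a driver uses to get one
spread MSI-X vector per possible CPU. The point relevant to the
discussion is that the affinity masks are computed once, at allocation
time, by spreading over the possible CPUs, so offline CPUs end up in the
masks while the effective (single CPU) affinity chosen on x86 still sits
on an online CPU.

#include <linux/pci.h>
#include <linux/interrupt.h>
#include <linux/cpumask.h>

/*
 * Hypothetical probe helper: request one MSI-X vector per possible CPU,
 * plus one pre_vector (e.g. an admin queue) that is excluded from the
 * spreading.
 */
static int foo_probe_irqs(struct pci_dev *pdev)
{
	struct irq_affinity affd = {
		.pre_vectors = 1,
	};
	int nr_io = num_possible_cpus();
	int ret;

	/*
	 * The irq core builds the per-vector affinity masks here, at
	 * allocation time, by spreading over the possible CPU mask.
	 * CPUs which are possible but offline are included in the
	 * masks, but the effective affinity on x86 is still one online
	 * CPU per vector.
	 */
	ret = pci_alloc_irq_vectors_affinity(pdev, 2, nr_io + 1,
					     PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
					     &affd);
	return ret < 0 ? ret : 0;
}

The configured versus effective affinity of a given vector can be
compared at runtime via /proc/irq/<N>/smp_affinity_list and
/proc/irq/<N>/effective_affinity_list on architectures that expose the
effective mask (x86 does).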