On 4/7/21 11:18 AM, Nitesh Narayan Lal wrote:
> On 4/6/21 1:22 PM, Jesse Brandeburg wrote:
>> Continuing a thread from a bit ago...
>>
>> Nitesh Narayan Lal wrote:
>>
>>>> After a little more digging, I found out why the cpumask_local_spread change
>>>> affects the general/initial smp_affinity for certain device IRQs.
>>>>
>>>> After the introduction of the commit:
>>>>
>>>>   e2e64a932 genirq: Set initial affinity in irq_set_affinity_hint()
>>>>
>>> Continuing the conversation about the above commit and adding Jesse.
>>> I was trying to understand the problem that the commit message explains:
>>> "The default behavior of the kernel is somewhat undesirable as all
>>> requested interrupts end up on CPU0 after registration." I have also been
>>> trying to reproduce this behavior without the patch, but I failed to do
>>> so, maybe because I am missing something here.
>>>
>>> @Jesse Can you please explain? FWIU, IRQ affinity should be decided based
>>> on the default affinity mask.
>
> Thanks, Jesse, for responding.
>
>> The original issue, as seen, was that if you rmmod/insmod a driver
>> *without* irqbalance running, the default irq mask is -1, which means
>> any CPU. The older kernels (this issue was patched in 2014) used to use
>> that affinity mask, but the value programmed into all the interrupt
>> registers ("actual affinity") would end up delivering all interrupts to
>> CPU0,
>
> So does that mean the affinity mask for the IRQs was different from where
> the IRQs were actually delivered?
> Or was the affinity mask itself for the IRQs changed to 0 instead of -1
> after rmmod/insmod?
>
> I did a quick test on top of 5.12.0-rc6 by comparing the i40e IRQ affinity
> mask before removing the kernel module and after doing rmmod+insmod,
> and didn't find any difference.
>
>> and if the machine was under incoming traffic load when the
>> driver loaded, CPU0 would start to poll among all the different netdev
>> queues, all on CPU0.
>>
>> The above then leads to the condition that the device is stuck polling
>> even if the affinity gets updated from user space, and the polling will
>> continue until traffic stops.
>>
>>> The problem with the commit is that when we overwrite the affinity mask
>>> based on the hinting mask, we completely ignore the default SMP affinity
>>> mask. If we do want to overwrite the affinity based on the hint mask, we
>>> should at least consider the default SMP affinity.
>
> For the issue where the IRQs don't follow the default_smp_affinity mask
> because of this patch, the following are the steps by which it can be easily
> reproduced with the latest Linux kernel:
>
> # Kernel
> 5.12.0-rc6+
>
> # Other parameters in the cmdline
> isolcpus=2-39,44-79 nohz=on nohz_full=2-39,44-79
> rcu_nocbs=2-39,44-79
>
> # cat /proc/irq/default_smp_affinity
> 0000,00000f00,00000003   [corresponds to HK CPUs 0, 1, 40, 41, 42, and 43]
>
> # Create VFs and check the IRQ affinity mask
>
> /proc/irq/1423/iavf-ens1f1v3-TxRx-3
> 3
> /proc/irq/1424/iavf-0000:3b:0b.0:mbx
> 0
> 40
> 42
> /proc/irq/1425/iavf-ens1f1v8-TxRx-0
> 0
> /proc/irq/1426/iavf-ens1f1v8-TxRx-1
> 1
> /proc/irq/1427/iavf-ens1f1v8-TxRx-2
> 2
> /proc/irq/1428/iavf-ens1f1v8-TxRx-3
> 3
> ...
> /proc/irq/1475/iavf-ens1f1v15-TxRx-0
> 0
> /proc/irq/1476/iavf-ens1f1v15-TxRx-1
> 1
> /proc/irq/1477/iavf-ens1f1v15-TxRx-2
> 2
> /proc/irq/1478/iavf-ens1f1v15-TxRx-3
> 3
> /proc/irq/1479/iavf-0000:3b:0a.0:mbx
> 0
> 40
> 42
> ...
> /proc/irq/240/iavf-ens1f1v3-TxRx-0
> 0
> /proc/irq/248/iavf-ens1f1v3-TxRx-1
> 1
> /proc/irq/249/iavf-ens1f1v3-TxRx-2
> 2
>
> Trace dump:
> ----------
> ..
> 11551082: NetworkManager-1734 [040] 8167.465719: vector_activate:
>           irq=1478 is_managed=0 can_reserve=1 reserve=0
> 11551090: NetworkManager-1734 [040] 8167.465720: vector_alloc:
>           irq=1478 vector=65 reserved=1 ret=0
> 11551093: NetworkManager-1734 [040] 8167.465721: vector_update:
>           irq=1478 vector=65 cpu=42 prev_vector=0 prev_cpu=0
> 11551097: NetworkManager-1734 [040] 8167.465721: vector_config:
>           irq=1478 vector=65 cpu=42 apicdest=0x00000200
> 11551357: NetworkManager-1734 [040] 8167.465768: vector_alloc:
>           irq=1478 vector=46 reserved=0 ret=0
> 11551360: NetworkManager-1734 [040] 8167.465769: vector_update:
>           irq=1478 vector=46 cpu=3 prev_vector=65 prev_cpu=42
> 11551364: NetworkManager-1734 [040] 8167.465770: vector_config:
>           irq=1478 vector=46 cpu=3 apicdest=0x00040100
> ..
>
> As we can see in the above trace, the initial affinity for IRQ 1478 was
> correctly set as per the default_smp_affinity mask, which includes CPU 42;
> however, later on it is updated to CPU 3, which is returned from
> cpumask_local_spread().
>
>> Maybe the right thing is to fix which CPUs are passed in as the valid
>> mask, or make sure the kernel cross-checks that what the driver asks
>> for is a "valid CPU"?
>
> Sure, if we can still reproduce the problem that your patch was fixing, then
> maybe we can consider adding a new API like cpumask_local_spread_irq in
> which we should consider the default_smp_affinity mask as well before
> returning the CPU.
>
> Didn't realize that the netdev ML was not included, so adding that.

--
Nitesh