On Fri, 2024-08-02 at 16:22 +0100, Pavel Begunkov wrote:
> > I am definitely interested in running the profiler tools that you
> > are proposing... Most of my problems are resolved...
> >
> > - I got rid of 99.9% of the NET_RX_SOFTIRQ
> > - I have reduced significantly the number of NET_TX_SOFTIRQ
> >   https://github.com/amzn/amzn-drivers/issues/316
> > - No more rcu context switches
> > - CPU2 is now nohz_full all the time
> > - CPU1 local timer interrupt is raised once every 2-3 seconds for an
> >   unknown origin. Paul E. McKenney did offer me his assistance on
> >   this issue
> >   https://lore.kernel.org/rcu/367dc07b740637f2ce0298c8f19f8aec0bdec123.camel@xxxxxxxxxxxxxx/t/#u
>
> And I was just going to propose to ask Paul, but great to
> see you beat me on that

My investigation has progressed... my CPU1 interrupts are nvme block
device interrupts.

I feel that for questions about block device drivers, this time, I am
ringing at the experts' door!

What is the meaning of an nvme interrupt? I am assuming that it signals
the completion of block writes on the device... I am currently looking
in the code to find the answer to this.

Next, it seems to me that there is an odd number of interrupts for the
device:

 63:      12       0       0       0  PCI-MSIX-0000:00:04.0   0-edge  nvme0q0
 64:       0   23336       0       0  PCI-MSIX-0000:00:04.0   1-edge  nvme0q1
 65:       0       0       0   33878  PCI-MSIX-0000:00:04.0   2-edge  nvme0q2

Why 3? Why not 4, one for each CPU? If there were 4, I would have
concluded that the driver creates one queue per CPU...

How are the queues associated with a given request/task? The file I/O
is made by threads running on CPU3, so I find it surprising that
nvme0q1 is chosen...

One noteworthy detail is that the process main thread is on CPU1. In my
flawed mental model of one queue per CPU, there could be some sort of
magical association between a process file descriptor table and the
chosen block device queue, but this idea does not hold... What would
happen to processes running on CPU2?
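
For reference, this is how I am planning to cross-check the
queue-to-CPU mapping from userspace; a minimal sketch, assuming the
namespace shows up as nvme0n1 (the device name is my guess, and the IRQ
numbers 63-65 are simply taken from the /proc/interrupts snippet
above):

  # which CPUs each blk-mq hardware queue accepts submissions from
  grep . /sys/block/nvme0n1/mq/*/cpu_list

  # which CPUs each nvme MSI-X vector is allowed to fire on
  grep . /proc/irq/{63,64,65}/effective_affinity_list

The first should tell me which hardware context a submission from CPU3
lands on, and the second where the corresponding completion interrupt
is delivered; together they should show whether nvme0q1 firing on CPU1
is the expected outcome or not.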