On Wed, 2024-07-31 at 02:00 +0100, Pavel Begunkov wrote:
>
> I forgot to add, ~50 switches/second for relatively brief RCU handling
> is not much, not enough to take 50% of a CPU. I wonder if sqpoll was
> still running but napi busy polling time got accounted to softirq
> because of disabled bh and you didn't include it, hence asking CPU
> stats. Do you see any latency problems for that configuration?
>

Pavel,

I am not sure I will ever discover what exactly this 50% CPU usage
drop was. When I tested

https://lore.kernel.org/io-uring/382791dc97d208d88ee31e5ebb5b661a0453fb79.1722374371.git.olivier@xxxxxxxxxxxxxx/T/#u

from this custom setup:

https://github.com/axboe/liburing/issues/1190#issuecomment-2258632731

the iou-sqp task CPU usage went back to 100%...

My busy_poll config numbers were also inadequate. I went from:

    echo 1000 > /sys/class/net/enp39s0/napi_defer_hard_irqs
    echo 500 > /sys/class/net/enp39s0/gro_flush_timeout

to:

    echo 5000 > /sys/class/net/enp39s0/napi_defer_hard_irqs
    # gro_flush_timeout unit is nanoseconds
    echo 100000 > /sys/class/net/enp39s0/gro_flush_timeout

ksoftirqd is no longer being woken up to service NET softirqs, but I
would think that this might not be the cause either.

I have no more latency issues. After a lot of effort during the last
7 days, my system latency has improved by a good 10usec on average
over what it was last week...

but knowing that it can be even better is stopping me from letting
go...

The sporadic CPU1 interrupt can introduce a 27usec delay, and this is
the difference between a win and a loss...

https://lore.kernel.org/rcu/367dc07b740637f2ce0298c8f19f8aec0bdec123.camel@xxxxxxxxxxxxxx/T/#m5abf9aa02ec7648c615885a6f8ebdebc57935c35

I want to get rid of that interrupt so badly that it is going to be a
great satisfaction when I finally find the cause...
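
For anyone wanting to reproduce the busy_poll settings change described
above, here is a minimal sketch of it as a small shell helper. The
function name set_napi_tuning and the SYSFS_NET override are
hypothetical, added only so the helper can be dry-run against a scratch
directory; the two sysfs file names and the nanosecond unit of
gro_flush_timeout are the ones from the commands above.

```sh
#!/bin/sh
# Sketch only: write the two per-interface NAPI knobs in one step.
# SYSFS_NET is a hypothetical override for dry runs; by default it
# points at the real sysfs location used in the commands above.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# set_napi_tuning IFNAME DEFER_COUNT FLUSH_TIMEOUT_NS (hypothetical helper)
set_napi_tuning() {
    ifname="$1"
    defer="$2"
    flush_ns="$3"
    dir="$SYSFS_NET/$ifname"

    # Refuse to continue if the interface directory does not exist.
    [ -d "$dir" ] || { echo "no such interface: $ifname" >&2; return 1; }

    echo "$defer" > "$dir/napi_defer_hard_irqs" || return 1
    echo "$flush_ns" > "$dir/gro_flush_timeout" || return 1

    # gro_flush_timeout is in nanoseconds; also show it in microseconds.
    echo "$ifname: napi_defer_hard_irqs=$defer gro_flush_timeout=${flush_ns}ns ($((flush_ns / 1000))us)"
}
```

Run as root, `set_napi_tuning enp39s0 5000 100000` would apply the
second set of values quoted above, and `set_napi_tuning enp39s0 1000 500`
would restore the first set for an A/B comparison.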