On Wed, Feb 28, 2024 at 9:35 AM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> On Wed, 28 Feb 2024 07:15:42 -0800 Paul E. McKenney wrote:
> > > > Another complication is that although CONFIG_PREEMPT_RT kernels are
> > > > built with CONFIG_PREEMPT_RCU, the reverse is not always the case.
> > > > And if we are not repolling, don't we have a high probability of doing
> > > > a voluntary context switch when we reach napi_thread_wait() at the
> > > > beginning of that loop?
> > >
> > > Very much so, which is why adding the cost of rcu_softirq_qs()
> > > for every NAPI run feels like an overkill.
> >
> > Would it be better to do the rcu_softirq_qs() only once every 1000 times
> > or some such? Or once every HZ jiffies?
> >
> > Or is there a better way?
>
> Right, we can do that. Yan Zhai, have you measured the performance
> impact / time spent in the call?

For the case that hits the problem, __napi_poll itself is usually
consuming most of the cycles, so I didn't notice any difference in terms
of throughput. And it is in fact repolling all the time, since the
customer traffic might not implement proper backoff. So using a loop
counter or jiffies to cap the number of invocations sounds like a decent
improvement. Let me briefly check the overhead in the normal case, too.

Yan