On 2023-07-04 12:29:33 [+0200], Paolo Abeni wrote:
> Just to hopefully clarify the networking side of it, napi instances !=
> network backlog (used by RPS). The network backlog (RPS) is available
> for all the network devices, including the loopback and all the virtual
> ones.

Yes.

> The napi instances (and the threaded mode) are available only on
> network device drivers implementing the napi model. The loopback driver
> does not implement the napi model, as most virtual devices and even
> some H/W NICs (mostly low end ones).

Yes.

> The network backlog can't run in threaded mode: there is no API/sysctl
> nor infrastructure for that. The backlog processing threaded mode could
> be implemented, even if it should not be completely trivial and it
> sounds a bit weird to me.

Yes, I mean that this needs to be done.

> Just for the records, I mentioned the following in the bz:
>
> It looks like flush_smp_call_function_queue() has only 2 callers,
> migration, and do_idle().
>
> What about moving softirq processing from
> flush_smp_call_function_queue() into cpu_stopper_thread(), outside the
> unpreemptable critical section?

This doesn't solve anything. You schedule the softirq from hardirq and
from this moment on you are in "anonymous context", and we solve this by
processing it in ksoftirqd.
For !RT you process it while leaving the hardirq. For RT, we can't.
Processing it in the context of the currently running process (say idle,
as in the reported backtrace, or another running user task) would lead
to processing network-related work that originated somewhere else at
someone else's expense. Assume you have a high-prio RT task running, not
related to networking at all, and suddenly you throw a bunch of skbs on
it.

Therefore it is preferred to process them within the interrupt thread in
which the softirq was raised, i.e. within its origin.

The other problem with ksoftirqd processing is that everything is added
to a global state and then left for ksoftirqd to process.
The global state is considered by every local_bh_enable() instance, so a
random interrupt thread could process it, or even a random task doing a
syscall involving spin_lock_bh().

The NAPI threads are nice in a way that they don't clobber the global
state.
For RPS we would need either per-CPU threads or serve this in
ksoftirqd/X. An additional per-CPU thread only makes sense if it runs at
a higher priority. However, without the priority it would be no different
from ksoftirqd unless it does only the backlog's work.

Puh. I'm undecided here. We might want to throw it into ksoftirqd and
remove the warning. But then this will be processed together with other
softirqs (like USB, due to tasklets) and at some point might be picked
up by another interrupt thread.

> Cheers,
>
> Paolo

Sebastian