On Fri, 24 Mar 2023 18:35:00 +0100 Felix Fietkau wrote:
> I'm primarily testing this on routers with 2 or 4 CPUs and limited
> processing power, handling routing/NAT. RPS is typically needed to
> properly distribute the load across all available CPUs. When there is
> only a small number of flows that are pushing a lot of traffic, a static
> RPS assignment often leaves some CPUs idle, whereas others become a
> bottleneck by being fully loaded. Threaded NAPI reduces this a bit, but
> CPUs can become bottlenecked and fully loaded by a NAPI thread alone.

The NAPI thread becomes a bottleneck with RPS enabled?

> Making backlog processing threaded helps split up the processing work
> even more and distribute it onto remaining idle CPUs.

You'd want to have both threaded NAPI and threaded backlog enabled?

> It can basically be used to make RPS a bit more dynamic and
> configurable, because you can assign multiple backlog threads to a set
> of CPUs and selectively steer packets from specific devices / rx queues

Can you give an example?

With the 4 CPU example, in case 2 queues are very busy - you're trying
to make sure that the RPS does not end up landing on the same CPU as
the other busy queue?

> to them and allow the scheduler to take care of the rest.

You trust the scheduler much more than I do, I think :)
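
For reference, the static-assignment case described in the quoted text (few heavy flows, some CPUs idle) can be sketched as a toy model. This is only an illustration, not the kernel's code: the CRC32 hash stands in for the real flow hash, and the function names, flow tuples and packet rates are made up for the example.

```python
import zlib
from collections import Counter

def rps_cpu(flow, cpus):
    """Map a flow to a CPU by hashing the flow tuple and indexing into
    the allowed CPU list. CRC32 is a stand-in hash (assumption), not
    the hash the kernel uses."""
    key = "|".join(str(field) for field in flow).encode()
    return cpus[zlib.crc32(key) % len(cpus)]

def cpu_load(flows, cpus):
    """Sum each flow's packet rate onto the CPU that flow hashes to."""
    load = Counter({cpu: 0 for cpu in cpus})
    for flow, pps in flows.items():
        load[rps_cpu(flow, cpus)] += pps
    return load

# Hypothetical 4-CPU router with two heavy flows (rates are invented).
cpus = [0, 1, 2, 3]
flows = {
    ("10.0.0.2", 40000, "192.0.2.1", 443): 400_000,
    ("10.0.0.3", 40001, "192.0.2.1", 443): 350_000,
}

load = cpu_load(flows, cpus)
idle = [cpu for cpu in cpus if load[cpu] == 0]
print(dict(load), "idle CPUs:", idle)
```

With two flows and four CPUs, at least two CPUs get no work regardless of how the hash falls, and if both flows collide, one CPU carries everything. A fixed per-flow mapping cannot spread a single heavy flow's work.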