On 30.03.23 13:01, Paolo Abeni wrote:
> On Tue, 2023-03-28 at 16:16 -0700, Jakub Kicinski wrote:
>> On Tue, 28 Mar 2023 21:59:25 +0200 Felix Fietkau wrote:
>>> When dealing with few flows or an imbalance in CPU utilization, static RPS
>>> CPU assignment can be too inflexible. Add support for enabling threaded NAPI
>>> for backlog processing in order to allow the scheduler to better balance
>>> processing. This helps better spread the load across idle CPUs.
>>
>> Can you share some numbers vs a system where RPS only spreads to
>> the cores which are not running NAPI?
>>
>> IMHO you're putting a lot of faith in the scheduler and you need
>> to show that it actually does what you say it will do.

I will run some more tests as soon as I have time for it.

> I have the same feeling. From your description I think some gain is
> possible if there are no other processes running except
> ksoftirqd/rps/threaded napi.
>
> I guess that the above is the expected average state for a small s/w router,
> but if/when a routing daemon/IGMP proxy/local web server kicks in you
> should notice a measurably higher latency (compared to plain RPS in the
> same scenario)?

Depends on the process priority, I guess.
The main thing I'm trying to fix is the fact that RPS as implemented
right now is too static for devices routing traffic at the CPU capacity limit.
Even if you manage to tune properly for simple ethernet NAT, then adding
WLAN to the mix can easily throw a wrench into the picture as well,
because it's hard to cover different shifting usage patterns with a
simple static assignment.
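
For reference, a minimal sketch of what such a static assignment looks like in
practice, here the "RPS only on the cores not running NAPI" setup asked about
above. It assumes the standard per-queue sysfs file
/sys/class/net/<dev>/queues/rx-<n>/rps_cpus, which takes a hex CPU bitmap in
comma-separated 32-bit words. The helper names, the device name and the CPU
numbers are made up for illustration and are not part of the patch.

```python
from pathlib import Path


def rps_cpu_mask(cpus):
    """Build the hex CPU bitmap string that rps_cpus expects."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError(f"invalid CPU index: {cpu}")
        mask |= 1 << cpu
    # Split into 32-bit words, least significant first, then reverse so the
    # most significant word is printed first.
    words = []
    while True:
        words.append(mask & 0xFFFFFFFF)
        mask >>= 32
        if mask == 0:
            break
    words.reverse()
    head, tail = words[0], words[1:]
    # Lower words are zero-padded to 8 hex digits, as the kernel prints them.
    return ",".join([f"{head:x}"] + [f"{w:08x}" for w in tail])


def rps_cpus_path(dev, rx_queue=0):
    """Path of the per-queue RPS mask file for a device (hypothetical helper)."""
    return Path("/sys/class/net") / dev / "queues" / f"rx-{rx_queue}" / "rps_cpus"


def apply_static_rps(dev, napi_cpu, num_cpus, rx_queue=0):
    """Steer RPS to every CPU except the one running NAPI (needs root)."""
    mask = rps_cpu_mask(c for c in range(num_cpus) if c != napi_cpu)
    rps_cpus_path(dev, rx_queue).write_text(mask)
    return mask
```

On a 4-core box with NAPI on CPU 0, rps_cpu_mask([1, 2, 3]) yields "e". The
mask is fixed once written, so it does not follow the load when the traffic
mix changes.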
- Felix