On Mon, Sep 16, 2019 at 06:49:53PM +0200, Phil Sutter wrote:
> This improves cache population quite a bit and therefore helps when
> dealing with large rulesets. A simple hard to improve use-case is
> listing the last rule in a large chain. These are the average program
> run times depending on number of rules:
>
> rule count | legacy | nft old | nft new
> ---------------------------------------------------------
>  50,000    | .052s  | .611s   | .406s
> 100,000    | .115s  | 2.12s   | 1.24s
> 150,000    | .265s  | 7.63s   | 4.14s
> 200,000    | .411s  | 21.0s   | 10.6s
>
> So while legacy iptables is still magnitudes faster, this simple change
> doubles iptables-nft performance in ideal cases.
>
> Note that increasing the buffer even further didn't improve performance
> anymore, so 32KB seems to be an upper limit in kernel space.

Here are the details for this 32 KB number:

 commit d35c99ff77ecb2eb239731b799386f3b3637a31e
 Author: Eric Dumazet <edumazet@xxxxxxxxxx>
 Date:   Thu Oct 6 04:13:18 2016 +0900

     netlink: do not enter direct reclaim from netlink_dump()

iproute2 is also using 32 KBytes buffer, in case you want to append this
to your commit description before pushing this out.

> Signed-off-by: Phil Sutter <phil@xxxxxx>

Acked-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
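
For reference, the ratios behind the "doubles iptables-nft performance"
remark can be checked with a few lines of arithmetic. This is only a
sketch: the timings are copied from the quoted table, the names TIMINGS
and speedup are made up for illustration, and Python is used purely for
convenience (it is not part of the patch).

```python
# Timings in seconds, copied from the table in the quoted commit message.
# Mapping: rule count -> (legacy, nft old, nft new).
TIMINGS = {
    50_000: (0.052, 0.611, 0.406),
    100_000: (0.115, 2.12, 1.24),
    150_000: (0.265, 7.63, 4.14),
    200_000: (0.411, 21.0, 10.6),
}


def speedup(before: float, after: float) -> float:
    """Return how many times shorter the `after` run time is than `before`."""
    return before / after


if __name__ == "__main__":
    for rules, (legacy, nft_old, nft_new) in TIMINGS.items():
        gain = speedup(nft_old, nft_new)   # nft old vs. nft new
        gap = nft_new / legacy             # remaining distance to legacy
        print(f"{rules:>7,} rules: nft new is {gain:.2f}x faster than nft old, "
              f"still {gap:.1f}x slower than legacy")
```

The gain grows with ruleset size, from roughly 1.5x at 50,000 rules to
just under 2x at 200,000 rules, which matches the "in ideal cases"
wording of the commit message.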