Re: iptables user space performance benchmarks published

Phil Sutter <phil@xxxxxx> · Mon, 22 Jun 2020 16:35:08 +0200

Hi Jan,

On Mon, Jun 22, 2020 at 04:04:43PM +0200, Jan Engelhardt wrote:
> On Friday 2020-06-19 16:11, Phil Sutter wrote:
> >I remember you once asked for the benchmark scripts I used to compare
> >performance of iptables-nft with -legacy in terms of command overhead
> >and caching, as detailed in a blog[1] I wrote about it. I meanwhile
> >managed to polish the scripts a bit and push them into a public repo,
> >accessible here[2]. I'm not sure whether they are useful for regular
> >runs (or even CI) as a single run takes a few hours and parallel use
> >likely kills result precision.
> >
> >[1] https://developers.redhat.com/blog/2020/04/27/optimizing-iptables-nft-large-ruleset-performance-in-user-space/
> >
> >"""My main suspects for why iptables-nft performed so poorly were kernel ruleset
> >caching and the internal conversion from nftables rules in libnftnl data
> >structures to iptables rules in libxtables data structures."""
> 
> Did you record any syscall-induced latency? The classic ABI used a
> one-syscall approach, passing the entire buffer at once. With
> netlink, it's a bit of a ping-pong between user and kernel unless one
> uses mmap like on AF_PACKET — and I don't see any mmap in libmnl or
> libnftnl.

While it is true that no zero-copy mechanisms are used by
libmnl/libnftnl, an early improvement I did was to max out receive
buffer size (see commit 5a0294901db1d which also has some figures).
After all though, I would consider this to be mostly relevant when
loading a large ruleset and that is rather a one-time action, for
instance during system boot-up.

Some "quick changes" like, e.g. adding an IP to a blacklist, usually
don't need to push much data to the kernel for zero-copy to become
relevant. (Of course they may still benefit if setup overhead can be
kept low).

> Furthermore, loading the ruleset is just one aspect. Evaluating it
> for every packet is what should weigh in a lot more. Did you by
> chance collect any numbers in that regard?

Not really. I did some runtime measurements once but unless there's an
undiscovered performance loop I wouldn't expect much to improve there.

Obviously, a much larger factor is ruleset design. I guess most
existing, legacy rulesets out there would largely benefit from
introducing ipset. Duplicating the same crappy ruleset in nftables is
pointless. Making it use nftables' features after the conversion is not
trivial, but results aren't even comparable afterwards. At least that's
my quintessence from trying, see the related blog[1] for details.

Cheers, Phil

[1] https://developers.redhat.com/blog/2017/04/11/benchmarking-nftables/