Hi Jan, On Mon, Jun 22, 2020 at 04:04:43PM +0200, Jan Engelhardt wrote: > On Friday 2020-06-19 16:11, Phil Sutter wrote: > >I remember you once asked for the benchmark scripts I used to compare > >performance of iptables-nft with -legacy in terms of command overhead > >and caching, as detailed in a blog[1] I wrote about it. I meanwhile > >managed to polish the scripts a bit and push them into a public repo, > >accessible here[2]. I'm not sure whether they are useful for regular > >runs (or even CI) as a single run takes a few hours and parallel use > >likely kills result precision. > > > >[1] https://developers.redhat.com/blog/2020/04/27/optimizing-iptables-nft-large-ruleset-performance-in-user-space/ > > > >"""My main suspects for why iptables-nft performed so poorly were kernel ruleset > >caching and the internal conversion from nftables rules in libnftnl data > >structures to iptables rules in libxtables data structures.""" > > Did you record any syscall-induced latency? The classic ABI used a > one-syscall approach, passing the entire buffer at once. With > netlink, it's a bit of a ping-pong between user and kernel unless one > uses mmap like on AF_PACKET — and I don't see any mmap in libmnl or > libnftnl. While it is true that no zero-copy mechanisms are used by libmnl/libnftnl, an early improvement I did was to max out receive buffer size (see commit 5a0294901db1d which also has some figures). After all though, I would consider this to be mostly relevant when loading a large ruleset and that is rather a one-time action, for instance during system boot-up. Some "quick changes" like, e.g. adding an IP to a blacklist, usually don't need to push much data to the kernel for zero-copy to become relevant. (Of course they may still benefit if setup overhead can be kept low). > Furthermore, loading the ruleset is just one aspect. Evaluating it > for every packet is what should weigh in a lot more. Did you by > chance collect any numbers in that regard? Not really. I did some runtime measurements once but unless there's an undiscovered performance loop I wouldn't expect much to improve there. Obviously, a much larger factor is ruleset design. I guess most existing, legacy rulesets out there would largely benefit from introducing ipset. Duplicating the same crappy ruleset in nftables is pointless. Making it use nftables' features after the conversion is not trivial, but results aren't even comparable afterwards. At least that's my quintessence from trying, see the related blog[1] for details. Cheers, Phil [1] https://developers.redhat.com/blog/2017/04/11/benchmarking-nftables/