Hi Florian,

On Fri, Oct 02, 2020 at 12:25:36AM +0200, Florian Westphal wrote:
> Phil Sutter <phil@xxxxxx> wrote:
> > The following two patches improve packet throughput in a test setup
> > sending UDP packets (using iperf3) between two netns. The ruleset used
> > on receiver side is like this:
> >
> > | *filter
> > | :test - [0:0]
> > | -A INPUT -j test
> > | -A INPUT -j ACCEPT
> > | -A test ! -s 10.0.0.0/10 -j DROP # this line repeats 10000 times
> > | COMMIT
> >
> > These are the generated VM instructions for each rule:
> >
> > | [ payload load 4b @ network header + 12 => reg 1 ]
> > | [ bitwise reg 1 = (reg=1 & 0x0000c0ff ) ^ 0x00000000 ]
>
> Not related to this patch, but we should avoid the bitop if the
> netmask is divisble by 8 (can adjust the cmp -- adjusting the
> payload expr is probably not worth it).

See the patch I just sent to this list. I adjusted both - it simply
didn't appear to me that I could get by with reducing the cmp expression
size only. The upside though is that detecting the prefix match based on
payload expression length is quick and easy. Someone will have to adjust
nft tool, though. ;)

> > | [ cmp eq reg 1 0x0000000a ]
> > | [ counter pkts 0 bytes 0 ]
>
> Out of curiosity, does omitting 'counter' help?
>
> nft counter is rather expensive due to bh disable,
> iptables does it once at the evaluation loop only.

I changed the test to create the base ruleset using iptables-nft-restore
just as before, but create the rules in 'test' chain like so:

| nft add rule filter test ip saddr != 10.0.0.0/10 drop

The VM code is as expected:

| [ payload load 4b @ network header + 12 => reg 1 ]
| [ bitwise reg 1 = (reg=1 & 0x0000c0ff ) ^ 0x00000000 ]
| [ cmp eq reg 1 0x0000000a ]
| [ immediate reg 0 drop ]

Performance is ~7000pkt/s. So while it's faster than iptables-nft, it's
still quite a bit slower than legacy iptables despite the skipped
counters.

Cheers, Phil
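
For readers following the bitwise/cmp discussion, here is a rough
userspace sketch (Python, not kernel code) of the two ways a source
prefix match can be evaluated: load four bytes, mask, compare, as in the
VM dump for 10.0.0.0/10; or, when the prefix length is a multiple of 8,
compare only the leading bytes and skip the mask. All function names
below are made up for illustration and do not correspond to anything in
nf_tables or the nft tool.

```python
import ipaddress


def match_with_bitwise(saddr: bytes, net: bytes, mask: bytes) -> bool:
    """Mimics 'payload load 4b' + 'bitwise' + 'cmp eq': mask, then compare."""
    masked = bytes(a & m for a, m in zip(saddr, mask))
    return masked == net


def match_prefix_bytes(saddr: bytes, net: bytes, prefix_len: int) -> bool:
    """Byte-aligned shortcut: compare only the first prefix_len/8 bytes."""
    if prefix_len % 8 != 0:
        raise ValueError("shortcut only valid for byte-aligned prefixes")
    n = prefix_len // 8
    return saddr[:n] == net[:n]


def prefix_match(addr: str, cidr: str) -> bool:
    """Hypothetical helper: pick the cheaper comparison when possible."""
    network = ipaddress.ip_network(cidr, strict=False)
    saddr = ipaddress.ip_address(addr).packed
    net = network.network_address.packed
    if network.prefixlen % 8 == 0:
        # No mask needed: a shorter load and a shorter cmp are enough.
        return match_prefix_bytes(saddr, net, network.prefixlen)
    # /10 and other unaligned prefixes still need the mask step.
    return match_with_bitwise(saddr, net, network.netmask.packed)


if __name__ == "__main__":
    # /10 is not byte-aligned, so this takes the mask-and-compare path.
    print(prefix_match("10.63.1.1", "10.0.0.0/10"))
    # /8 is byte-aligned, so only the first byte is compared.
    print(prefix_match("10.200.3.4", "10.0.0.0/8"))
```

The /10 netmask packs to the bytes ff c0 00 00 in network order, which
is what the 0x0000c0ff operand in the dump above looks like when printed
as a little-endian 32-bit value. The rules in the test negate this match
and drop, so the sketch only covers the comparison itself.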