Re: [Thread split] nftables rule optimization - dropping invalid in ingress?

"William N." <netfilter@xxxxxxxxxx> · Sun, 21 Apr 2024 17:47:26 -0000

On Sat, 20 Apr 2024 20:16:49 +0100 Kerin Millar wrote:

> If using the ingress hook in this way is to make any measurable
> difference to your load average at all, my expectation would be for
> it be observable in the event that you are subjected to a
> concentrated flood of invalid TCP packets.

I was thinking the same.

> You could use hping3 to conduct a series of stress tests.

The question is how to measure the difference correctly. I am not a
network expert. Some articles on firewall performance testing use
iperf3, but in a VM-to-VM setup (which is what I have) its results are
quite inconsistent. Additionally, iperf3 cannot generate the kind of
traffic hping3 does, and I have no idea how to measure anything with
hping3, which is probably the correct test, as it would actually
trigger the rules.

My clumsy attempt to have at least some comparison:

Testing procedure
-----------------

1. Design the ruleset
2. Put the rule to be tested in a separate file and include that file
3. In that file, put the rule on one line, followed by a single
'continue' on the next line. Repeat this pair 1000 times (so that it
has an observable effect)
4. In VM1 (where the rules reside) run:

iperf3 -s -p <port>

5. In VM2 run:

for (( i=0; i<10; i++)); do iperf3 -c 10.137.0.82 -p <port> -V; done

6. Note the speed after each iteration.
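
Steps 2 and 3 can be scripted rather than done by hand. A minimal
sketch, assuming a POSIX shell; the function name gen_repeated_rules,
the output file name and the rule placeholder are illustrative
assumptions, not part of the setup described here:

```shell
# Print the given rule followed by a bare 'continue' line, N times.
# $1 = rule text (supplied by the caller), $2 = repetitions (default 1000).
# The body runs in a subshell so no variables leak into the caller.
gen_repeated_rules() (
    rule=$1
    count=${2:-1000}
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s\ncontinue\n' "$rule"
        i=$((i + 1))
    done
)

# Example (hypothetical file name, rule left as a placeholder):
#   gen_repeated_rules '<rule under test>' 1000 > test-rule.nft
```

The generated file is then the one pulled in by the include from step 2.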

I run it 10 times because, in my observation, the first iteration is
always the fastest and the following ones decline to a point of
"saturation" (my interpretation: some resource exhaustion, e.g. a
memory buffer). That is the inconsistency mentioned above, and
repeating the run is my attempt to work around it. Additionally, I ran
'time ./my-firewall' to compare how long each ruleset takes to load.
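
Noting the speed after each iteration can also be automated. A rough
sketch, assuming the usual human-readable iperf3 3.x summary line that
ends in "receiver" and a single stream (no -P); the helper name
receiver_bitrate is made up for this example:

```shell
# Read iperf3 client output on stdin and print the receiver-side
# bitrate together with its unit, e.g. "3.48 Gbits/sec".
# Assumes the summary line ends with the word "receiver".
receiver_bitrate() {
    awk '/receiver[[:space:]]*$/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /bits\/sec$/) print $(i - 1), $i
    }'
}

# Usage with the loop from step 5 (port left as a placeholder):
#   for i in 1 2 3 4 5 6 7 8 9 10; do
#       iperf3 -c 10.137.0.82 -p <port> | receiver_bitrate
#   done
```

Printing the unit alongside the number guards against iperf3 switching
between Mbits/sec and Gbits/sec from one run to the next.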

Results
-------

# Early drop:

time to load = 23 s
Bitrates (in Gbits/sec):

3.53
2.88
2.92
2.91
2.66
2.80
2.83
2.79
2.83
2.82

# Drop in prerouting (using conntrack and 'invalid'):

time to load = 1.326 s
Bitrates (in Gbits/sec):

4.19
3.48
3.46
3.50
3.53
3.52
3.51
3.48
3.42
3.40

For comparison, the speed without any ruleset (after 'nft flush
ruleset'):

Bitrates (in Gbits/sec):

5.54
4.96
4.66
4.33
4.32
4.32
4.33
4.34
4.31
4.32

Summarized
----------

With the rules repeated 1000 times, early drop reaches ~82% of the
throughput of dropping via conntrack in prerouting (mean of 10 runs:
2.90 vs 3.55 Gbits/sec) and takes roughly 17 times longer to load
(23 s vs 1.326 s).
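
The arithmetic behind these figures can be reproduced from the numbers
listed under Results. A small sketch using awk from a POSIX shell; the
helper names mean and percent_of are illustrative, and averaging all 10
runs (including the faster first one) is an assumption about how the
comparison is made:

```shell
# mean VALUE... : arithmetic mean of the arguments, three decimals
mean() {
    printf '%s\n' "$@" | awk '{ s += $1 } END { printf "%.3f\n", s / NR }'
}

# percent_of A B : A expressed as a percentage of B, one decimal
percent_of() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'
}

# Bitrates in Gbits/sec, copied from the Results section
early=$(mean 3.53 2.88 2.92 2.91 2.66 2.80 2.83 2.79 2.83 2.82)
prerouting=$(mean 4.19 3.48 3.46 3.50 3.53 3.52 3.51 3.48 3.42 3.40)
none=$(mean 5.54 4.96 4.66 4.33 4.32 4.32 4.33 4.34 4.31 4.32)

echo "early drop mean:    $early Gbits/sec"
echo "prerouting mean:    $prerouting Gbits/sec"
echo "no ruleset mean:    $none Gbits/sec"
echo "early / prerouting: $(percent_of "$early" "$prerouting")%"
echo "load time ratio:    $(awk 'BEGIN { printf "%.1f", 23 / 1.326 }')x"
```

This prints means of 2.897, 3.549 and 4.543 Gbits/sec, an
early/prerouting ratio of 81.6% and a load time ratio of 17.3x.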

My interpretation (may be completely wrong, so corrections are welcome)
-----------------

conntrack is faster because it is compiled (binary) code, whereas the
same check written as nftables rules needs additional processing (hence
the much higher load time), which makes it less efficient.

Additional info
---------------

If the rule is set only once (not repeated 1000 times, as in the test),
there is no difference between the 3 cases above: same speed. So,
ceteris paribus, the question comes down to: is there a security
benefit in dropping invalid packets earlier (e.g. right after dropping
fragments and bogons)?

I would highly appreciate the thoughts of the experts here.