Hi folks,

some time ago I stumbled over a very interesting blog post about aggressive firewalling via pf on OpenBSD. TL;DR version: someone tries to connect to an unused port or probes around -> warm welcome to the temporary blacklist.

https://blog.thechases.com/posts/bsd/aggressive-pf-config-for-ssh-protection/

Around fall of last year I took it as a guide and wrote my own aggressive nftables firewall (on Debian), and I have been using it quite successfully ever since.

Here are my scope and the questions I have, before the wall of text rolls in.

In scope:
- support IPv4 & IPv6
- compact and modular configuration files
- up to transport layer
- inbound traffic
- temporarily blacklisting potential attackers (minimizing / avoiding false positives)
- rate limiting (concurrent connections / new connections per minute)
- DoS mitigation
- DDoS mitigation / softening (I know this will not be able to counter large scale attacks)
- Linux kernel parameters
- performance (runs on a VPS)

Questions:
- how to do new connection rate limiting, but spoofing-proof against valid addresses (at the very end of the mail)?
- any glaring issues with my rules, which I just didn't run into yet?
- do all rules tracking an address require me to set up a set?
- should I prevent SYN flooding via nftables, or are activated SYN cookies in the kernel enough?
- should I have net.ipv4.conf.eth0.rp_filter set to strict or loose? (it is deactivated out of the box on my VPS)
- if you have good ideas or valid concerns, let me hear about them, always eager to improve ;)

The firewall consists of /etc/nftables.conf and modules I store in /etc/nftables.d/. Modules make adjusting behavior easier, since I just need to comment out an include. This also allows me to automatically load a configuration without certain modules if there is nothing going on, or with them if the server is currently experiencing DoS attacks (dynamically switching configuration is not in scope of this post, I just wanted to mention my reasoning).
All of those files/folders are "chown root:root" and "chmod 770", since I do not see a reason to allow anyone besides root to read them.

The content of those files can be seen in the attached ZIP. Since I do not know if attachments go through on the mailing list (first time poster), I also have them here on pastebin:

https://pastebin.com/i2xe1Byz

Addresses and the SSH port in the files have been rewritten in this writeup for security/privacy reasons. The numbers used for the amount of connections etc. are of course dependent on the expected server load - treat them as guidelines, not final values.

Before I go into detail about each file, here are my findings after logging packets on a completely new server for 7 days (packets from my home and packets on the lo interface were not tracked).

IPv4: 61,300 | IPv6: 26
TCP: 60,662 | UDP: 382 | ICMP: 282
Port#22: 30,980 | Port#30333: 17,204 | Port#445: 2,605 | Port#23: 225 | Port#8443: 222

As you can see, around half of the packets were sent to the default SSH port... the second most targeted port seems to be something related to crypto.

/etc/nftables.conf

Instead of the default "flush ruleset" I run "add table inet filter {}" followed by "delete table inet filter" (Debian does not support destroy yet). The add does not throw an error when the table already exists (daemon restart), and it makes sure the table exists for the following delete, so the delete does not fail on system startup. This allows tables other than filter to remain in place, only rebuilding the latter. In theory this should allow Docker for example to still work (without restart), even when the nftables daemon is restarted or "nft -f /etc/nftables.conf" is called.

The policy is "accept", but I still call accept rules for performance reasons, since evaluation is concluded early.

There are only 3 rules in the main file, which are always valid: allow traffic on loopback interfaces, as well as established / related connections (server asks for Debian updates), but drop invalid packets.
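For readers without the ZIP at hand, the main file described above could look roughly like the following sketch. It is an illustration under assumptions, not a copy of the attached file: the chain name "input", the hook priority, the plain "add table" form without braces, and the idea that every module adds its own sets and rules to the existing table are all assumed here.

```nft
# Sketch only - names, priority and module layout are assumptions.

# Make sure the table exists, then delete it, so that only
# "inet filter" is rebuilt and other tables (e.g. Docker's) stay in place.
add table inet filter
delete table inet filter

table inet filter {
    chain input {
        type filter hook input priority 0; policy accept;

        # The three always-valid rules from the description above.
        iif "lo" accept
        ct state established,related accept
        ct state invalid drop
    }
}

# Modules are loaded afterwards; commenting out a line disables a module.
# This assumes each module re-opens "table inet filter" itself.
include "/etc/nftables.d/protocol.nft"
include "/etc/nftables.d/white.nft"
include "/etc/nftables.d/black.nft"
include "/etc/nftables.d/ct-count.nft"
include "/etc/nftables.d/limit-rate.nft"
```

The include order mirrors the order in which the modules are described in this mail; since the whitelist has to run before the blacklist, the order matters.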
Everything else is loaded from the modules.

/etc/nftables.d/protocol.nft

Right now I only run TCP services on my server, and ICMP is needed for correct network functionality (had issues with IPv6 otherwise). Things like QUIC should work as well, since the TCP handshake goes through, and the UDP traffic afterwards should be allowed by "ct state established, related accept" in the main file (not tested).

Article about ICMP: http://shouldiblockicmp.com/

/etc/nftables.d/white.nft

Sets containing my address at home and my other servers. Traffic from them is allowed in general - this only goes for TCP traffic since the address is verified via handshake (someone else could send UDP packets with one of my addresses). Running these before the blacklist prevents blocking myself by accident.

/etc/nftables.d/black.nft

Sets containing the public ports in use by me as well as naughty machines. Blacklisting is done in 2 rules: one checks for connections to unused ports, blacklists the source address and then drops the connection, the other one checks if the address is in the naughty set. This temporarily blocks traffic on valid ports as well, once a bad connection attempt has been detected.

At the end there is another include which tries to reload the contents of the blacklist sets (the star in the filename prevents an error if they do not exist). The data for those is prepared via a cron job, which runs a Python3 script (see other files in the ZIP). Losing the contents of the temporary blacklist is not the worst thing in the world, but I prefer to be able to rebuild it after a service or whole server restart.

/etc/nftables.d/ct-count.nft

Contains a set to limit the number of concurrent connections per IP address. It works according to my tests, but I don't quite understand how the set entries are released. Garbage collection seems to be involved for set cleaning, according to the man pages. The entries seem to be deleted automatically once all established connections are terminated.
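To make the ct-count idea concrete without opening the ZIP, here is a minimal sketch of a per-address concurrent connection limit. It is not the attached file: the set name "conn_count4", the set size and the limit of 20 are placeholders, and it assumes a base chain named "input" already exists in "table inet filter". An IPv6 variant would use "type ipv6_addr" and "ip6 saddr" in the same way.

```nft
# Sketch only - set name, size and limit are placeholders.
table inet filter {
    # Dynamic set: elements are added from the packet path,
    # one element per source address.
    set conn_count4 {
        type ipv4_addr
        size 65535
        flags dynamic
    }

    chain input {
        # For every new connection, add the source address to the set
        # and drop the packet if that address already has more than
        # 20 tracked connections.
        ct state new add @conn_count4 { ip saddr ct count over 20 } drop
    }
}
```

Note that no timeout is declared on the set; the per-element state here is the connection count itself.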
To be honest it is a bit confusing how set cleanup happens in some instances (timeout is not even valid in those sets it seems).

/etc/nftables.d/limit-rate.nft

Contains a set to limit the number of new connections per IP per minute. On the surface level it works - but there is an issue in my opinion: someone can send tons of forged packets with spoofed valid IP addresses, causing innocent machines to no longer be able to connect. Right now the rule uses "ct state new" for matching, but this happens at a time when the TCP handshake has not been fully performed yet, so anyone could trigger these rules.

What I probably need is a rule which only matches the first ACK packet sent as response to the SYN-ACK from the server (further ACK packets are sent as keep-alive packets on HTTPS for example). Maybe I need to filter for a specific sequence / ackseq, but I was not successful so far and I am really exhausted after trying unsuccessfully for days. It could also be that I need to write a rule with a lower priority - the ACK packet concluding the handshake is already seen as an established connection it seems. Perhaps someone else has an even better idea.

The key takeaway is that I need to track each new connection per IP address, but only once and only when it is verified. Once the new connections are tracked via rate limit, I can drop new connections already at the SYN step.

Best regards and thanks for your time!
<<attachment: aggressive_nftables.zip>>