On Fri, May 17, 2024 at 01:30:01PM -0400, Laine Stump wrote:
> Support using nftables to set up the firewall for each virtual network,
> rather than iptables. The initial implementation of the nftables
> backend creates (almost) exactly the same ruleset as the iptables
> backend, determined by running the following commands on a host that
> has an active virtual network:
>
>   iptables-save >iptables.txt
>   iptables-restore-translate -f iptables.txt
>
> (and the similar ip6tables-save/ip6tables-restore-translate for an
> IPv6 network). Correctness of the new backend was checked by comparing
> the output of:
>
>   nft list ruleset
>
> when the backend is set to iptables and when it is set to nftables.
>
> This page was used as a guide:
>
>   https://wiki.nftables.org/wiki-nftables/index.php/Moving_from_iptables_to_nftables
>
> The only differences between the rules created by the nftables backend
> vs. the iptables backend (aside from a few inconsequential changes in
> display order of some chains/options) are:
>
> 1) When we add nftables rules, rather than adding them in the
> system-created "filter" and "nat" tables, we add them in a private
> table (i.e. only we should be using it) created by us called "libvirt"
> (the system-created "filter" and "nat" tables can't be used because
> adding any rules to those tables directly with nft will cause failure
> of any legacy application attempting to use iptables when it tries to
> list the iptables rules (e.g. "iptables -S")).
>
> (NB: in nftables only a single table is required for both nat and
> filter rules - the chains for each are differentiated by specifying
> different "hook" locations for the toplevel chain of each)
>
> 2) Since the rules that were added to allow tftp/dns/dhcp traffic from
> the guests to the host are unnecessary in the context of nftables,
> those rules aren't added.
>
> (Longer explanation: In the case of iptables, all rules were in a
> single table, and it was always assumed that there would be some
> "catch-all" REJECT rule added by "someone else" in the case that a
> packet didn't match any specific rules, so libvirt added these
> specific rules to ensure that, no matter what other rules were added
> by any other subsystem, the guests would still have functional
> tftp/dns/dhcp. For nftables though, the rules added by each subsystem
> are in a separate table, and in order for traffic to be accepted, it
> must be accepted by *all* tables, so just adding the specific rules to
> libvirt's table doesn't help anything (as the default for the libvirt
> table is ACCEPT anyway) and it just isn't practical/possible for
> libvirt to find *all* other tables and add rules in all of them to
> make sure the traffic is accepted. libvirt does this for firewalld (it
> creates a "libvirt" zone that allows tftp/dns/dhcp, and adds all
> virtual network bridges to that zone), however, so in that case no
> extra work is required of the sysadmin.)
>
> 3) nftables doesn't support the "checksum mangle" rule (or any
> equivalent functionality) that we have historically added to our
> iptables rules, so the nftables rules we add have nothing related to
> checksum mangling.
>
> (NB: The result of (3) is that if you a) have a very old guest (RHEL5
> era or earlier) and b) that guest is using a virtio-net network
> device, and c) the virtio-net device is using vhost packet processing
> (the default) then DHCP on the guest will fail. You can work around
> this by adding <driver name='qemu'/> to the <interface> XML for the
> guest).
>
> There are certainly much better nftables rulesets that could be used
> instead of those implemented here, and everything is in place to make
> future changes to the rules that are used simple and free of surprises
> (e.g.
> the rules that are added have corresponding "removal" commands
> added to the network status so that we will always remove exactly the
> rules that were previously added rather than trying to remove the
> rules that "the current build of libvirt would have added" (which will
> be incorrect the first time we run a libvirt with a newly modified
> ruleset)). For this initial implementation though, I wanted the
> nftables rules to be as identical to the iptables rules as possible,
> just to make it easier to verify that everything is working.
>
> The backend can be manually chosen using the firewall_backend setting
> in /etc/libvirt/network.conf. libvirtd/virtnetworkd will read this
> setting when it starts; if there is no explicit setting, it will check
> for availability of FIREWALL_BACKEND_DEFAULT_1 and then
> FIREWALL_BACKEND_DEFAULT_2 (which are set at build time in
> meson_options.txt or by adding -Dfirewall_backend_default_n=blah to
> the meson commandline), and use the first backend that is available
> (i.e., that has the necessary programs installed). The standard
> meson_options.txt is set to check for nftables first, and then
> iptables.
>
> Although it should be very safe to change the default backend from
> iptables to nftables, that change is left for a later patch, to show
> how the change in default can be undone if someone really needs to do
> that.
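
The backend selection described above (an explicit firewall_backend
setting wins; otherwise the first build-time default whose programs are
installed is used) can be sketched roughly as follows. This is only an
illustrative Python sketch, not libvirt's C implementation: the names
pick_backend and BACKEND_PROGRAMS, and the program each backend is
assumed to need, are made up for the example, and the sketch does not
check whether an explicitly configured backend is actually usable.

```python
import shutil

# Assumption for illustration: one required program per backend. The
# commit message only says "has the necessary programs installed".
BACKEND_PROGRAMS = {"nftables": "nft", "iptables": "iptables"}


def pick_backend(configured, defaults, which=shutil.which):
    """Return the name of the firewall backend to use, or None.

    configured: value of firewall_backend from network.conf, or None
                when the setting is absent.
    defaults:   backend names in the order the build-time defaults
                (FIREWALL_BACKEND_DEFAULT_1, _2) would be tried.
    which:      lookup function returning a path or None; replaceable
                so the logic can be exercised without touching PATH.
    """
    if configured is not None:
        # An explicit setting takes precedence over the defaults.
        return configured
    for name in defaults:
        # Use the first default whose program can be found.
        if which(BACKEND_PROGRAMS[name]) is not None:
            return name
    return None
```

With no explicit setting, pick_backend(None, ("nftables", "iptables"))
returns "nftables" on a host where nft is found in PATH, and falls back
to "iptables" when only iptables is found.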
>
> Signed-off-by: Laine Stump <laine@xxxxxxxxxx>
> ---
>  meson.build                       |   5 +
>  meson_options.txt                 |   1 +
>  po/POTFILES                       |   1 +
>  src/network/bridge_driver_conf.c  |  11 +-
>  src/network/bridge_driver_linux.c |  17 +-
>  src/network/meson.build           |   1 +
>  src/network/network.conf.in       |  21 +-
>  src/network/network_nftables.c    | 940 ++++++++++++++++++++++++++++++
>  src/network/network_nftables.h    |  28 +
>  src/util/virfirewall.c            | 167 +++++-
>  src/util/virfirewall.h            |   2 +
>  11 files changed, 1188 insertions(+), 6 deletions(-)
>  create mode 100644 src/network/network_nftables.c
>  create mode 100644 src/network/network_nftables.h

Reviewed-by: Daniel P. Berrangé <berrange@xxxxxxxxxx>

And for that matter

Tested-by: Daniel P. Berrangé <berrange@xxxxxxxxxx>

With regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|