On Tue, Apr 23, 2024 at 01:27:05PM -0400, Laine Stump wrote: > On 4/23/24 7:15 AM, Daniel P. Berrangé wrote: > > On Sun, Apr 21, 2024 at 10:53:32PM -0400, Laine Stump wrote: > > > Support using nftables to setup the firewall for each virtual network, > > > rather than iptables. The initial implementation of the nftables > > > backend creates (almost) exactly the same ruleset as the iptables > > > backend, determined by running the following commands on a host that > > > has an active virtual network: > > > > > > iptables-save >iptables.txt > > > iptables-restore-translate -f iptables.txt > > > > > > (and the similar ip6tables-save/ip6tables-restore-translate for an > > > IPv6 network). Correctness of the new backend was checked by comparing > > > the output of: > > > > > > nft list ruleset > > > > > > when the backend is set to iptables and when it is set to nftables. > > > > > > This page was used as a guide: > > > > > > https://wiki.nftables.org/wiki-nftables/index.php/Moving_from_iptables_to_nftables > > > > > > The only differences between the rules created by the nftables backed > > > vs. the iptables backend (aside from a few inconsequential changes in > > > display order of some chains/options) are: > > > > > > 1) When we add nftables rules, rather than adding them in the > > > system-created "filter" and "nat" tables, we add them in a private > > > table (ie only we should be using it) created by us called "libvirt" > > > (the system-created "filter" and "nat" tables can't be used because > > > adding any rules to those tables directly with nft will cause failure > > > of any legacy application attempting to use iptables when it tries to > > > list the iptables rules (e.g. "iptables -S"). > > > > > > (NB: in nftables only a single table is required for both nat and > > > filter rules - the chains for each are differentiated by specifying > > > different "hook" locations for the toplevel chain of each) > > > > > > 2) nftables doesn't support the "checksum mangle" rule (or any > > > equivalent functionality) that we have historically added to our > > > iptables rules, so the nftables rules we add have nothing related to > > > checksum mangling. > > > > > > (NB: The result of (2) is that if you a) have a very old guest (RHEL5 > > > era or earlier) and b) that guest is using a virtio-net network > > > device, and c) the virtio-net device is using vhost packet processing > > > (the default) then DHCP on the guest will fail. You can work around > > > this by adding <driver name='qemu'/> to the <interface> XML for the > > > guest). > > > > > > There are certainly much better nftables rulesets that could be used > > > instead of those implemented here, and everything is in place to make > > > future changes to the rules that are used simple and free of surprises > > > (e.g. the rules that are added have coresponding "removal" commands > > > added to the network status so that we will always remove exactly the > > > rules that were previously added rather than trying to remove the > > > rules that "this build of libvirt would have added" (which will be > > > incorrect the first time we run a libvirt with a newly modified > > > ruleset). For this initial implementation though, I wanted the > > > nftables rules to be as identical to the iptables rules as possible, > > > just to make it easier to verify that everything is working. > > > > > > The backend can be manually chosen using the firewall_backend setting > > > in /etc/libvirt/network.conf. libvirtd/virtnetworkd will read this > > > setting when it starts; if there is no explicit setting, it will look > > > for iptables commands on the host and, if they are found, will select > > > the iptables backend (this is the default for the sake of 100% > > > backward compatibility), but if iptables commands aren't found, then > > > it will use the nftables backend. > > > > > > (Since libvirt will automatically remove all its previous filter rules > > > and re-add rules using the current backend setting for all active > > > networks when it starts up, and the only noticeable change in behavior > > > between the iptables and nftables backends is that noted in item (2) > > > above, we could instead decide to make nftables the default backend > > > rather than iptables - it all depends on how important it is to work > > > properly on 15 year old guest OSes using DHCP with virtio-net > > > interfaces). > > > > > > Signed-off-by: Laine Stump <laine@xxxxxxxxxx> > > > --- > > > po/POTFILES | 1 + > > > src/network/bridge_driver_conf.c | 3 + > > > src/network/bridge_driver_linux.c | 18 +- > > > src/network/meson.build | 1 + > > > src/network/network.conf | 17 +- > > > src/network/network_nftables.c | 940 ++++++++++++++++++++++++++++++ > > > src/network/network_nftables.h | 28 + > > > src/util/virfirewall.c | 169 +++++- > > > src/util/virfirewall.h | 2 + > > > 9 files changed, 1174 insertions(+), 5 deletions(-) > > > create mode 100644 src/network/network_nftables.c > > > create mode 100644 src/network/network_nftables.h > > > > > > > > + if (needRollback) { > > > + virFirewallCmd *rollback = virFirewallAddRollbackCmd(firewall, fwCmd->layer, NULL); > > > + const char *handleStart = NULL; > > > + size_t handleLen = 0; > > > + g_autofree char *handleStr = NULL; > > > + g_autofree char *rollbackStr = NULL; > > > + > > > + /* Search for "# handle n" in stdout of the nft add command - > > > + * that is the handle of the table/rule/chain that will later > > > + * need to be deleted. > > > + */ > > > > What are the uniqueness guarantees of handle numbers. > > Each table has a monotonically increasing counter (I'd assume at least 32 > bits), so it shouldn't be a problem. > > But WAIT!!! - While typing this reply I've discovered something new! > > Until about 45 minutes ago, I had assumed there was a single systemwide > counter. But based on your question, I asked egarver who told me that there > is a counter for each table. And we create our own tables (called "libvirt", > for both ip and ip6). I just tried manually deleting the libvirt table, and > in that case the counter does start over from 0! :-O Oh, that's not terrible at all, as the unique constraint is thus ("libvirt", <handle>) which eliminates any risk of us accidentally deleting stuff belonging to the sysadmin or another app. If someone else creates a table called 'libvirt' they get to keep all the broken pieces :-) > I can't decide if this is a case of "Ooh! We'd better try to protect against > this!", or "Well, you deliberately broke it, so you get to pick up the > pieces!" The latter. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| _______________________________________________ Devel mailing list -- devel@xxxxxxxxxxxxxxxxx To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxx