On Thu, Jan 30, 2025 at 01:20:40PM -0500, Laine Stump wrote:
> On 1/30/25 10:48 AM, Andrea Bolognani wrote:
> > On Wed, Jan 29, 2025 at 12:24:47PM -0500, Laine Stump wrote:
> > > On 1/29/25 8:39 AM, oza.4h07@xxxxxxxxx wrote:
> > > (BTW, if your distro has libvirt 10.4.0 or newer, you can tell it to use
> > > nftables rules rather than iptables - just add:
> > >
> > >   firewall_backend = "nftables"
> > >
> > > to /etc/libvirt/network.conf)
> >
> > Debian 12 doesn't come with a new enough libvirt version anyway, but
> > FYI a few months back I switched the default backend in Debian to
> > nftables (matching Fedora) only to walk back the decision after
> > getting several reports of it breaking software that's just too
> > popular to ignore. See [1] for more details.
> >
> > I don't expect that Debian will be able to move off the iptables
> > backend any time soon, at least when it comes to the default.
> > Changing the backend on a per-system basis is of course totally
> > possible, as long as you understand the caveats.
> >
> >
> > [1] https://bugs.debian.org/1090355
>
> Sigh.
>
> In the days of iptables, there were 3 main tables (filter, nat, and mangle)
> and everybody's rules went into those same 3 tables. Within that single
> table, if a packet reached a REJECT or DROP rule before an ACCEPT rule (or
> the end of the table) then the packet would be dropped, but if it reached an
> ACCEPT rule first, then it would never see the REJECT rule, and be accepted.
>
> But with nftables, there are an infinite number of "base tables", all
> traffic is evaluated against *all* tables *all the way* to either
> accept/reject in *all* cases, and it must get to the end of *every single
> table* without encountering a reject rule in order for the traffic to be
> accepted - there is no "early exit on accept" that skips all the rest of the
> tables if the traffic is accepted by one table.
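To make the difference described above concrete, here is a minimal toy
model in Python. It is only a sketch of the verdict semantics as
described in this thread, not how the kernel or libvirt implements
anything. The rule predicates, the interface names "virbr0" and
"docker0", the table name "libvirt_network" and the helper functions
are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Packet = Dict[str, str]
# A rule is (match predicate, verdict), where verdict is "accept" or "drop".
Rule = Tuple[Callable[[Packet], bool], str]


def eval_chain(rules: List[Rule], packet: Packet, policy: str = "accept") -> str:
    """First matching rule decides; otherwise the chain policy applies."""
    for matches, verdict in rules:
        if matches(packet):
            return verdict
    return policy


def eval_tables(tables: Dict[str, List[Rule]], packet: Packet) -> str:
    """A packet must survive every table; an accept in one table does
    not skip the remaining tables, but a drop anywhere is final."""
    for rules in tables.values():
        if eval_chain(rules, packet) == "drop":
            return "drop"
    return "accept"


# Hypothetical rules: libvirt accepts traffic arriving from its bridge,
# the other tool drops anything that does not come from its own bridge.
libvirt_accept: Rule = (lambda p: p["iif"] == "virbr0", "accept")
other_drop: Rule = (lambda p: p["iif"] != "docker0", "drop")

guest_packet = {"iif": "virbr0"}

# iptables style: one shared chain, libvirt's ACCEPT inserted first.
shared_chain = [libvirt_accept, other_drop]

# nftables style: each tool has its own table.
separate_tables = {"libvirt_network": [libvirt_accept], "filter": [other_drop]}

# Workaround: mirror libvirt's ACCEPT into the other tool's table.
mirrored_tables = {
    "libvirt_network": [libvirt_accept],
    "filter": [libvirt_accept, other_drop],
}

print(eval_chain(shared_chain, guest_packet))      # accept
print(eval_tables(separate_tables, guest_packet))  # drop
print(eval_tables(mirrored_tables, guest_packet))  # accept
```

In this model the same pair of rules accepts the guest packet when both
share one chain, drops it when each tool has its own table, and accepts
it again only once the ACCEPT rule is mirrored into the other table.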
>
> (yeah, I know there's a lot of words enclosed in *..* there)
>
> So the reason that traffic is flowing with libvirt's iptables backend +
> docker/whatever is that libvirt loads its rules *last* and so it can
> override the "other guy's" REJECT iptables rules by inserting its own
> ACCEPT rules at the beginning of the chain - the docker/whatever rules are
> never even encountered.

Also note that this only works if libvirt is very careful with its own
rules to not REJECT traffic that other tools want to allow. We try to
be good in this regard by tightly scoping our rules to subnets/NICs
that we are responsible for.

> But when libvirt uses the nftables backend, it creates its own base level
> table (just as firewalld does). If docker/whatever is still using iptables,
> its iptables rules are converted into nftables rules and added to a table
> named "filter". So now each packet is processed through the libvirt table up
> until it reaches a resolution, and then it's processed through the entire
> filter table until it reaches a resolution - if either of the tables leads
> to a reject result, then the traffic is dropped. The only way around this is
> to add mirrored ACCEPT rules to the "filter" table (i.e. where all the
> iptables-converted-to-nftables rules are located) that match the traffic
> libvirt wants to accept. If we do this, then we're just using iptables
> again, which is what we're trying to *get away from*.
>
> And this isn't a temporary issue caused by docker/whatever remaining on
> iptables - once they modernize and begin using nftables, it will be the same
> situation, except instead of the "other" table being "filter", it will be a
> different table created just for docker/whatever (e.g. called "docker") and
> we'll be back at the same problem.
>
> (Note that this exact same problem will occur if, for example, someone
> installs docker on a system with firewalld or ufw or whatever.
> If you don't see problems in these cases, then it's because one of the two
> packages is adding in extra rules to the other package's table to accept
> the traffic they want accepted)
>
> There is no generic way to fix this problem.

The problem is inherent in using the low level firewall features
directly.

IMHO the "correct" way for everything to play nice is for both
docker and libvirt to exclusively use higher level APIs from
firewalld. firewalld will thus have everything in the same base
tables and all will be good....

...except this means we have to create backends for *every*
distinct firewall tool that might exist. This sucks, but honestly
that's the only general solution to the prioritization problem.

> libvirt can't possibly find
> every possible firewall system and add rules to the table of every single
> one that passes traffic from libvirt guests. I guess the best we can
> theoretically do is make a list of "supported firewall enemies" and add in
> extra stuff just like we currently do with firewalld - 1) attempt to
> autodetect if that enemy package is installed and active and then 2) add
> whatever rules are necessary to the enemy package's table (or the "filter"
> table if the enemy is still using iptables) in order to get our traffic
> through.

Our extra injection of rules to work around our "enemies" is at best
a hack that will break at some point. Directly interacting with each
firewall tool is the only sustainable option from a functional POV.

Direct use of 'nftables' should be only a fallback for scenarios where
we don't support a given FW tool, until such time as native support
is added.

> So what do you consider libvirt could do to make it acceptable to have
> nftables as the default backend on debian? Automatically add rules for the
> current state of what docker and ufw do? Or is there some other slightly
> more obscure package that we also need to compensate for before nftables
> backend is acceptable as default?
> (seriously, let's declare an enumerated list and then (hopefully, time
> permitting) take action on it. I would love to completely eliminate the
> iptables backend if I possibly could, and that certainly can't happen if
> some systems still have it as the default.)

We need native support for 'ufw' and 'firewalld', and our "default"
behaviour would be to auto-detect which backend to prefer.

No single backend will ever work correctly, not even the legacy
'iptables' backend; we're just shuffling which subset of users
is affected by brokenness.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|