On Mon, Oct 21, 2024 at 12:14:38AM -0400, Laine Stump wrote:
> Many long years ago (April 2010), soon after "vhost" in-kernel packet
> processing was added to the virtio-net driver, people running RHEL5
> virtual machines with a virtio-net interface connected via a libvirt
> virtual network noticed that when vhost packet processing was enabled,
> their VMs could no longer get an IP address via DHCP - the guest was
> ignoring the DHCP response packets sent by the host.
>
> The (as danpb calls them) "gory details" of this are chronicled here:
>
> https://lists.isc.org/pipermail/dhcp-hackers/2010-April/001835.html
>
> but basically it was because the checksum of packets wasn't being
> fully computed on the host side (because the host had checksum
> offloading enabled and thought that it would be taken care of later,
> e.g. with NIC hardware), while these packets going from a tap device
> to a virtio-net NIC in a guest wouldn't get that service, and the
> packets would arrive with a "bad checksum".

AFAIR, it isn't actually a bug with virtio-net usage as this last
bit suggests. Rather it is a result of feature negotiation with QEMU
on the host, whereby the guest & QEMU mutually agree to turn off
checksums because they are redundant when the "link" is just local
memory not a physical cable.

IOW, packets don't arrive in the guest with a bad checksum. They
arrive in the guest with no checksum *as requested* by the guest.
The DHCP client decides this is a bad checksum, as it wasn't aware
of the checksum offload usage.

> The "fix" for this ended up being that iptables added a new
> "--checksum-fill" action, and libvirt added an iptables rule for each
> virtual network to match DHCP response packets and perform
> --checksum-fill.
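
For anyone following along, here is a rough sketch of the arithmetic a
checksum-fill style fixup has to perform on a UDP-over-IPv4 datagram,
going by RFC 768. It is for illustration only: the function names are
invented here, and this is not a claim about how iptables or the kernel
actually implement it.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Complemented one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_checksum(src_ip: bytes, dst_ip: bytes, udp_segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus UDP header and payload.

    Expects the checksum field inside udp_segment to be zero already.
    """
    # Pseudo-header: source, destination, zero byte, protocol 17, UDP length.
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, len(udp_segment))
    csum = internet_checksum(pseudo + udp_segment)
    # RFC 768: a computed value of zero is transmitted as all ones,
    # because an all-zero field means "no checksum was generated".
    return csum or 0xFFFF


def fill_checksum(src_ip: bytes, dst_ip: bytes, udp_segment: bytes) -> bytes:
    """Hypothetical helper: return the segment with its checksum filled in."""
    zeroed = udp_segment[:6] + b"\x00\x00" + udp_segment[8:]
    csum = udp_checksum(src_ip, dst_ip, zeroed)
    return zeroed[:6] + struct.pack("!H", csum) + zeroed[8:]
```

A receiver that validates the result sums the pseudo-header and the
whole segment the same way and expects internet_checksum() to return 0.
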
>
> In the meantime, the ISC DHCP package (which contains the dhclient
> program that had been rejecting the bad checksum packets) made a
> separate fix to their dhclient which caused it to accept packets
> anyway even if they didn't have a proper checksum (NB: that's not a
> full explanation, and possibly not accurate). The word at the time
> from those "in the know" was that the bad checksum problem was really
> specific to ISC's dhclient, and so once their fix was in use
> everywhere dhclient was used, the problem would be a thing of the past
> and the checksum fixup iptables rules would no longer be needed (but
> would otherwise be harmless if it was still there).

The fix did indeed work correctly for dhclient.... on Linux!

The fix relied on a Linux-specific sockets API extension, and thus
wasn't applicable to non-Linux codepaths in dhclient AFAICT.

> Based on this information (and also due to the opinion that fixing the
> problem by having iptables modify the packet checksum was the wrong
> way to fix things), the nftables developers made the decision to not
> implement an equivalent to --checksum-fill in nftables. As a result,
> when I wrote the nftables firewall backend for libvirt virtual
> networks, it didn't add in any rule to "fix" broken UDP checksums
> (after all, that was fixed somewhere else 14 years ago, right???)

....and in Fedora/RHEL context it was fixed 18 years ago, as we first
hit this when working on Xen integration in 2006 :-)

> A few quick tests proved that it was the same old "bad checksum"
> problem from 2010 come back to haunt us.

2006 :-)

> After some discussion with Phil Sutter and Eric Garver (nftables
> people), they suggested that, while nftables doesn't have an action
> that will *compute* the checksum of a packet, it *does* have an action
> that will set the checksum to 0, and that maybe we should try
> that.
> Then Phil tried it himself by manually adding such a rule to a
> running system, and verified that it did fix the issue at least for
> FreeBSD guests.
>
> So over the weekend I came up with a patch to add a checksum 0 rule to
> the rules setup for each virtual network. This is that patch.
>
> I have so far verified that this patch enables FreeBSD to receive the
> DHCP response and get an IP address, and that it hasn't *broken* this
> functionality for a random old Fedora image I had (Fedora 27!?!?! I
> really need to update my test images!!). Before pushing it I would
> like to verify that zeroing the checksum of DHCP response packets
> doesn't break any other guest, so I would appreciate the help of
> anyone who could build and install libvirt with this patch and let me
> know of both successes and failures of any guest to acquire an IP
> address with DHCP. Once I've received enough positive reports (and 0
> negative reports!) then we can think about pushing this patch (and
> also backporting it downstream to Fedora 40)

On the one hand it is good that you tested this and found it to work.
What concerns me is a lack of understanding of /why/ it works. AFAICT
there is nothing in the TCP RFC documenting all-zeros as a special
case for indicating absent checksums. I'd really like to know /why/
it works, so we can be confident we're relying on intentional
behaviour, as opposed to a happy accident.

Functionally your patch does what it claims to do, so codewise I'm
happy to say

  Reviewed-by: Daniel P. Berrangé <berrange@xxxxxxxxxx>

but I'd rather not merge it without a deeper understanding.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|