Re: [PATCH v2] network: add rule to nftables backend that zeroes checksum of DHCP responses

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 25, 2024 at 12:18:14AM -0400, Laine Stump wrote:
> Many long years ago (April 2010), soon after "vhost" in-kernel packet
> processing was added to the virtio-net driver, people running RHEL5
> virtual machines with a virtio-net interface connected via a libvirt
> virtual network noticed that when vhost packet processing was enabled,
> their VMs could no longer get an IP address via DHCP - the guest was
> ignoring the DHCP response packets sent by the host.
> 
> (I've been informed by danpb that the same issue had been encountered,
> and "fixed" even earlier than that, in 2006, with Xen as the
> hypervisor.)
> 
> The "gory details" of the 2010 discussion are chronicled here:
> 
>   https://lists.isc.org/pipermail/dhcp-hackers/2010-April/001835.html
> 
> but basically it was because the checksum of packets wasn't being
> fully computed on the host side (because QEMU on the host and the NIC
> driver in the guest had agreed between themselves to turn off
> checksums because they were unnecessary due to the "link" between the
> two being entirely in local memory and not a physical cable), but
> 
> 1) a partial checksum had been put into the packets at some point by
>    "someone"
> 
> 2) the "don't use checksums" info was known by the guest kernel, which
>    would properly ignore the "bad" checksum), and
> 
> 3) the packets were being read by the dhclient application on the
>    guest side with a "raw" socket (thus bypassing the guest kernel UDP
>    processing that would have known the checksum was irrelevant)),
> 
> The "fix" for this ended up being two-tiered:
> 
> 1) The ISC DHCP package (which contains the aforementioned dhclient
> program) made a fix to their dhclient code which caused it to accept
> packets anyway even if they didn't have a proper checksum (NB: that's
> not a full explanation, and possibly not accurate). This fixed the
> problem for guests with an updated dhclient. Here is the code with the
> fix to ISC DHCP:
> 
>   https://github.com/isc-projects/dhcp/blob/master/common/packet.c#L365
> 
> This fixed the issue for any new/updated guests that had the fixed
> dhclient, but it didn't solve the problem for existing/old guest
> images that didn't/couldn't get their dhclient updated. This brings us
> to:
> 
> 2) iptables added a new "CHECKSUM" target and "--checksum-fill"
> action:
> 
>   http://patchwork.ozlabs.org/patch/58525/
> 
> and libvirt added an iptables rule for each virtual network to match
> DHCP response packets and perform --checksum-fill. This way by the
> time dhclient on the guest reads the raw packet, the checksum would be
> corrected, and the packet would be accepted. This was pushed upstream
> in libvirt commit v0.8.2-142-gfd5b15ff1a.
> 
> The word at the time from those more knowledgeable than me was that
> the bad checksum problem was really specific to ISC's dhclient running
> on Linux, and so once their fix was in use everywhere dhclient was
> used, bad checksums would be a thing of the past and the
> --checksum-fill iptables rules would no longer be needed (but would
> otherwise be harmless if they were still there). (Plot twist: the
> dhclient code in fix (1) above apparently is on a Linux-only code path
> - this is very important later!)
> 
> Based on this information (and also due to the opinion that fixing it
> by having iptables modify the packet checksum was really the wrong way
> to permanently fix things, i.e. an "ugly hack"), the nftables
> developers made the decision to not implement an equivalent to
> --checksum-fill in nftables. As a result, when I wrote the nftables
> firewall backend for libvirt virtual networks earlier this year, it
> didn't add in any rule to "fix" broken UDP checksums (since there was
> apparently no equivalent in nftables and, after all, that was fixed
> somewhere else 14 years ago, right???)
> 
> But last week, when Rich Jones was doing routine testing using a Fedora
> 40 host (the first Fedora release to use the nftables backend of libvirt's
> network driver by default) and a FreeBSD guest, for "some strange
> reason", the FreeBSD guest was unable to get an IP address from DHCP!!
> 
>   https://www.spinics.net/linux/fedora/libvirt-users/msg14356.html
> 
> A few quick tests proved that it was the same old "bad checksum"
> problem from 2010 come back to haunt us - it wasn't a Linux-only issue
> after all.
> 
> Phil Sutter and Eric Garver (nftables people) pointed out that, while
> nftables doesn't have an action that will *compute* the checksum of a
> packet, it *does* have an action that will set the checksum to 0, and
> suggested we try adding a "zero the checksum" rule for dhcp response
> packets to our nftables ruleset. (Why? Because a checksum value of 0
> in a IPv4 UDP packet is defined by RFC768 to mean "no checksum
> generated", implying "checksum not needed").  It turns out that this
> works - dhclient properly recognizes that a 0 checksum means "don't
> bother with the checksum", and accepts the packet as valid.
> 
> So to once again fix this timeless bug, this patch adds such a
> checksum zeroing rule to the nftables rules setup for each virtual
> network.
> 
> This has been verified (on a Fedora 40 host) to fix DHCP with FreeBSD
> guests, while not breaking it for Fedora or Windows (10) guests.

You can add OpenBSD to that list, as I tested that too.

> 
> Fixes: b89c4991daa0ee9371f10937fab3b03c5ffdabc6
> Reported-by: Rich Jones <rjones@xxxxxxxxxx>
> Fix-Suggested-by: Eric Garver <egarver@xxxxxxxxxx>
> Fix-Suggested-by: Phil Sutter <psutter@xxxxxxxxxx>
> Signed-off-by: Laine Stump <laine@xxxxxxxxxx>
> Reviewed-by: Daniel P. Berrangé <berrange@xxxxxxxxxx>
> ---
> 
> (NB: jdenemar reminded me this is a bugfix, so we don't need to rush
> to get it pushed before he freezes for the RC1 snapshot. I'm happy
> with it as is though, if anyone is awake early enough and wants to
> push it before he snapshots RC1. Otherwise I'll just push it later and
> it will be in RC2)
> 
> Changes from V1:
> 
> * informational errors/omissions in commit log message fixed
> 
>  src/network/network_nftables.c                | 69 +++++++++++++++++++
>  tests/networkxml2firewalldata/base.nftables   | 14 ++++
>  .../forward-dev-linux.nftables                | 16 +++++
>  .../isolated-linux.nftables                   | 16 +++++
>  .../nat-default-linux.nftables                | 16 +++++
>  .../nat-ipv6-linux.nftables                   | 16 +++++
>  .../nat-ipv6-masquerade-linux.nftables        | 16 +++++
>  .../nat-many-ips-linux.nftables               | 16 +++++
>  .../nat-port-range-ipv6-linux.nftables        | 16 +++++
>  .../nat-port-range-linux.nftables             | 16 +++++
>  .../nat-tftp-linux.nftables                   | 16 +++++
>  .../route-default-linux.nftables              | 16 +++++
>  12 files changed, 243 insertions(+)

Reviewed-by: Daniel P. Berrangé <berrange@xxxxxxxxxx>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]

  Powered by Linux