On Tue, Oct 29, 2024 at 06:03:26AM -0500, Andrea Bolognani wrote: > On Mon, Oct 28, 2024 at 06:07:14PM +0000, Daniel P. Berrangé wrote: > > On Mon, Oct 28, 2024 at 10:32:55AM -0700, Andrea Bolognani wrote: > > > I did some testing of my own and I can confirm that FreeBSD and > > > OpenBSD are fine with this change, as are various Linux flavors > > > (Alpine, CirrOS, Debian, Fedora, openSUSE, Ubuntu). > > > > > > However, a few other operating systems aren't: namely GNU/Hurd, Haiku > > > and NetBSD break with this change. Interestingly, these were all fine > > > with the nftables backend before it. > > > > Well that's odd. I've checked NetBSD source code and found no less > > than 3 DHCP client impls, and all of them cope with checksum == 0. > > > > https://github.com/NetBSD/src/blob/trunk/usr.bin/rump_dhcpclient/net.c#L497 > > > > https://github.com/NetBSD/src/blob/trunk/external/bsd/dhcpcd/dist/src/dhcp.c#L3507 > > > > https://github.com/NetBSD/src/blob/trunk/external/mpl/dhcp/dist/common/packet.c#L373 > > > > the middle impl also directly copes with partial checksums > > The boot log contains > > Starting dhcpcd. > wm0: checksum failure from 192.168.124.1 > > so I guess the second implementation is the relevant one. Yes, we can see that message here: https://github.com/NetBSD/src/blob/trunk/external/bsd/dhcpcd/dist/src/dhcp.c#L3609 if (!checksums_valid(data, &from, bpf_flags)) { logerrx("%s: checksum failure from %s", ifp->name, inet_ntoa(from)); return; } and the 'checksums_valid' method is the one I point to in the 2nd URL, which has the code to check whether the incoming checksum is zeroed: if (udp.uh_sum == 0) return true; so I'm really surprised you see a failure with this. It is as if you don't have Laine's fix present at all. > > > Not identified the Hurd/Haiku DHCP client code yet... > > I'm using Debian GNU/Hurd, so the DHCP client is the same as regular > Debian (ISC DHCP). The source can be found at > > https://deb.debian.org/debian-ports/pool-hurd-i386/main/i/isc-dhcp/ > > The version is a bit old and there's the tiniest amount of patching > compared to the Linux build, specifically: > > --- isc-dhcp-4.4.3-P1-1.1/debian/patches/bind-fix 1970-01-01 > 01:00:00.000000000 +0100 > +++ isc-dhcp-4.4.3-P1-1.1+hurd.1/debian/patches/bind-fix > 2023-02-15 15:39:49.000000000 +0100 > @@ -0,0 +1,26 @@ > +Index: isc-dhcp-4.4.3-P1-build/bind/bind-9.11.36/lib/isc/unix/socket.c > +=================================================================== > +--- isc-dhcp-4.4.3-P1-build.orig/bind/bind-9.11.36/lib/isc/unix/socket.c > ++++ isc-dhcp-4.4.3-P1-build/bind/bind-9.11.36/lib/isc/unix/socket.c > +@@ -2633,7 +2633,7 @@ opensocket(isc__socketmgr_t *manager, is > + char strbuf[ISC_STRERRORSIZE]; > + const char *err = "socket"; > + int tries = 0; > +-#if defined(USE_CMSG) || defined(SO_BSDCOMPAT) || defined(SO_NOSIGPIPE) > ++#if 1 > + int on = 1; > + #endif > + #if defined(SO_RCVBUF) > > I'm not sure whether this could be relevant to the issue at hand. That impl has the explicit check for all-zeros checksum. > To clarify, this is something that needs to be handled at the > userspace level, no kernel changes required? And clearly it affects > DHCP, but what about other protocols? Are we confident those will > cope just fine? It would affect *any* application which is reading raw packets and manually verifying the IP checksum. DHCP is the common case, but there could be others. Historically our iptables rule only ever fixed up DHCP packets and we've not seen other complaints. So if something else is affected in the real world, it is sufficiently rare that the few people affected have not noticed and/or cared enough to escalate it. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|