On Tue, Jul 13, 2010 at 02:17:17AM -0400, Laine Stump wrote: > From: Laine Stump <laine@xxxxxxxxxx> > > (I don't know whether or not we want to commit this upstream yet - the > proposed iptables and kernel module backend for the changes have been > posted but not yet committed upstream. On the other hand, the new > libvirt code ends up simply printing a warning message if the > necessary iptables support isn't yet in place.) Well in general we try to avoid this kind of preventive code change as we're not 100% sure the support will ever make it unchanged in upstream for dependancies. But that should not stop us from reviewing the patches ! > This patch attempts to take advantage of a newly added netfilter > module to correct for a problem with some guest DHCP client > implementations when used in conjunction with a DHCP server run on the > host systems with packet checksum offloading enabled. > > The problem is that, when the guest uses a RAW socket to read the DHCP > response packets, the checksum hasn't yet been fixed by the IP stack, > so it is incorrect. > > The fix implemented here is to add a rule to the POSTROUTING chain of > the mangle table in iptables that fixes up the checksum for packets on > the virtual network's bridge that are destined for the bootpc port (ie > "dhcpc", ie port 68) port on the guest. > > Only very new versions of iptables will have this support (it has been > submitted upstream, but not yet committed), so a failure to add this > rule only results in a warning message. The iptables patch is here: > > http://patchwork.ozlabs.org/patch/58525/ > > A corresponding kernel module patch is also required (the backend of > the iptables patch) and has been submitted, but I don't have the > details for that (I tested using a pre-built image I received from the > developer, Michael Tsirkin). > --- > src/libvirt_private.syms | 2 + > src/network/bridge_driver.c | 17 ++++++++++ > src/util/iptables.c | 71 +++++++++++++++++++++++++++++++++++++++++++ > src/util/iptables.h | 6 ++++ > 4 files changed, 96 insertions(+), 0 deletions(-) > > diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms > index 778ceb1..d81d4cf 100644 > --- a/src/libvirt_private.syms > +++ b/src/libvirt_private.syms > @@ -328,6 +328,7 @@ iptablesAddForwardAllowRelatedIn; > iptablesAddForwardMasquerade; > iptablesAddForwardRejectIn; > iptablesAddForwardRejectOut; > +iptablesAddOutputFixUdpChecksum; > iptablesAddTcpInput; > iptablesAddUdpInput; > iptablesContextFree; > @@ -339,6 +340,7 @@ iptablesRemoveForwardAllowRelatedIn; > iptablesRemoveForwardMasquerade; > iptablesRemoveForwardRejectIn; > iptablesRemoveForwardRejectOut; > +iptablesRemoveOutputFixUdpChecksum; > iptablesRemoveTcpInput; > iptablesRemoveUdpInput; > > diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c > index 72255c1..c4480ff 100644 > --- a/src/network/bridge_driver.c > +++ b/src/network/bridge_driver.c > @@ -781,6 +781,19 @@ networkAddIptablesRules(struct network_driver *driver, > !networkAddRoutingIptablesRules(driver, network)) > goto err8; > > + /* If we are doing local DHCP service on this network, attempt to > + * add a rule that will fixup the checksum of DHCP response > + * packets back to the guests (but report failure without > + * aborting, since not all iptables implementations support it). > + */ > + > + if ((network->def->ipAddress || network->def->nranges) && > + (iptablesAddOutputFixUdpChecksum(driver->iptables, > + network->def->bridge, 68) != 0)) { > + VIR_WARN("Could not add rule to fixup DHCP response checksums " > + "on network '%s'", network->def->name); > + } > + > return 1; > > err8: > @@ -811,6 +824,10 @@ networkAddIptablesRules(struct network_driver *driver, > static void > networkRemoveIptablesRules(struct network_driver *driver, > virNetworkObjPtr network) { > + if (network->def->ipAddress || network->def->nranges) { > + iptablesRemoveOutputFixUdpChecksum(driver->iptables, > + network->def->bridge, 68); > + } hum, there is no return code to check, if yes the block isn't necessary. Also a empty line should be added to separate from next if block, but it's purely cosmetic :-) > if (network->def->forwardType != VIR_NETWORK_FORWARD_NONE) { > if (network->def->forwardType == VIR_NETWORK_FORWARD_NAT) { > iptablesRemoveForwardMasquerade(driver->iptables, > diff --git a/src/util/iptables.c b/src/util/iptables.c > index d06b857..9b888e5 100644 > --- a/src/util/iptables.c > +++ b/src/util/iptables.c > @@ -60,6 +60,7 @@ struct _iptablesContext > iptRules *input_filter; > iptRules *forward_filter; > iptRules *nat_postrouting; > + iptRules *mangle_postrouting; > }; > > static void > @@ -188,6 +189,9 @@ iptablesContextNew(void) > if (!(ctx->nat_postrouting = iptRulesNew("nat", "POSTROUTING"))) > goto error; > > + if (!(ctx->mangle_postrouting = iptRulesNew("mangle", "POSTROUTING"))) > + goto error; > + > return ctx; > > error: > @@ -210,6 +214,8 @@ iptablesContextFree(iptablesContext *ctx) > iptRulesFree(ctx->forward_filter); > if (ctx->nat_postrouting) > iptRulesFree(ctx->nat_postrouting); > + if (ctx->mangle_postrouting) > + iptRulesFree(ctx->mangle_postrouting); > VIR_FREE(ctx); > } > > @@ -753,3 +759,68 @@ iptablesRemoveForwardMasquerade(iptablesContext *ctx, > { > return iptablesForwardMasquerade(ctx, network, physdev, REMOVE); > } > + > + > +static int > +iptablesOutputFixUdpChecksum(iptablesContext *ctx, > + const char *iface, > + int port, > + int action) > +{ > + char portstr[32]; > + > + snprintf(portstr, sizeof(portstr), "%d", port); > + portstr[sizeof(portstr) - 1] = '\0'; > + > + return iptablesAddRemoveRule(ctx->mangle_postrouting, > + action, > + "--out-interface", iface, > + "--protocol", "udp", > + "--destination-port", portstr, > + "--jump", "CHECKSUM", "--checksum-fill", > + NULL); > +} > + > +/** > + * iptablesAddOutputFixUdpChecksum: > + * @ctx: pointer to the IP table context > + * @iface: the interface name > + * @port: the UDP port to match > + * > + * Add an rule to the mangle table's POSTROUTING chain that fixes up the > + * checksum of packets with the given destination @port. > + * the given @iface interface for TCP packets. > + * > + * Returns 0 in case of success or an error code in case of error. > + * (NB: if the system's iptables does not support checksum mangling, > + * this will return an error, which should be ignored.) > + */ > + > +int > +iptablesAddOutputFixUdpChecksum(iptablesContext *ctx, > + const char *iface, > + int port) > +{ > + return iptablesOutputFixUdpChecksum(ctx, iface, port, ADD); > +} > + > +/** > + * iptablesRemoveOutputFixUdpChecksum: > + * @ctx: pointer to the IP table context > + * @iface: the interface name > + * @port: the UDP port of the rule to remove > + * > + * Removes the checksum fixup rule that was previous added with > + * iptablesAddOutputFixUdpChecksum. > + * > + * Returns 0 in case of success or an error code in case of error > + * (again, if iptables doesn't support checksum fixup, this will > + * return an error, which should be ignored) > + */ > +int > +iptablesRemoveOutputFixUdpChecksum(iptablesContext *ctx, > + const char *iface, > + int port) > +{ > + return iptablesOutputFixUdpChecksum(ctx, iface, port, REMOVE); > +} > diff --git a/src/util/iptables.h b/src/util/iptables.h > index 7d55a6d..5c2e553 100644 > --- a/src/util/iptables.h > +++ b/src/util/iptables.h > @@ -89,5 +89,11 @@ int iptablesAddForwardMasquerade (iptablesContext *ctx, > int iptablesRemoveForwardMasquerade (iptablesContext *ctx, > const char *network, > const char *physdev); > +int iptablesAddOutputFixUdpChecksum (iptablesContext *ctx, > + const char *iface, > + int port); > +int iptablesRemoveOutputFixUdpChecksum (iptablesContext *ctx, > + const char *iface, > + int port); > > #endif /* __QEMUD_IPTABLES_H__ */ I appreciate the level of comments :-) Patch looks fine but I would wait a bit until support is upstream, once it is, ACK ! Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@xxxxxxxxxxxx | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/ -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list