There has been a problem for several years with libvirt's default network conflicting with the host network connection on new installs, particularly when the "host" is actually virtual machine that is, itself, connected to the libvirt default network on its respective host. If the two default networks use the same subnet, and if the nested host's libvirtd happens to start its network before the system networking connects to the toplevel host, then network connectivity to the guest is just silently non-working. We've tried several things over the years to eliminate this problem, including: 1) Checking for conflicting routes/interfaces during installation of the libvirt-daemon-config-network package (the one containing the default network config file) which tries different subnets until it finds one that doesn't conflict. This helps in some cases, but fails on 2 points: a) if the installation environment is different from the environment where the system is actually run (e.g. a "live CD" image of a distro, which is built in a container by the distro maintainers, then could be run in any number of places, and b) it modifies the installed files during the rpm installation post_install script, which is now discouraged because people building container images don't like dealing with that. 2) When starting a libvirt network, we now check for any route or interface that conflicts with the new network's IP's and routes. This doesn't fix the problem, but does notify the user of the problem *as long as libvirt starts its networks after the host has started its system networks*. Neither of these help in the situation where libvirt's networks are started first. The script in this patch uses a NetworkManager facility (dispatcher.d scripts, which are run whenever any interface is brought up or down) to check for any libvirt networks that conflict with the new NetworkManager interface, and if it finds a conflict it logs a message and destroys the libvirt network. So as with case (2) above, it doesn't solve the problem, but it does make it more obvious, and also makes sure the host networking isn't borked, so you can still ssh into the machine and fix it. There are a few caveats/misgivings with the script I've come up with, which cause me to post it as just an RFC: 1) It's written in python, and uses libxml2 When I first sat down to make this script (following a suggestion by danpb during an IRC discussion), I immediately put #!/bin/bash at the top of the file, because it's supposed to be a script. But then I remembered that we're trying to narrow down the usage of languages in libvirt, and anyway it would be nice to be able to properly parse the XML to get the IP addresses/netmasks/prefixes. So I instead wrote it in python. This makes for a less ugly program, but it does mean that the daemon-config-network package is now dependent on python3-libxml (and python3 of course), which pulls in a bunch of other packages. Everybody's dev systems already have these packages and their dependencies installed (since both are required to build) and many other users systems already have them (they are required by virt-install and virt-manager, among others), including the Fedora live CD, for example. But not *all* systems have them. There has been a lot of work to reduce package bloat caused by pulling in more and more packages, so I'm reluctant to contribute to that. On the other hand, someone who is looking to minimize their system footprint will also probably not be installing the libvirt default network, so maybe it's acceptable (I'm leaning towards "not", but want to know if anyone else has a different opinion) 2) As delivered, it checks for conflicts with *all* libvirt networks. While that was fun to write, and in theory is the right thing to do, in practice it may be / probably is overkill. As we discussed in IRC, the main time when this is a problem is just after the first boot of a new OS / libvirt installation, and the only libvirt network that's going to exist at that time is the default network. So while having a loop that scans all libvirt networks is more complete, in almost all cases it is just wasting time. (I did provide a bool at the top of the script that can be modified to tell it to only check the default network, but of course the script is installed in /usr/lib, and it's not proper to modify files in /usr/lib, so that isn't a real-world solution, but more a way to allow people (i.e. "me") right now to test out the different behaviors. 3) This only works if NetworkManager is enabled. I brought this up when danpb first mentioned the idea, and he rightly pointed out that every report of this problem we've had has been from the first boot of a new install on a system that used Network Manager - anybody using any other method of networking config on their host has had to manually intervene to get it setup, but NM tries to do everything auto-magically. So while there may still be a very occasional rare occurence of a silent networking failure even with this script installed, this should eliminate 99% of the cases. So, the questions I have are: 1) Do we want to allow adding the dependency on python3-libxml to the libvirt-daemon-config-network package in order to avoid adding a bash script. Or should I just redo it in bash 2) Do we want something that checks all active networks as this does, or just the default network (and if the latter, should we make it easily modifiable to check all networks, or just strip it down as much as possible) 3) I could eliminate the libxml2 dependency by just grepping the xml for "<address .*ip=". Is reduced correctness (which is probably 0% different in practice) worth the reduced package dependencies? As a single question: on a sliding scale of "this script" -----> "simple bash script" where do we want to land? Signed-off-by: Laine Stump <laine@xxxxxxxxxx> --- libvirt.spec.in | 2 + src/network/Makefile.inc.am | 8 +- src/network/nm-dispatcher-check-nets.py | 182 ++++++++++++++++++++++++ 3 files changed, 191 insertions(+), 1 deletion(-) create mode 100755 src/network/nm-dispatcher-check-nets.py diff --git a/libvirt.spec.in b/libvirt.spec.in index 6abf97df85..e40068b2be 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -496,6 +496,7 @@ Requires: libvirt-libs = %{version}-%{release} Requires: dnsmasq >= 2.41 Requires: radvd Requires: iptables +Requires: python3-libxml2 %description daemon-driver-network The network driver plugin for the libvirtd daemon, providing @@ -1630,6 +1631,7 @@ exit 0 %dir %attr(0755, root, root) %{_localstatedir}/lib/libvirt/dnsmasq/ %attr(0755, root, root) %{_libexecdir}/libvirt_leaseshelper %{_libdir}/%{name}/connection-driver/libvirt_driver_network.so +%{_prefix}/lib/NetworkManager/dispatcher.d/50-libvirt-check-nets %if %{with_firewalld_zone} %{_prefix}/lib/firewalld/zones/libvirt.xml diff --git a/src/network/Makefile.inc.am b/src/network/Makefile.inc.am index 196a30e16c..d58e09f88b 100644 --- a/src/network/Makefile.inc.am +++ b/src/network/Makefile.inc.am @@ -170,6 +170,9 @@ install-data-network: ( cd $(DESTDIR)$(confdir)/qemu/networks/autostart && \ rm -f default.xml && \ $(LN_S) ../default.xml default.xml ) + $(MKDIR_P) "$(DESTDIR)$(prefix)/lib/NetworkManager/dispatcher.d" + $(INSTALL_DATA) $(srcdir)/network/nm-dispatcher-check-nets.py \ + $(DESTDIR)$(prefix)/lib/NetworkManager/dispatcher.d/50-libvirt-check-nets if WITH_FIREWALLD_ZONE $(MKDIR_P) "$(DESTDIR)$(prefix)/lib/firewalld/zones" $(INSTALL_DATA) $(srcdir)/network/libvirt.zone \ @@ -189,7 +192,10 @@ endif WITH_FIREWALLD_ZONE endif WITH_NETWORK -EXTRA_DIST += network/default.xml network/libvirt.zone +EXTRA_DIST += \ + network/default.xml \ + network/nm-dispatcher-check-nets.py \ + network/libvirt.zone .PHONY: \ install-data-network \ diff --git a/src/network/nm-dispatcher-check-nets.py b/src/network/nm-dispatcher-check-nets.py new file mode 100755 index 0000000000..9225a08ea1 --- /dev/null +++ b/src/network/nm-dispatcher-check-nets.py @@ -0,0 +1,182 @@ +#!/usr/bin/env python3 +# +# Copyright (C) 2012-2019 Red Hat, Inc. +# +# This library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Lesser General Public +# License as published by the Free Software Foundation; either +# version 2.1 of the License, or (at your option) any later version. +# +# This library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Lesser General Public License for more details. +# +# You should have received a copy of the GNU Lesser General Public +# License along with this library. If not, see +# <http://www.gnu.org/licenses/>. + +import libvirt +import sys +import os +import libxml2 +from ipaddress import ip_network + +# This script should be installed in +# /usr/lib/NetworkManager/dispatcher.d/50-libvirt-check-nets. It will be +# called by NetworkManager every time a network interface is taken up +# or down. When a new network comes up, it checks the libvirt virtual +# networks to see if their IP address(es) (including any static +# routes) are in conflict with the IP address(es) (or static routes) +# of the newly added interface. If so, the libvirt network is +# disabled. It is assumed that the user will notice that their guests +# no longer have network connectvity (and/or the message logged by +# this script), see that the network has been disabled, and then +# realize the conflict when they try to restart it. +# +# set checkDefaultOnly=False to check *all* active libvirt networks +# for conflicts with the new interface. Set to True to check only the +# libvirt default network (since most networks other than the default +# network are added post-install at a time when all of the hosts other +# networks are already active, it may be overkill to check all of the +# libvirt networks for conflict here (and instead just add more +# needless overheard to bringing up a new host interface). +# +checkDefaultOnly = False + +# NB: since this file is installed in /usr/lib, it really shouldn't be +# modified by the user, but instead should be copied to +# /etc/NetworkManager/dispatcher.d, where it will override the copy in +# /usr/lib. Even that isn't a proper solution though - if we're going +# to actually have this config knob, perhaps we should check for it in +# the environment, and if someone wants to modify it they can put a +# short script in /etc that exports and environment variable and then +# calls this script? Just thinking out loud here... + +def checkconflict(conn, netname, hostnets, hostif): + + # ignore if the network has been brought down or removed since we + # got the list + try: + net = conn.networkLookupByName(netname) + except libvirt.libvirtError: + return + + if not net.isActive(): + return + + xml = net.XMLDesc() + doc = libxml2.parseDoc(xml) + ctx = doc.xpathNewContext() + + # see if NetworkManager is informing us that this libvirt network + # itself is coming online + bridge = ctx.xpathEval("/network/bridge/@name"); + if bridge and bridge[0].content == hostif: + return + + # check *all* the addresses of this network + addrs = ctx.xpathEval("/network/*[@address]") + for ip in addrs: + ctx.setContextNode(ip) + address = ctx.xpathEval("@address") + prefix = ctx.xpathEval("@prefix") + netmask = ctx.xpathEval("@netmask") + + if not (address and len(address[0].content)): + continue + + addrstr = address[0].content + if not (prefix and len(prefix[0].content)): + # check for a netmask + if not (netmask and len(netmask[0].content)): + # this element has address, but no prefix or netmask + # probably it is <mac address ...> so we can ignore it + continue + # convert netmask to prefix + prefixstr = str(ip_network("0.0.0.0/%s" % netmask[0].content).prefixlen) + else: + prefixstr = prefix[0].content + + virtnetaddress = ip_network("%s/%s" % (addrstr, prefixstr), strict = False) + # print("network %s address %s" % (netname, str(virtnetaddress))) + for hostnet in hostnets: + if virtnetaddress == hostnet: + # There is a conflict with this libvirt network and the specified + # net, so we need to disable the libvirt network + print("Conflict with host net %s when starting interface %s - bringing down libvirt network '%s'" + % (str(hostnet), hostif, netname)) + try: + net.destroy() + except libvirt.libvirtError: + print("Failed to destroy network %s" % netname) + return + return + + +def addHostNets(hostnets, countenv, addrenv): + + count = os.getenv(countenv) + if not count or count == 0: + return + + for num in range(int(count)): + addrstr = os.getenv("%s_%d" % (addrenv, num)) + if not addrstr or addrstr == "": + continue + + net = ip_network(addrstr.split()[0], strict=False) + if net: + hostnets.append(net) + return + + +############################################################ + +if sys.argv[2] != "up": + sys.exit(0) + +hostif = sys.argv[1] + +try: + conn = libvirt.open(None) +except libvirt.libvirtError: + print('Failed to open connection to the hypervisor') + sys.exit(0) + +if checkDefaultOnly: + nets = [] + net = conn.networkLookupByName("default") + if not (net and net.isActive()): + sys.exit(0) + nets.append(net) +else: + nets = conn.listAllNetworks(libvirt.VIR_CONNECT_LIST_NETWORKS_ACTIVE) + if not nets: + sys.exit(0) + +# We have at least one active network. Build a list of all network +# routes added by the new interface, and compare that list to the list +# of all networks used by each active libvirt network. If any are an +# exact match, then we have a conflict and need to shut down the +# libvirt network to avoid killing host networking. + +# When NetworkManager calls scripts in /etc/NetworkManager/dispatcher.d +# it will have all IP addresses and routes associated with the interface +# that is going up or down in the following environment variables: +# +# IP4_NUM_ADDRESSES - number of IPv4 addresses +# IP4_ADDRESS_N - one variable for each address, starting at _0 +# IP4_NUM_ROUTES - number of IPv5 routes +# IP4_ROUTE_N - one for each route, starting at _0 +# (replace "IP4" with "IP6" and repeat) +# +hostnets = [] +addHostNets(hostnets, "IP4_NUM_ADDRESSES", "IP4_ADDRESS"); +addHostNets(hostnets, "IP4_NUM_ROUTES", "IP4_ROUTE"); +addHostNets(hostnets, "IP6_NUM_ADDRESSES", "IP6_ADDRESS"); +addHostNets(hostnets, "IP6_NUM_ROUTES", "IP6_ROUTE"); + +for net in nets: + + checkconflict(conn, net.name(), hostnets, hostif) -- 2.25.4