On Wed, Feb 24, 2010 at 02:34:53PM +0000, Simon Kelley wrote:
> As the principal maintainer of dnsmasq, I'm seeing increasing reports of
> problems on systems which run both dnsmasq and libvirt. I'm fairly sure
> I understand what's going on in these cases, and I have a few proposals
> for changes in libvirt and dnsmasq that should fix things.

Thanks for starting this topic - it would certainly be nice if we can
come up with a solution that has better inter-operability & fewer
surprises for administrators.

> The problem is that libvirt runs a private instance of dnsmasq: on
> machines which are also running a "system" dnsmasq daemon, this can
> cause problems.
>
> Some background: dnsmasq can run in two modes.
>
> Default mode: dnsmasq binds the wildcard address and does network magic
> to determine which interface request packets actually come from, so that
> the results can be sent back with the correct source address. This has
> the advantage that network interfaces can come and go and change IP
> address and dnsmasq will keep working. It's possible to restrict dnsmasq
> to only reply to requests on some interfaces; requests from other
> interfaces will be read by dnsmasq and then silently dropped. Telling
> dnsmasq to use an interface which doesn't exist but might in the future
> will result in a logged warning, but dnsmasq will still start and when
> the interface comes up it will work.
>
> Bind-interfaces mode: This is the traditional way to do UDP servers. At
> startup dnsmasq enumerates all the extant interfaces and then opens a
> socket for each one, listening on the interface's IP address.
> Interfaces may be skipped if excluded by the --interface and
> --except-interface flags, and any interface specified in --interface
> which doesn't exist at start-up will generate a fatal error.

Yep, I remember we hit that fatal error in libvirt: when we created our
bridge device & then launched dnsmasq, sometimes dnsmasq would exit with
an error because the bridge device wasn't visible in userspace yet.
Thus we use bind-interfaces mode, but instead of using the flag
--interface=virbr0, we switched to --listen-address=IP-of-VIRBR0

I imagine you've already seen this, but as an example of the ARGV that
libvirt generates for its dnsmasq instances:

  /usr/sbin/dnsmasq \
      --strict-order \
      --bind-interfaces \
      --pid-file=/var/run/libvirt/network/default.pid \
      --conf-file= \
      --listen-address 192.168.122.1 \
      --except-interface lo \
      --dhcp-range 192.168.122.2,192.168.122.254 \
      --dhcp-lease-max=253

> In almost all cases, default mode is better: --bind-interfaces is only
> there to cope with old platforms which don't support enough socket
> options to do default mode.
>
> The only time when --bind-interfaces works better is when it's desirable
> to run more than one instance of dnsmasq or have dnsmasq co-exist with
> another DNS server. This is not possible in default mode, but it does
> work in bind-interfaces mode, providing that _all_ instances of dnsmasq
> are in bind-interfaces mode, and that they listen on a disjoint set of
> interfaces.

Yes, for want of any alternative, we currently recommend users with a
system instance of dnsmasq to use bind-interfaces, and either --interface
or --listen-address

  http://wiki.libvirt.org/page/Libvirtd_and_dnsmasq

> Therefore, to allow multiple dnsmasq instances libvirt's private dnsmasq
> instance is started in bind-interfaces mode: that forces one of the
> dnsmasq instances to do bind-interfaces. Many of the Linux distribution
> dnsmasq packages have now implemented an /etc/dnsmasq.d directory where
> configuration fragments can be dropped. Their libvirt packages are
> putting a file there which contains a bind-interfaces command, so that
> the "system" dnsmasq is automatically forced into the same mode, and the
> two can co-exist.
>
> This works, sort-of, but there are some disadvantages. Installing libvirt
> drops the configuration change for the system dnsmasq, but the packages
> frequently don't restart the system daemon, so that things transiently
> fail until everything has rebooted. Much worse, the system dnsmasq is
> forced into bind-interfaces mode and then service to transient
> interfaces (usb, ad-hoc wifi) no longer works, or, because those
> interfaces are mentioned in the dnsmasq configuration, dnsmasq now fails
> at start-up when the interfaces don't exist.

Yes, this is rather a pain. Aside from the scheme you propose later,
there is one other (hacky) way to deal with this - use a udev script to
trigger update + reload of the system dnsmasq's configuration when a USB
NIC device hotplug/unplug occurs. That is clearly just a crude patch over
the already serious problem.

> My proposal is to get rid of the necessity for two dnsmasq instances.
> Libvirt should check for the existence of a "system" dnsmasq and, if the
> system daemon exists, libvirt should drop the required configuration
> into /etc/dnsmasq.d and then restart it. If the system daemon is not
> installed or enabled, libvirt can start a private instance as now.

I'm wondering if there's any way we can arrange things so that we will
always be able to use a system dnsmasq instance, regardless of whether
the host already has it running.

My other concern with writing libvirt configs into /etc/dnsmasq.d is
that users will then get the impression that this is something that
they can freely edit / modify at will. They'll be unhappy when libvirt
overwrites their changes whenever it starts. This could perhaps be
addressed by allowing us to put the configs into /var/lib/dnsmasq/
instead of /etc/dnsmasq.d, which is a more common location for non-user
editable configs generated at runtime.

Your general plan of having a single dnsmasq instance though does sound
desirable, given the way the sockets() APIs work wrt binding to addresses

>
> The difficulty with this scheme is that libvirt needs to create some
> configuration which enables the services it needs on the virtual network
> without disturbing, or being disturbed by, whatever configuration exists
> for the system daemon. That's not currently possible, but it can be made
> possible. I'm assuming that libvirt needs to provide a set of IP
> address / MAC address mappings, and range of IP addresses on a virtual
> network. It needs DHCP and DNS service on the virtual network.

The total set of DNSMASQ args that we currently use is

  --strict-order
  --bind-interfaces
  --domain DOMAIN-NAME  (optional)
  --pid-file=/var/run/libvirt/network/$NETWORK.pid
  --conf-file=
  --listen-address=IPADDR-OF-BRIDGE
  --except-interface=lo
  --dhcp-range=IPRANGE  (optional, multiple times)
  --dhcp-lease-max=RANGE-SIZE  (optional)
  --dhcp-host=STATIC-HOST-MAPPING  (optional)
  --enable-tftp  (optional)
  --tftp-root=/some/path  (optional)
  --dhcp-boot=PXE-BOOT-SERVER  (optional)

NB, we explicitly give a NULL conf-file in order to prevent any of the
user's settings from the system instance from conflicting with libvirt's
settings. We don't really want users to be able to specify arbitrary
other configuration settings for the libvirt dnsmasq instances, other
than those we enable via the libvirt XML configuration.

I've used 'optional' to denote flags we only pass when explicitly
configured via libvirt's XML format. The others we pass all the time.

The 'lease-max' arg we calculate to exactly match the number of
addresses in the configured dhcp-range args. This is because some of our
users had configured dhcp ranges larger than 150 addresses in length.
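
The lease-max calculation and the argument list above can be illustrated with a minimal Python sketch. This is not libvirt's actual code: the helper names (range_size, lease_max, build_argv) are hypothetical, only the always-passed flags and the dhcp-range / dhcp-lease-max pair are modelled, and the ranges are assumed to be inclusive IPv4 start,end pairs.

```python
import ipaddress
from typing import List, Sequence, Tuple

Range = Tuple[str, str]  # (first address, last address), both inclusive


def range_size(start: str, end: str) -> int:
    """Count the addresses in an inclusive IPv4 range."""
    first = int(ipaddress.IPv4Address(start))
    last = int(ipaddress.IPv4Address(end))
    if last < first:
        raise ValueError(f"range end {end} is before start {start}")
    return last - first + 1


def lease_max(ranges: Sequence[Range]) -> int:
    """Total number of addresses across all configured dhcp-range args."""
    return sum(range_size(start, end) for start, end in ranges)


def build_argv(network: str, bridge_ip: str, ranges: Sequence[Range]) -> List[str]:
    """Assemble a dnsmasq command line from the flags listed in the mail.

    Hypothetical helper: the optional domain, dhcp-host, TFTP and
    dhcp-boot flags are left out to keep the sketch short.
    """
    argv = [
        "/usr/sbin/dnsmasq",
        "--strict-order",
        "--bind-interfaces",
        f"--pid-file=/var/run/libvirt/network/{network}.pid",
        "--conf-file=",  # empty on purpose: ignore the system configuration
        f"--listen-address={bridge_ip}",
        "--except-interface=lo",
    ]
    for start, end in ranges:
        argv.append(f"--dhcp-range={start},{end}")
    if ranges:  # DNS-only networks get no DHCP flags at all
        argv.append(f"--dhcp-lease-max={lease_max(ranges)}")
    return argv


if __name__ == "__main__":
    default_ranges = [("192.168.122.2", "192.168.122.254")]
    print(" ".join(build_argv("default", "192.168.122.1", default_ranges)))
```

For the default network's range 192.168.122.2 to 192.168.122.254 this yields a lease-max of 253, matching the example ARGV earlier in this mail.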

> The dhcp-host IP/MAC mappings are a non-problem: they will be ignored
> for any other subnet where the IP addresses don't fit, and any other
> dhcp-hosts in the system configuration will be similarly ignored for
> DHCP on the virtual network subnet.
>
> The dhcp-range is more of a problem. Service to particular networks in
> dnsmasq is controlled by interface=<interface name> lines in the
> configuration. If there are none of these, service is provided to all
> interfaces. If they exist, service is limited to the interfaces
> specified. The existence of any dhcp-range line in dnsmasq's
> configuration enables the DHCP server for any subnet unless explicitly
> limited to particular interfaces. So a default dnsmasq installation,
> (with no interface=<interface>) which provides DNS everywhere but DHCP
> nowhere would be turned into one which provided DHCP on every interface
> by libvirt adding a dhcp-range. Since there wouldn't be a suitable DHCP
> range for most subnets, this would only result in logged errors, but it
> is still not good.
>
> Worse, there's no good answer to the question 'should libvirt include
> "interface=virt0" in the configuration it supplies?' If it does, then the
> "enable DHCP on all interfaces" problem is solved, but a default system
> configuration with no interface declaration is transformed from one
> which provides DNS everywhere to one which provides DNS only to the
> virtual interface. If libvirt doesn't provide "interface=virt0" and the
> system configuration includes interface declarations, then there will be
> no DNS or DHCP service to the virtual network.

Historically we did try using 'interface=virbr0' at one time, but we
suffered from race conditions with creation of our bridge, so we
switched to 'listen-address' instead & assume each host interface has
separately configured IP addresses. What would happen was that we'd
create the bridge device via an ioctl(), then spawn dnsmasq & it'd exit
saying the interface didn't exist.
Adding a sleep(1) after the ioctl() would make it work, so it was
clearly some kernel<->userspace race rather than dnsmasq's problem.

> To solve this, I propose to add an optional interface name to the
> dhcp-range declaration. The semantics of this would be rather odd, but
> solve the problem perfectly.
>
> 1) for DHCP, if any other dhcp-range exists _without_ an interface name,
> then the interface name is ignored and things behave as before,
> otherwise DHCP is only provided to interfaces mentioned in dhcp-range
> declarations.
>
> 2) for DNS, if there are no interface declarations, things work as
> before. If there are interface declarations, the interfaces mentioned in
> dhcp-ranges are added to the set which get DNS service.
>
>
> With these rules, it should be possible for libvirt to drop eg
>
> dhcp-range=interface:virt0,192.168.0.1,192.168.0.240
>
> into the configuration of the system dnsmasq and get DHCP and DNS
> service for virt0, irrespective of any other configuration in the system
> dnsmasq, and doing so shouldn't affect the services supplied elsewhere.

Would this scheme allow libvirt to guarantee that no DHCP is present on
its interface? We currently support running in DNS-only mode, or
DNS+DHCP. It is desirable to keep that regardless of how the host's
system dnsmasq is currently configured for other interfaces.

If I am understanding your suggestion, this allows libvirt to easily
enable DNS+DHCP mode on its own interface, without us accidentally
enabling DHCP on other host interfaces.

If libvirt doesn't use any --dhcp-range flags, there is still a chance
that DHCP could be enabled on libvirt's interface if the system dnsmasq
had any dhcp-range args. Though assuming the IP ranges don't overlap,
this should be effectively a no-op?
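
One possible reading of the two proposed rules can be written down as a small Python model. This is only a sketch of the semantics as quoted above, not dnsmasq code: the function names are hypothetical, each dhcp-range is reduced to its optional interface name (None when untagged), and address/subnet matching is ignored entirely.

```python
from typing import Iterable, Optional, Set


def tagged_interfaces(range_tags: Iterable[Optional[str]]) -> Set[str]:
    """Interfaces named in dhcp-range=interface:<name>,... declarations."""
    return {tag for tag in range_tags if tag is not None}


def dhcp_served(iface: str, interfaces: Set[str],
                range_tags: Iterable[Optional[str]]) -> bool:
    """Rule 1: is DHCP offered on iface?

    interfaces: names from interface=<name> lines (empty set means none).
    range_tags: one entry per dhcp-range line, None if it has no interface.
    """
    tags = list(range_tags)
    if not tags:
        return False  # no dhcp-range at all: the DHCP server stays off
    if any(tag is None for tag in tags):
        # An untagged range exists, so interface names on ranges are
        # ignored and the old behaviour applies.
        return not interfaces or iface in interfaces
    # Only tagged ranges: DHCP is limited to the interfaces they name.
    return iface in tagged_interfaces(tags)


def dns_served(iface: str, interfaces: Set[str],
               range_tags: Iterable[Optional[str]]) -> bool:
    """Rule 2: is DNS offered on iface?"""
    if not interfaces:
        return True  # no interface= lines: DNS everywhere, as before
    return iface in interfaces or iface in tagged_interfaces(range_tags)


if __name__ == "__main__":
    # Default system config plus the proposed tagged range for virt0.
    print(dhcp_served("virt0", set(), ["virt0"]))
    print(dhcp_served("eth0", set(), ["virt0"]))
    # System config with its own untagged dhcp-range, libvirt DNS-only.
    print(dhcp_served("virt0", set(), [None]))
```

Under this reading, the last case is the one the DNS-only question is about: an untagged dhcp-range in the system configuration leaves DHCP nominally enabled on the virtual interface, even though no range may actually match its subnet.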

> The code in libvirt to make this work looks like this:
>
> echo dhcp-range=interface:virt0,<ip range> >>/etc/dnsmasq.d/libvirt
>
> if <system dnsmasq is not installed or not enabled>
>    dnsmasq --interface=virt0\
>       --bind-interfaces --conf-file=/etc/dnsmasq.d/libvirt
> else
>    /etc/init.d/dnsmasq restart
>
> (The --bind-interfaces in the private-dnsmasq instance keeps dnsmasq
> from clashing with other nameservers eg BIND which may be running.)

> The system dnsmasq package has to ensure that /etc/dnsmasq.d is read for
> configuration fragments, and the dnsmasq package and the libvirt package
> will have to co-operate to manage transitions between private and system
> dnsmasq mode caused by package installation or removal.

> Does that make sense? It's a long and involved explanation to come to
> cold. I fear I may have over-simplified what libvirt is doing with
> dnsmasq, in which case please enlighten me and I'll modify my scheme to
> take that into account. If this looks good I can easily have the
> necessary dnsmasq changes in the next release.

I think you've got the general picture of what we're doing with dnsmasq.
At a very high level our original goals were

 - Support multiple independently configured networks (virbr0, virbr1, etc)
 - Isolation between libvirt network interface config & host interface config
 - Only support configuration of options via libvirt network XML format

Overall, libvirt aims to provide a standard representation of
configuration of services regardless of underlying implementation. Thus
the ideal would be that end users would not need to know or care that
libvirt was using dnsmasq as its implementation. Obviously we're failing
here due to the inevitable conflict with the system dnsmasq that
operates in wildcard addressing mode.

Your proposal certainly helps us deal with that conflict in a better
way. My main concern is that it has the potential to significantly
reduce the isolation of configuration between interfaces.
eg, does libvirt's use of the --enable-tftp arg suffer from the same
problem as --dhcp-range, where libvirt setting it for one interface
inadvertently enables it for all others?

This would seem to imply that many other dnsmasq arguments would need
to gain an extra 'interface' parameter to restrict their scope, which
sounds like quite a burden for your code to support?

In essence we're trying to have 1 single dnsmasq process, but at the
same time ensure that everything in the extra
/etc/dnsmasq.d/libvirt-virbr0 file is scoped to a single interface. I
could almost see that file containing 'scope=virbr0' as a short-cut for
saying that every config flag listed there only applies to that one
interface.

NB, I've not said it explicitly, but although the default libvirt config
starts with a single dnsmasq instance attached to the virbr0 interface,
we have the ability to start many dnsmasq instances, each on a different
bridge device.

On a completely unrelated topic, do you have any plans to support IPv6
in dnsmasq in the future? eg things like DHCPv6, listening on IPv6 for
DNS requests, and serving of AAAA records.

Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list