On 9/16/19 10:44 AM, Nikolay Shirokovskiy wrote:
Hi, all I recently discovered that <target dev=''/> input is allowed[*]
So you're saying this previously failed? Catastrophically (SEGV)? Or did it log an error and refuse to start the domain?
by qemu driver's defineXML for <interface> element. On start it will turn into 'tap<N>'. At the same time empty value is not allowed by schema. This name is generated by kernel on linux.
Hmm, yeah I hadn't ever thought about it, but I guess that makes sense, since that's the way the tunctl utility works by default (opens /dev/net/tun, then does a TUNSETIFF ioctl with an empty string as the name of the device), and it get devices name tap<N>.
I wonder how to deal with this. 1. Fix schema to allow empty value for dev attribute. Additionally fix migrating to clear out this autogenerated name. This way we have 2 options to autogenerate dev name. First as described above. Second is by ommiting target element. The naming will be 'vnet<N>' in the latter case. Having these 2 options looks redundant.
Yeah, this option is bad/undesirable.
2. Don't allow to start domains with such configuration. We can not fail definition because disappearing VMs on libvirt update looks too unpleasant.
1) (Nevermind - you answered below) How long has this worked? If it's *very* new, and defining a domain with <target dev=''/> used to fail, then it's actually a *good* thing to fail immediately.
2) If you put the check for present-but-empty target dev in virDomainNetDefValidate(), then the validation will be done any time a new device is parsed, but specifically *won't* be done during the domain parse when libvirtd starts up. This avoids the "disappearing domains" problem you mention (and is specifically why the separate validation functions were added - there is one hypervisor-agnostic function for each type of device, and a separate function for the entire domain (all in conf/domain_conf.c), as well as similar hypervisor-specific validation functions in qemu/qemu_domain.c (look for the functions below qemuDomainDeviceDefValidate(), e.g. the function for network interfaces is qemuDomainDeviceDefValidateNetwork(); no, don't ask me why the name uses a different pattern than the hypervisor-agnostic functions!).
(You'll want to add this validation in the hypervisor-agnostic function, BTW.)
This way we don't have two different set of autogenerated names. This can even go unnoticed given we fail such starts anyway after [1] (which is fixed by [*]).
So you're saying that <target dev=''/> has worked since "forever" (as far as you know), until it stopped working in 4.6.0, and the patch you posted this morning fixes it, right?
I vote for putting a check in virDomainNetDefValidate() that errors out if it finds target dev present but empty. There may be some who would say "it's existing behavior and has been like this for a long time, so we have to preserve it", but since the schema doesn't allow it and we've never documented it, I think it's reasonable to disallow it (except for, as you say, existing domain definitions at reboot time; NB: there will still be an error if a domain containing this now-illegal setting is ever edited).
[1] is commited on 2018 Jul 23 and released in 4.6.0. However Centos 7 is based on 4.5.0 and virNetDevTapCreate has logic to copy generated tap name from kernel like since the beginning (I managed to follow the history to 2011).
Yes, we rely on it to fill in the numbers of the vnet<N> devices. That way we let the kernel prevent duplicate device names rather than needing to (attempt to) prevent races between threads, as we have to do with macvtap devices (since macvtap doesn't have this capability).
Nikolay [*] at least with recent "virStrncpy: fix to successfully copy empty string" applied [1] 7d70a63b util: Improve virStrncpy() implementation
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list