On 07/25/2018 09:51 AM, Ján Tomko wrote: > On Wed, Jul 25, 2018 at 08:39:26AM -0400, John Ferlan wrote: >> >> >> On 07/25/2018 06:40 AM, Michal Privoznik wrote: >>> On 07/16/2018 11:14 PM, John Ferlan wrote: >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1559867 >>>> >>>> When attaching a device to the domain we need to be sure >>>> to use the correct domain definition (vm->def or vm->newDef) >>>> when calling virDomainDeviceDefParse because the post parse >>>> processing algorithms that may assign an address for the >>>> device will use whatever domain definition was passed in. >>>> > > This effectively reduces AttachDevice(AFFECT_LIVE | AFFECT_CONFIG) to > two subsequent AttachDevice calls, thus possibly attaching two > different devices, making the AFFECT_CONFIG option pointless. > Ironically, the v2 response you gave was: " IMO this is a step into the right direction - rather than tuning up the device to be compatible with the live definition and hoping it will work in the persistent definition is just naive. " Still what I did doesn't change the AttachDevice calls. Prior to my changes and after my changes there is a qemuDomainAttachDeviceConfig for CONFIG and a qemuDomainAttachDeviceLive for LIVE. What the change does is separate the virDomainDeviceDefParse calls such that the "config" one is made with the persistent def rather than the live def for both. Thus if some previous change updates persistent, then the next change to just config doesn't miss that and perhaps duplicate something not provided - in this case the <address... unit=0> was duplicated which results in subsequent start failure. So looking at the old and new code side by side - how exactly is new code pointless? What subtle point am I missing? > E.g. when hotplugging a network interface, it might end up both on a > different PCI slot and with a different MAC address, which I'm afraid > might break some use cases. > An example would have greatly helped... So, I assume you mean this: # cat n1.xml <interface type='network'> <source network='default'/> <model type='virtio'/> </interface> # virsh attach-device $dom n1.xml --live --config Device attached successfully # virsh dumpxml $dom | grep interface -A 10 ... <interface type='network'> <mac address='52:54:00:8a:bb:ea'/> <source network='default' bridge='virbr0'/> <target dev='vnet1'/> <model type='virtio'/> <alias name='net1'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0d' function='0x0'/> </interface> ... # virsh dumpxml $dom --inactive | grep interface -A 10 ... <interface type='network'> <mac address='52:54:00:be:29:3c'/> <source network='default'/> <model type='virtio'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0d' function='0x0'/> </interface> ... whereas, previously the same MAC address would have been generated for both live and config based on the current live def. Is that it? One other consideration. Prior to my changes. Assume the same n1.xml, then attach-device --config only, check out the resulting address using slot=0xd. Now use the --config --live - for the config guest you'll see that some MAC gets added at slot=0x0e while the same MAC gets added to slot=0x0d for the live guest. So I would contend this is no different than my changes other than with the changes the next guest wouldn't have the duplicated MAC from the live guest so the MAC being at a different address shouldn't matter. So the downside to libvirt generating a different MAC for live and config ends up being is what? The guest will boot the next time, but with a different MAC for the network device. If though a consumer doesn't provide one, how much of a problem is that compared to not being able to boot the guest because the <address ... unit=#> conflict for disks. This is certainly one of those "competing interests" type problems for generated data. I really don't know the "perfect answer" other than telling customers to provide the specific things they want to avoid these conflicts. One could argue that if one wants a predictable MAC, then they should provide it. Similar argument for the <address>. The one difference between the two for me being it's not possible to boot the machine with address conflicts; whereas, a change in an unprovided MAC is kind of a WYSIWYG type problem. >>>> Additionally, some devices (SCSI hostdev and SCSI disk) use >>>> algorithms that rely on knowing what already exists of the >>>> other type when generating the new device's address. Using >>>> the wrong VM definition could result in duplicated addresses. >>>> > > If suitablity for both live and persistent definition is a problem, > the address allocation code should take both domain definitions into > account and generate a single address for both device copies. > Could be a real challenge to do both - the algorithms aren't exactly "simple" now, especially when one has to take into account SCSI disks and SCSI hostdevs "share" the same bus. Not sure it'll work, but perhaps a different option - don't assign an address to a device that's being cold plugged while the domain is active. Although there's enough different types of addresses and devices that I'm not sure that's feasible for each type. >>>> In the case of the bz, two hostdev's with no domain address >>>> provided were added to the running domain's config only. >>>> However, the parsing algorithm used the live domain in >>>> order to figure out the host device address resulting in >>>> the same address being used and a subsequent start failing >>>> due to duplicate address. >>>> >>>> Fix this by separating the checks/code into CONFIG and LIVE >>>> processing using the correct definition for each block and >>>> performing cleanup for both options as necessary. >>>> >>>> Signed-off-by: John Ferlan <jferlan@xxxxxxxxxx> >>>> --- >>>> src/qemu/qemu_driver.c | 52 +++++++++++++++++++----------------------- >>>> 1 file changed, 23 insertions(+), 29 deletions(-) > > As I said in my review of v2 I still consider this a waste of energy and > we might possibly end up having to revert this change because it breaks > someone's use case. > > Jano For reference, the v2 review: https://www.redhat.com/archives/libvir-list/2018-June/msg01012.html And there were differing opinions related to whether it was a waste of energy. In the long run, IDC if this patch is reverted and the bz is closed as impossible or won't fix or whatever magic wording needs to be used if that's what you prefer/desire. John -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list