On 02/12/2015 04:17 AM, Zhang Bo wrote: > If we assign a VF, which has already been used by an active guest, to another > guest, and try to start the 2nd guest later on, the 2nd guest would not > start, and the VF won't work anymore. > > Steps to reproduce the problem: > 1 Assign a VF to guest A, and start the guest. The VF works fine. > 2 Assign the VF to guest B, and try to start guest B. guest B can't start. > 3 Guest A's network becomes unreachable, because its VF now doesn't work. > > Reasons for this problem is: > 1 When we start guest B, libvirtd checks whether the VF is already used > by another guest, if so, qemuPrepareHostDevices() returns with failure. > 2 Then, libvirtd calls qemuProcessStop() to cleanup resourses, which would > restore the VFs of guest B. Specifically, it reads > /var/run/libvirt/hostdevmgr/ethX_vfX to get the VF's original MAC/VLAN, > and set it back to current VF. > 3 As that the VF is still in use by guest A, libvirtd just set its MAC/VLAN > to another value, the VF doesn't work anymore. > > Detailed flow: > qemuProcessStart > \___qemuPrepareHostDevices(if it fails, goto cleanup) > \ \_qemuPrepareHostdevPCIDevices > \ \_____virHostdevPreparePCIDevices > \ \____LOOP1:virPCIDeviceListFind > \ (whether the device is in use by another active guest) > \ if the VF has been assigned to, qemuPrepareHostDevices() fails > \___cleanup: > \__qemuProcessStop > \____qemuDomainReAttachHostDevices > \____qemuDomainReAttachHostdevDevices > \____virHostdevReAttachPCIDevices > \_____virHostdevNetConfigRestore > (it gets MAC/VLAN form /var/run/libvirt/hostdevmgr/ethX_vfX, > and set it back to the VF, making the VF unusable) > > This patch checks whether the VF is already in use before restoring it. > > Signed-off-by: Zhang Bo <oscar.zhangbo@xxxxxxxxxx> > Signed-off-by: Zhuang Yanying <zhuangyanying@xxxxxxxxxx> > --- > src/util/virhostdev.c | 14 +++++++++++++- > 1 file changed, 13 insertions(+), 1 deletion(-) > > diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c > index 9678e2b..ee19400 100644 > --- a/src/util/virhostdev.c > +++ b/src/util/virhostdev.c > @@ -816,9 +816,21 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, > * For SRIOV net host devices, unset mac and port profile before > * reset and reattach device > */ > - for (i = 0; i < nhostdevs; i++) > + for (i = 0; i < nhostdevs; i++){ > + virPCIDevicePtr devNotInuse = NULL; > + virDevicePCIAddressPtr addr = NULL; > + virDomainHostdevDefPtr hostdev = hostdevs[i]; > + addr = &hostdev->source.subsys.u.pci.addr; > + devNotInuse = virPCIDeviceListFindByIDs(pcidevs, > + addr->domain, addr->bus, > + addr->slot, addr->function); > + if (!devNotInuse) { > + continue; > + } This may work, but it's not the proper fix. Rather than checking to be sure the device is not in use by any domain, it should be checking to be sure that it *is* in use by this domain (and that should be set somehow/somewhere as the devices are being prepared prior to starting the domain, and *reset* during this function). The way it is now, if there is a device in the domain that we never even reached during qemuProcessStart() it will still get reattached (and all other anti-setup related with the device) even though it isn't necessary (and could be the wrong thing to do). Unfortunately I only came into the hostdev stuff somewhat after the fact, haven't looked at any of it in a long time, and it's all been moved into a hypervisor-agnostic library since then, so I can't offhand say the best way to do this (whether to mark the device in the hostdevmgr with the name of the domain, or to mark the device object in the domaindef). I'd need to spend more time reacquainting myself with the code first. > + > virHostdevNetConfigRestore(hostdevs[i], hostdev_mgr->stateDir, > oldStateDir); > + } > > for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { > virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list