On Tue, 2016-03-22 at 07:45 -0400, John Ferlan wrote: > On 03/17/2016 01:24 PM, Andrea Bolognani wrote: > > > > Unmanaged devices are attached to guests in two steps: first, > > the device is detached from the host and marked as inactive; > > subsequently, it is marked as active and attached to the guest. > > > > If the daemon is restarted between these two operations, we lose > > track of the inactive device. > > > > Steps 5 and 6 of virHostdevPreparePCIDevices() already subtly > > take care of this situation, but some planned changes will make > > it so that's no longer the case. Plus, explicit is always better > > than implicit. > > --- > > src/util/virhostdev.c | 47 ++++++++++++++++++++++++++++++++++++++++++++--- > > 1 file changed, 44 insertions(+), 3 deletions(-) > > > > diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c > > index e0d6465..7204bd7 100644 > > --- a/src/util/virhostdev.c > > +++ b/src/util/virhostdev.c > > @@ -560,7 +560,8 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, > > } > > } > > > > - /* Step 2: detach managed devices (i.e. bind to appropriate stub driver) */ > > + /* Step 2: detach managed devices and make sure unmanaged devices > > + * have already been taken care of */ > > for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { > > virPCIDevicePtr pci = virPCIDeviceListGet(pcidevs, i); > > > > @@ -577,8 +578,48 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr, > > mgr->inactivePCIHostdevs) < 0) > > goto reattachdevs; > > } else { > > - VIR_DEBUG("Not detaching unmanaged PCI device %s", > > - virPCIDeviceGetName(pci)); > > + char *driverPath; > > + char *driverName; > > + int stub; > > stub = -1; May not be needed, see below. > > + > > + /* Unmanaged devices should already have been marked as > > + * inactive: if that's the case, we can simply move on */ > > + if (virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci)) { > > + VIR_DEBUG("Not detaching unmanaged PCI device %s", > > + virPCIDeviceGetName(pci)); > > + continue; > > + } > > + > > + /* If that's not the case, though, it might be because the > > + * daemon has been restarted, causing us to lose track of the > > + * device. Try and recover by marking the device as inactive > > + * if it happens to be bound to a known stub driver. > > + * > > Just to make it clearer (for me) - this is because using the node device > API virHostdevPCINodeDeviceDetach it's possible that somewhere between > virPCIDeviceBindToStub and the virPCIDeviceListAddCopy we had a daemon > restart causing us to lose track of the device, so this code tries to > detect that the BindToStub was done and complete the task. > > Of course it's also possible that BindToStub was "in process" when we > failed, but I'm not sure *how* to detect that... I suppose though all > that matters is we had a completed BindToStub. Mh, not really what I had in mind :) What this code is meant to take care of is the case when virHostdevPCINodeDeviceDetach() completes successfully, marking the device as inactive, but the daemon is restarted before the device can be attached to a guest. That causes us to lose all information in the inactive list. The current code copes with this because step 5 adds the device to the active list (even if it was never in the inactive list) and step 6 removes the device from the inactive list (even if it was never in the inactive list), but that's extremely opaque and most importantly will stop working once we start moving device objects from one list to the other instead of copying them around. The new code deals with the situation explicitly by adding devices to the inactive list if they look like they belong there - namely, they are bound to a stub driver. As noted in the comment right below, this is sub-optimal and a follow-up patch should take care of storing both the active and inactive list to disk properly so that daemon restarts are no longer a problem. That's for another day, though :) > > + * FIXME Get rid of this once a proper way to keep track of > > + * information about active / inactive device across > > + * daemon restarts has been implemented */ > > + > > + if (virPCIDeviceGetDriverPathAndName(pci, > > + &driverPath, &driverName) < 0) > > + goto reattachdevs; > > I note that virPCIDeviceUnbindFromStub can return 0, but driverName == > NULL. If driverName == NULL, then it declares it's not bound to a known > stub. However, for our purposes, it more than likely means the > BindToStub may not have completed successfully. But still we only *care* > that it did. > > > + > > + stub = virPCIStubDriverTypeFromString(driverName); > > So, I think this becomes, > > if (driverName) > stub = virPCIStubDriverTypeFromString(driverName); virPCIStubDriverTypeFromString() is implemented using virEnumFromString(), which handles NULL gracefully and simply returns -1. That's why I think initializing stub above and adding the check on driverName are not required changes, but I'm not opposed to adding them if you think they make the code clearer. > > + > > + VIR_FREE(driverPath); > > + VIR_FREE(driverName); > > + > > + if (stub >= 0 && > > + stub != VIR_PCI_STUB_DRIVER_NONE) { > > Strange check since VIR_PCI_STUB_DRIVER_NONE = 0 > > if (stub >= 0 && stub != 0) > > a change to seems to be what you're after > > if (stub > VIR_PCI_STUB_DRIVER_NONE && > stub < VIR_PCI_STUB_DRIVER_LAST) The idea here is that 'stub >= 0' makes sure that virPCIStubDriverTypeFromString() didn't fail, and the second check excludes a value that, while part of the virPCIStub enumeration, we want to reject in this case. But I like your version better, so consider it changed :) > > + /* The device is bound to a known stub driver: store this > > + * information and add a copy to the inactive list */ > > + virPCIDeviceSetStubDriver(pci, stub); > > + > > + VIR_DEBUG("Adding PCI device %s to inactive list", > > + virPCIDeviceGetName(pci)); > > + if (virPCIDeviceListAddCopy(mgr->inactivePCIHostdevs, pci) < 0) > > + goto reattachdevs; > > + } > > I think perhaps patch 2 should be merged into here - I was kind of > wondering what the "else" condition would be... Until I looked at patch 2. > > ACK with the adjustments - including the merge; otherwise, we fall into > Step 3 still. Patch 2 actually deals with a separate issue, so I'm not sure squashing them together is warranted. If the 'else' branch is missing here, we'll just keep on doing what we're already doing: try to pass a device to the guest, and have QEMU spew out a kinda cryptic error message when it realizes it can't work the VFIO magic on it. Cheers. -- Andrea Bolognani Software Engineer - Virtualization Team -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list