On 01/27/2016 12:26 PM, Andrea Bolognani wrote: > On Tue, 2016-01-26 at 18:55 -0500, John Ferlan wrote: >> >> w/r/t: your [0/7] from initial series... >> >> As much as you don't want to keep living Groundhog Day - resolution of >> bugs like this are job security :-)... > > Groundhog Day is in less than a week, by the way! :) > >> w/r/t Suggestions for deamon restart issues... Seems like we need a way >> to save/restore the hostdev_mgr active/inactive lists using XML/JSON >> similar to how domains, storage, etc. handle it. Guess I just assumed >> that was already there ;-) since /var/run/libvirt/hostdevmgr exists. It >> seems that network stuff can be restored - virHostdevNetConfigRestore. >> >> Do you really think this series should be "held up" waiting to create >> some sort of status tracking? > > I will look into your suggestion. I believe such save / restore > functionality has to be in place by the time this series is merged if > we don't want to break everything on daemon restart. > I assume that restart is already broken... This series does fix some issues in the "normal" flow though. Perhaps a chicken and egg type problem. If you fix restart first, what of this series would be beneficial to ensure restart doesn't have issues... >> On 01/25/2016 11:20 AM, Andrea Bolognani wrote: >>> This function replaces virHostdevReattachPCIDevice() and, unlike it, >>> does not perform list manipulation, leaving it to the calling function. >>> >>> This means virHostdevReAttachPCIDevices() had to be updated to cope >>> with the updated requirements. >>> --- >>> src/util/virhostdev.c | 136 +++++++++++++++++++++++++++++++++----------------- >>> 1 file changed, 90 insertions(+), 46 deletions(-) >> >> Since I reviewed them all... I think the comment changes from 7/9 should >> just be inlined here and patch 4 instead of a separate patch > > Will do - it was that way in v1 as well. > Right - I started looking at v2 >>> diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c >>> index f31ad41..66629b4 100644 >>> --- a/src/util/virhostdev.c >>> +++ b/src/util/virhostdev.c >>> @@ -526,6 +526,74 @@ virHostdevNetConfigRestore(virDomainHostdevDefPtr hostdev, >>> return ret; >>> } >>> >>> +/** >>> + * virHostdevOnlyReattachPCIDevice: >> >> Why not just reuse the function name you deleted? IOW: Is there a reason >> for "Only"? (not that I'm one that can complain about naming functions, >> but this just seems strange). > > It's an attempt to make it stand out a bit from > > virHostdevPCINodeDeviceReAttach() > virHostdevReAttachPCIDevices() > > in the same file. Mostly the latter. > > The reasoning behind "Only" is that the function performs "Only" the job > of reattaching the device to the host, with the error checking, > bookkeeping and additional steps left to the caller. > > Which is, strictly speaking, not true :) > > Maybe something like virHostdevReattachPCIDeviceCommon(), to express the > fact that this basically contains as much functionality as it was > possible to split off to a reusable routine? > virHostdevReattachPCIDeviceVeryCarefully :-) But since it's in the comment of the code: virHostdevReattachPCIDeviceToHost >>> + * @mgr: hostdev manager >>> + * @pci: PCI device to be reattached >> >> Interesting ... In 2 instances, this will be a pointer to the "copy" >> element, while in the third instance this will be the "actual" on >> inactive list element. For a copy element, we'd *have* to search >> inactive; however, for an 'actual' we don't "need" to. > > Good point. > > I will try to find a solution that > > 1. avoids searching the list twice > 2. avoids duplicating code > 3. respects the Principle of Least Surprise > > I can't guarantee I'll be able to :) > I kept losing focus on when something was on the inactive list or not. Then of course trying to reconcile 'pci' and 'dev' variable name usage. >>> + * @skipUnmanaged: whether to skip unmanaged devices >>> + * >>> + * Reattach a PCI device to the host. >>> + * >>> + * This function only performs the base reattach steps that are required >>> + * regardless of whether the device is being detached from a domain or >>> + * had been simply detached from the host earlier. >>> + * >>> + * @pci must have already been marked as inactive, and the PCI related >>> + * parts of @mgr (inactivePCIHostdevs, activePCIHostdevs) must have been >>> + * locked beforehand using virObjectLock(). >>> + * >>> + * Returns: 0 on success, <0 on failure >>> + */ >>> +static int >>> +virHostdevOnlyReattachPCIDevice(virHostdevManagerPtr mgr, >>> + virPCIDevicePtr pci, >>> + bool skipUnmanaged) >>> +{ >>> + virPCIDevicePtr actual; >>> + int ret = -1; >>> + >>> + /* Retrieve the actual device from the inactive list */ >>> + if (!(actual = virPCIDeviceListFind(mgr->inactivePCIHostdevs, pci))) { >>> + VIR_DEBUG("PCI device %s is not marked as inactive", >>> + virPCIDeviceGetName(pci)); >> >> This is tricky - the only time we care about the return status (now) is >> when skipUnmanaged == false, a/k/a the path where we pass the onlist >> element.. >> >> In my first pass through the changes I thought - oh no if we return -1, >> then this would be a path that could get that generic libvirt failed for >> some reason message; however, closer inspection noted that we guaranteed >> it was on the inactive list before calling here. > > So we should be good, right? :) > I think so - a lot of typing out loud which at least gives you my review context... I finally convinced myself there wasn't an issue even though it's strange to return something and in the end, no one really cares... Perhaps using a 4th parameter 'actual' (could be NULL) would make it more obvious that we knew on input that the device was already found on the inactive list. Determining whether we have a copy or an actual wasn't readily apparent. Consider some future "user" of this function - how easy is it to decide what to pass? >>> + goto out; >>> + } >>> + >>> + /* Skip unmanaged devices if asked to do so */ >>> + if (!virPCIDeviceGetManaged(actual) && skipUnmanaged) { >> >> <sigh>, unrelated and existing - virPCIDeviceGetManaged probably should >> return bool instead of unsigned int > > Yup, good catch. The same applies to > > virPCIDeviceGetUnbindFromStub() > virPCIDeviceGetRemoveSlot() > virPCIDeviceGetReprobe() > > as well. I'll fix them in a separate commit. > I saw that today... >>> + VIR_DEBUG("Not reattaching unmanaged PCI device %s", >>> + virPCIDeviceGetName(actual)); >>> + ret = 0; >>> + goto out; >>> + } >>> + >>> + /* Wait for device cleanup if it is qemu/kvm */ >>> + if (virPCIDeviceGetStubDriver(actual) == VIR_PCI_STUB_DRIVER_KVM) { >>> + int retries = 100; >>> + while (virPCIDeviceWaitForCleanup(actual, "kvm_assigned_device") >>> + && retries) { >>> + usleep(100*1000); >>> + retries--; >>> + } >>> + } >> >> Existing, but if retries == 0, then cleanup never succeeded; however, >> perhaps one can assume that the subsequent call would fall over and play >> dead? Not that it really gets checked... > > I recall raising the issue at some point in the past, but I don't > remember the outcome of that discussion... Maybe this can be assessed > again at a later time? > That's fine - it was just another of those typing out loud moments. >>> + >>> + VIR_DEBUG("Reattaching PCI device %s", virPCIDeviceGetName(actual)); >>> + if (virPCIDeviceReattach(actual, mgr->activePCIHostdevs, >>> + mgr->inactivePCIHostdevs) < 0) { >>> + virErrorPtr err = virGetLastError(); >>> + VIR_ERROR(_("Failed to reattach PCI device %s: %s"), >>> + virPCIDeviceGetName(actual), >>> + err ? err->message : _("unknown error")); >>> + virResetError(err); >>> + goto out; >>> + } >>> + >>> + ret = 0; >>> + >>> + out: >>> + return ret; >>> +} >>> + >>> int >>> virHostdevPreparePCIDevices(virHostdevManagerPtr hostdev_mgr, >>> const char *drv_name, >>> @@ -753,45 +821,6 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr hostdev_mgr, >>> return ret; >>> } >>> >>> -/* >>> - * Pre-condition: inactivePCIHostdevs & activePCIHostdevs >>> - * are locked >>> - */ >>> -static void >>> -virHostdevReattachPCIDevice(virPCIDevicePtr dev, virHostdevManagerPtr mgr) >>> -{ >>> - /* If the device is not managed and was attached to guest >>> - * successfully, it must have been inactive. >>> - */ >>> - if (!virPCIDeviceGetManaged(dev)) { >>> - VIR_DEBUG("Adding unmanaged PCI device %s to inactive list", >>> - virPCIDeviceGetName(dev)); >>> - if (virPCIDeviceListAdd(mgr->inactivePCIHostdevs, dev) < 0) >>> - virPCIDeviceFree(dev); >>> - return; >>> - } >>> - >>> - /* Wait for device cleanup if it is qemu/kvm */ >>> - if (virPCIDeviceGetStubDriver(dev) == VIR_PCI_STUB_DRIVER_KVM) { >>> - int retries = 100; >>> - while (virPCIDeviceWaitForCleanup(dev, "kvm_assigned_device") >>> - && retries) { >>> - usleep(100*1000); >>> - retries--; >>> - } >>> - } >>> - >>> - VIR_DEBUG("Reattaching PCI device %s", virPCIDeviceGetName(dev)); >>> - if (virPCIDeviceReattach(dev, mgr->activePCIHostdevs, >>> - mgr->inactivePCIHostdevs) < 0) { >>> - virErrorPtr err = virGetLastError(); >>> - VIR_ERROR(_("Failed to re-attach PCI device: %s"), >>> - err ? err->message : _("unknown error")); >>> - virResetError(err); >>> - } >>> - virPCIDeviceFree(dev); >>> -} >>> - >>> /* @oldStateDir: >>> * For upgrade purpose: see virHostdevNetConfigRestore >>> */ >>> @@ -803,7 +832,7 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, >>> int nhostdevs, >>> const char *oldStateDir) >>> { >>> - virPCIDeviceListPtr pcidevs; >>> + virPCIDeviceListPtr pcidevs = NULL; >>> size_t i; >>> >>> if (!nhostdevs) >>> @@ -848,11 +877,25 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, >>> continue; >>> } >>> } >>> + i++; >>> + } >> >> Curious why the decision for a second loop - if activeDev matches, then >> it almost seems a shame to loop again. Why not (within if (activeDev): >> >> actual = virPCIDeviceListSteal(hostdev_mgr->activePCIHostdevs, >> activeDev); >> /* !actual should never happen, but better safe than sorry */ >> if (actual && >> virPCIDeviceListAdd(hostdev_mgr->inactivePCIHostdevs, >> actual) < 0) >> virPCIDeviceFree(actual); >> /* You could also... */ >> virPCIDeviceListDel(pcidevs, dev); >> } > > Mostly because I consider moving the devices from one list to another > as a separate step. > > We could merge the two steps, and that would bring us down to 4 (or 5 > if you count the one implicit in virHostdevGetActivePCIHostDeviceList()) > loops, but I'm not sure whether that would be a significant gain in > performance or whether it would just make the code a little more > convoluted... > Your call - we go through the pcidevs list many times so it's not that big a deal... >> NOTE: Whether there is one or two loops, the second loop would need to >> call virPCIDeviceFree(actual) since we'd leak it otherwise. > > You mean on error? Because otherwise I don't see the leak: the actual > device is stolen from the active list and added (itself, not a copy) > to the inactive list. > Yes, the error path - if you fail to add actual to inactive, then it was dropped on the floor >> You'll also note I didn't keep the goto cleanup. Previously the code >> would completely go through the pcidevs Loop 4 regardless of whether >> virHostdevReattachPCIDevice code had failures. The new code has the >> subtle difference of jumping to cleanup if a failure is found. That >> could leave things in an awkward state especially since >> virHostdevReAttachPCIDevices is a void. >> >> Since failure for DeviceListAdd is because 1. device is already there >> (which I would *hope* isn't the case) or 2. memory allocation failure >> (in which case not much else successful will probably happen anyway), >> then perhaps continuing on rather than jumping to cleanup isn't a bad >> idea... We could be returning some memory that someone may find useful. >> >> My concerns about jumping to cleanup are that this API is called in >> error recovery paths as well as part of the ominous comment "For upgrade >> purpose:..." (comment before function header). So it seems the "existing >> logic" is try to clean up as many as possible. By potentially erroring >> out too soon could lead to more problems. >> >> So the question becomes what havoc is introduced if we cannot add to the >> inactive list but decide to continue as before... It seems we'll end up >> "failing" in virHostdevOnlyReattachPCIDevice since it's not in the >> inactiveList, but our Loop 4 logic doesn't care. Of course we could >> delete 'dev' from 'pcidevs' too before then... >> >> Hopefully this makes sense... It's been an 'edit in process'... > > See the comment at the end of the message. > >>> + >>> + /* Step 2: move all devices from the active list to the inactive list */ >>> + for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { >>> + virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); >>> + virPCIDevicePtr actual; >>> >>> VIR_DEBUG("Removing PCI device %s from active list", >>> virPCIDeviceGetName(dev)); >>> - virPCIDeviceListDel(hostdev_mgr->activePCIHostdevs, dev); >> >> This was a curious placement for *ListDel... If !activeDev, then calling >> *ListDel also won't find 'dev' on activelist... > > If 'activeDev != NULL', then driver name and domain name are checked, > which may cause the 'dev' to be removed from 'pcidev' and the loop to > restart. > > If that does not happen 'dev' is removed from the active list, even > thought it might not have been in that list in the first place. But > the code is doing the right thing in all situations. > Understood, but if !activeDev, then the virPCIDeviceListDel calls virPCIDeviceListSteal which calls virPCIDeviceListStealIndex using virPCIDeviceListFindIndex... You'll note activeList is sourced by calling virPCIDeviceListFind which calls virPCIDeviceListFindIndex So I was pointing out how pointless it would be to call virPCIDeviceListDel if activeDev == NULL (because it too would do nothing). >>> - i++; >>> + if (!(actual = virPCIDeviceListSteal(hostdev_mgr->activePCIHostdevs, >>> + dev))) >>> + goto cleanup; >> >> If the choice is to use two loops (and perhaps keep the cleanups)... >> >> 1. If this Steal fails, then something is seriously wrong, but we don't >> even give it a VIR_DEBUG message. >> >> 2. If this Steal fails, then going to cleanup is again a subtle >> difference with the prior logic that said, well I couldn't do anything >> with this, but I'm going to keep processing. >> >> 3. If we keep processing, then something on 'pcidevs' won't be in >> 'inactivePCIHostdevs', causing virHostdevOnlyReattachPCIDevice to fail. >> But that does not matter since we ignore the return value in Loop 4. >> >> 4. If we do Steal and if the subsequent Add fails, then we leak >> 'actual', so prior to the goto cleanup call virPCIDeviceFree(actual); >> (or instead if the goto cleanup;'s are removed). > > Thanks for looking into this in such detail. I will go through the > existing code again myself and either become confident that the code > is doing the right thing or change it so that it does :) > > On the other hand, there's this patch I'm working on that changes the > way bookkeeping is performed quite substantially... My idea was to > propose it as a follow-up to this series, since it basically replaces > some constructs with some other "equivalent" constructs without altering > the overall control flow, but maybe at this point it could be worth it > to merge everything together and hopefully avoid such pitfalls, and make > the whole thing easier to reason about. Try to keep it under 100 patches please ;-) John > > Cheers. > > -- > Andrea Bolognani > Software Engineer - Virtualization Team > -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list