On 04/22/2015 08:02 AM, John Ferlan wrote: > > On 04/15/2015 01:29 PM, Huanle Han wrote: >> Fix for such a case: >> 1. Domain A and B xml contain the same SRIOV net hostdev(<interface >> type='hostdev' /> with same pci address). >> 2. virsh start A (Successfully, and configure the SRIOV net with >> custom mac) >> 3. virsh start B (Fail because of the hostdev used by domain A or other >> reason.) >> In step 3, 'virHostdevNetConfigRestore' is called for the hostdev >> which is still used by domain A. It makes the mac/vlan of the SRIOV net >> change. >> >> Code Change in this fix: >> 1. As the pci used by other domain have been removed from >> 'pcidevs' in previous loop, we only restore the nic config for >> the hostdev still in 'pcidevs'(used by this domain) >> 2. wrap a function 'virHostdevIsPCINetDevice', which detect whether the >> hostdev is a pci net device or not. >> 3. update the comments to make it more clear >> >> Signed-off-by: Huanle Han <hanxueluo@xxxxxxxxx> >> --- >> src/util/virhostdev.c | 70 ++++++++++++++++++++++++++++++++++----------------- >> 1 file changed, 47 insertions(+), 23 deletions(-) >> > This seems reasonable - although it should have been two patches - the > first just to create virHostdevIsPCINetDevice and the second to fix the > issue and update the comments... I agree. That change obscures the functional change. > Something I can take care of for you > before pushing... > > Hopefully Laine can also take a quick look at it and verify the actions > since he's most familiar with the SRIOV code path... > > John >> diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c >> index f583e54..a2719d3 100644 >> --- a/src/util/virhostdev.c >> +++ b/src/util/virhostdev.c >> @@ -350,6 +350,14 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev, char **linkdev, >> return ret; >> } >> >> +static int >> +virHostdevIsPCINetDevice(virDomainHostdevDefPtr hostdev) >> +{ >> + return hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS && >> + hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && >> + hostdev->parent.type == VIR_DOMAIN_DEVICE_NET && >> + hostdev->parent.data.net; >> +} >> >> static int >> virHostdevNetConfigVirtPortProfile(const char *linkdev, int vf, >> @@ -481,10 +489,7 @@ virHostdevNetConfigRestore(virDomainHostdevDefPtr hostdev, >> /* This is only needed for PCI devices that have been defined >> * using <interface type='hostdev'>. For all others, it is a NOP. >> */ >> - if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS || >> - hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI || >> - hostdev->parent.type != VIR_DOMAIN_DEVICE_NET || >> - !hostdev->parent.data.net) >> + if (!virHostdevIsPCINetDevice(hostdev)) >> return 0; >> >> isvf = virHostdevIsVirtualFunction(hostdev); >> @@ -604,16 +609,11 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr hostdev_mgr, >> * the network device, set the netdev config */ >> for (i = 0; i < nhostdevs; i++) { >> virDomainHostdevDefPtr hostdev = hostdevs[i]; >> - if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) >> + if (!virHostdevIsPCINetDevice(hostdev)) >> continue; I think the above call should be moved from here into virHostdevNetConfigReplace(). This will made it more of a mirror of virHostdevNetConfigRestore() (which is what it is supposed to be). (either that or *remove* the check from virHostdevNetConfigRestore() and add it to its caller in the one place where it is appropriate - this would be slightly more efficient, since it's already checked in one of two places that virHostdevNetConfigRestore() is called). One or the other of those can be a followup patch, though... >> - if (hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) >> - continue; >> - if (hostdev->parent.type == VIR_DOMAIN_DEVICE_NET && >> - hostdev->parent.data.net) { >> - if (virHostdevNetConfigReplace(hostdev, uuid, >> - hostdev_mgr->stateDir) < 0) { >> - goto resetvfnetconfig; >> - } >> + if (virHostdevNetConfigReplace(hostdev, uuid, >> + hostdev_mgr->stateDir) < 0) { >> + goto resetvfnetconfig; >> } >> last_processed_hostdev_vf = i; This *slightly* changes what is done, since the old code would have updated last_processed_hostdev_vf when encountering any hostdev device that was PCI but *not* the child of a NetDef. That's purely cosmetic though, because the restore operation done up to last_processed_hostdev_vf (on failure) is a NOP for any hostdev that *isn't* the child of a NetDef. tl;dr - this change is OK. >> } >> @@ -781,19 +781,20 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, >> goto cleanup; >> } >> >> - /* Again 4 loops; mark all devices as inactive before reset >> + /* Here are 4 loops; mark all devices as inactive before reset I don't see the value of changing "Again" to "Here are". Both are equally pointless additions to the text of the comments. Instead, maybe take this chance to make the comments more understandable: "Loop through the assigned devices 4 times: 1) delete them all from activePCIHostdevs, 2) restore network config of SRIOV netdevs, 3) Do a PCI reset on each device, 4) reattach the devices to their host drivers (managed) or add them to inactivePCIHostdevs (!managed)." >> * them and reset all the devices before re-attach. >> * Attach mac and port profile parameters to devices >> */ >> + >> + /* Loop 1: delete the copy of the dev from pcidevs if it's used by >> + * other domain. Or delete it from activePCIHostDevs if it had >> + * been used by this domain. >> + */ "Loop 1: verify that each device in the hostdevs list really was in use by this domain, and remove them all from the activePCIHostdevs list." (Another side note on cleanups that should be done in a later patch: Looking at this code, I think virHostdevGetActivePCIHostDeviceList() (called just above to create the pcidevs list) should just be given the driver and domain name, then it could do the entire job of this loop, just moving each item from the activePCIHostdevs list onto the newly created list. That function is only called once anyway.) >> i = 0; >> while (i < virPCIDeviceListCount(pcidevs)) { >> virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); >> virPCIDevicePtr activeDev = NULL; >> >> - /* delete the copy of the dev from pcidevs if it's used by >> - * other domain. Or delete it from activePCIHostDevs if it had >> - * been used by this domain. >> - */ >> activeDev = virPCIDeviceListFind(hostdev_mgr->activePCIHostdevs, dev); >> if (activeDev) { >> const char *usedby_drvname; >> @@ -815,13 +816,33 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, >> */ >> >> /* >> - * For SRIOV net host devices, unset mac and port profile before >> - * reset and reattach device >> + * Loop 2: For SRIOV net host devices used by this domain, unset mac and port profile before >> + * resetting and reattaching device Line is too long. Anyway, a better comment would be: "Loop 2: restore original network config of hostdevs that used <interface type='hostdev'>" >> */ >> - for (i = 0; i < nhostdevs; i++) >> - virHostdevNetConfigRestore(hostdevs[i], hostdev_mgr->stateDir, >> - oldStateDir); >> + for (i = 0; i < nhostdevs; i++) { >> + virDomainHostdevDefPtr hostdev = hostdevs[i]; >> >> + if (virHostdevIsPCINetDevice(hostdev)) { >> + virDomainHostdevSubsysPCIPtr pcisrc = &hostdev->source.subsys.u.pci; >> + virPCIDevicePtr dev = NULL; >> + dev = virPCIDeviceNew(pcisrc->addr.domain, pcisrc->addr.bus, >> + pcisrc->addr.slot, pcisrc->addr.function); >> + if (dev) { (This whole "virPCIDeviceNew()" thing would be completely unnecessary if virPCIDeviceListFind() just used virPCIDeviceAddress instead of virPCIDevice, and virDevicePCIAddress (what's in the hostdev entry) also used virPCIDeviceAddress instead of adding the domain/bus/slot/function separately. But again, that's not the fault of this patch; just nagging about the general mess...) >> + if (virPCIDeviceListFind(pcidevs, dev)) { >> + virHostdevNetConfigRestore(hostdev, hostdev_mgr->stateDir, >> + oldStateDir); >> + } >> + } else { >> + virErrorPtr err = virGetLastError(); >> + VIR_ERROR(_("Failed to new PCI device: %s"), >> + err ? err->message : _("unknown error")); >> + virResetError(err); I guess you're adding this VIR_ERROR() because that's what was done in Loop 3 in case of error. I think both are unnecessary and should be removed/not added. There are times when this function is being called as the result of some other error that already occurred, and we don't want to mask that in any way (which leads us to discussing the idea that whoever calls us to clean up after an error need to save their error before calling us, and restore it after we're done, but that's *another* story. I would say remove the else clause here, and we can remove it from loop 3 in a later cleanup patch. > + } > + virPCIDeviceFree(dev); > + } > + } This is getting more and more confusing. First we matched up the domains "hostdevs" list to the HostdevMgr's activePCIHostDevs list to get a list of the devices we want to do something with ("pcidevs"). This means that we know for a fact that everything on "pcidevs" is a device that was on the hostdevs list *and* that it was listed as actively in use by "some domain". Then we further narrow down that list by verifying that each device in pcidevs is listed as in use by *this* domain (and this driver, in case two drivers use the same domain name). So now we have a list of devices that we know are in use by our domain. And what does this new code do? It goes through every item in the hostdevs list of the domain to verify that it is on the pcidevs list, then performs an operation on it. But we already know that the pcidevs list only contains devices that are listed in hostdevs, so why not do the loop based on the pcidevs list? Oh, I get it - it's because we need the info from the hostdev entry (which is our only pointer to the parent NetDef) in order to perform the necessary NetConfigRestore). Bah. Okay, I accept defeat (for now :-P) >> + >> + /* Loop 3: reset pci device used by this domain */ "Loop 3: perform a PCI Reset on all devices" (it's been pretty well established by now that we're only operating on devices that were used by this domain) (By the way, I don't see the point of the extra VIR_ERROR in this loop either, although again that isn't the problem of this patch...) >> for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) { >> virPCIDevicePtr dev = virPCIDeviceListGet(pcidevs, i); >> >> @@ -834,6 +855,9 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr hostdev_mgr, >> } >> } >> >> + /* Loop 4: reattach pci devices used by this domain >> + * and steal all the devices from pcidevs >> + */ "Loop 4: reattach devices to their host drivers (if managed) or place them on the inactive list (if not managed)" >> while (virPCIDeviceListCount(pcidevs) > 0) { >> virPCIDevicePtr dev = virPCIDeviceListStealIndex(pcidevs, 0); >> virHostdevReattachPCIDevice(dev, hostdev_mgr); At the end, it looks like this patch will do what was intended. I would say ACK on the following conditions: 1) splitting into 2 patches as John suggests. 2) changing the comments as suggested above 3) removing the VIR_ERROR() from loop 2. (At some point in the future, all of the code surrounding this patch should be rewritten, but today is not that day) -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list