Hi forget this as approach works using kernel-patch. Correctly was mentioned that the approach using leading-zeroes was a try to fix something happened in other place. Haven't get intention to libvirt side after that, wrote untested patch for this symptom. It's is not complete - no more error checking's. But it should pre-select only "virtfn" entries to be handled in while-loop, and after that scan through them in increasing numerical order. Still, here it is : < /* routine to select only "virtfn" -entries */ < static int < virtfn_select(const struct dirent *entry) < { < return (strncmp(entry->d_name,"virtfn", 6) == 0) ? 1 : 0; < } < 2401a2395,2396 > DIR *dir = NULL; > struct dirent *entry = NULL; 2404,2406d2398 < struct dirent **namelist; < int entry_count = 0; < int current_entry = 0; 2414,2415c2406,2407 < struct stat sb; < if ((stat(sysfs_path, &sb) == -1) || ((sb.st_mode & S_IFMT) != S_IFDIR)) { --- > dir = opendir(sysfs_path); > if (dir == NULL) { 2423,2425c2415,2416 < entry_count = scandir(sysfs_path, &namelist, virtfn_select, alphasort); < < while ( current_entry < entry_count ) { --- > while ((entry = readdir(dir))) { > if (STRPREFIX(entry->d_name, "virtfn")) { 2428c2419 < if (virBuildPath(&device_link, sysfs_path, namelist[ current_entry ]->d_name ) == -1) { --- > if (virBuildPath(&device_link, sysfs_path, entry->d_name) == -1) { 2432c2423 < current_entry ++; --- > 2453a2445 > } 2460,2462c2452,2453 < for ( i = 0; i < entry_count; i ++) < free( namelist[ i ] ); < free( namelist ); --- > if (dir) > closedir(dir); On Thu, Nov 7, 2013 at 2:32 PM, Ján Tomko <jtomko@xxxxxxxxxx> wrote: > On 11/07/2013 12:10 PM, Laine Stump wrote: >> This resolves: >> >> https://bugzilla.redhat.com/show_bug.cgi?id=1025397 >> >> When libvirt creates the list of an SRIOV Physical Function's (PF) >> Virtual Functions (VF), it assumes that the order of "virtfn*" links >> returned by readdir() from the PF's sysfs directory is already in the >> correct order. Experience has shown that this is not always the case. >> >> This results in 1) incorrect assumptions made by consumers of the >> output of the virt_functions list of virsh nodedev-dumpxml, and 2) >> setting MAC address and vlan tag on the wrong VF (since libvirt uses >> netlink to set mac address and vlan tag, netlink requires the VF#, and >> the function virPCIGetVirtualFunctionIndex() returns the wrong index >> due to the improperly ordered VF list). See the bugzilla report for an >> example of this improper behavior. >> >> The solution provided by this patch is for virPCIGetVirtualFunctions >> to first gather all the "virtfn*" names from the PFs sysfs directory, >> then allocate a virtual_functions array large enough to hold all >> entries, and finally to create a device entry for each virtfn* name >> and place it into the virtual_functions array at the proper index >> (based on interpreting the value following "virtfn" in the name). >> >> Checks are introduced to ensure that 1) the virtfn list given in sysfs >> is not sparse (ie, there are no indexes larger than the length of the >> list), and that no two virtfn* entries decode to the same index. >> --- >> >> Difference from V1: Based on jtomko's review, truncate trailing NULLs >> in virtual_functions array, and actually check for NULLs in the middle >> (a sparse array) and generate an error in that case, as we already >> claimed to do in the commit message. >> >> (Sorry for the lack of inter-diff, but the only changes from V1 are >> the addition of the lines starting with "truncate any trailing nulls" >> and continuing until the next VIR_DEBUG.) >> >> src/util/virpci.c | 115 ++++++++++++++++++++++++++++++++++++++++++------------ >> 1 file changed, 90 insertions(+), 25 deletions(-) >> >> diff --git a/src/util/virpci.c b/src/util/virpci.c >> index 148631f..f744c8b 100644 >> --- a/src/util/virpci.c >> +++ b/src/util/virpci.c >> @@ -2379,6 +2379,8 @@ virPCIGetPhysicalFunction(const char *vf_sysfs_path, >> return ret; >> } >> >> +static const char *virtfnPrefix = "virtfn"; >> + >> /* >> * Returns virtual functions of a physical function >> */ >> @@ -2393,6 +2395,8 @@ virPCIGetVirtualFunctions(const char *sysfs_path, >> struct dirent *entry = NULL; >> char *device_link = NULL; >> char errbuf[64]; >> + char **virtfnNames = NULL; >> + size_t nVirtfnNames = 0; >> >> VIR_DEBUG("Attempting to get SR IOV virtual functions for device" >> "with sysfs path '%s'", sysfs_path); >> @@ -2411,51 +2415,112 @@ virPCIGetVirtualFunctions(const char *sysfs_path, >> >> while ((entry = readdir(dir))) { >> if (STRPREFIX(entry->d_name, "virtfn")) { >> - virPCIDeviceAddress *config_addr = NULL; >> + char *tempName; >> >> - if (virBuildPath(&device_link, sysfs_path, entry->d_name) == -1) { >> - virReportOOMError(); >> + /* save these to sort into virtual_functions array */ >> + if (VIR_STRDUP(tempName, entry->d_name) < 0) >> + goto error; >> + if (VIR_APPEND_ELEMENT(virtfnNames, nVirtfnNames, tempName) < 0) { >> + VIR_FREE(tempName); >> goto error; >> } >> + } >> + } > > Wouldn't it be easier to just count the entries beginning with "virtfn" and > then go from 0 to n? (This would need a special return value for > virPCIGetDeviceAddressFromSysfsLink if the link does not exist, if the entries > in the directory can truly be sparse, or a virtfn-poorly-chosen-name appears) > >> >> - VIR_DEBUG("Number of virtual functions: %d", >> - *num_virtual_functions); >> + /* pre-allocate because we will be filling in out of order */ >> + if (nVirtfnNames && VIR_ALLOC_N(*virtual_functions, nVirtfnNames) < 0) >> + goto error; >> + *num_virtual_functions = nVirtfnNames; >> >> - if (virPCIGetDeviceAddressFromSysfsLink(device_link, >> - &config_addr) != >> - SRIOV_FOUND) { >> - VIR_WARN("Failed to get SRIOV function from device " >> - "link '%s'", device_link); >> - VIR_FREE(device_link); >> - continue; >> - } >> + for (i = 0; i < nVirtfnNames; i++) { >> + virPCIDeviceAddress *config_addr = NULL; >> + unsigned int virtfnIndex; >> >> - if (VIR_REALLOC_N(*virtual_functions, >> - *num_virtual_functions + 1) < 0) { >> - VIR_FREE(config_addr); >> - goto error; >> - } >> + if (virBuildPath(&device_link, sysfs_path, virtfnNames[i]) < 0) { >> + virReportOOMError(); >> + goto error; >> + } >> + >> + if (virStrToLong_ui(virtfnNames[i] + strlen(virtfnPrefix), >> + 0, 10, &virtfnIndex) < 0) { >> + virReportError(VIR_ERR_INTERNAL_ERROR, >> + _("Invalid virtual function link name '%s' " >> + "in physical function directory '%s'"), >> + virtfnNames[i], sysfs_path); >> + goto error; >> + } > > This rejects "virtfn-poor-name-choice" (which is fine by me, it's really a > poor choice). > >> + >> + if (virtfnIndex >= nVirtfnNames) { >> + virReportError(VIR_ERR_INTERNAL_ERROR, >> + _("Virtual function link name '%s' larger than " >> + "total count of virtual functions %zu " >> + "in physical function directory '%s'"), >> + virtfnNames[i], nVirtfnNames, sysfs_path); >> + goto error; >> + } >> + > >> + if ((*virtual_functions)[virtfnIndex]) { >> + virReportError(VIR_ERR_INTERNAL_ERROR, >> + _("Virtual function link name '%s' " >> + "generates duplicate index %zu " >> + "in physical function directory '%s'"), >> + virtfnNames[i], nVirtfnNames, sysfs_path); >> + goto error; >> + } > > Can there really be two entries in a directory with the same name? > >> >> - (*virtual_functions)[*num_virtual_functions] = config_addr; >> - (*num_virtual_functions)++; >> + if (virPCIGetDeviceAddressFromSysfsLink(device_link, &config_addr) != >> + SRIOV_FOUND) { > > This also ignores OOM errors. > >> + VIR_WARN("Failed to get SRIOV function from device link '%s'", >> + device_link); >> VIR_FREE(device_link); >> + continue; >> } > >> + >> + VIR_DEBUG("Found virtual function %d", virtfnIndex); >> + (*virtual_functions)[virtfnIndex] = config_addr; >> + VIR_FREE(device_link); >> } > > > >> + /* truncate any trailing nulls due to files having names starting >> + * with "virtfn" which weren't actually a link to a virtual >> + * function. >> + */ >> + while (*num_virtual_functions && >> + !(*virtual_functions)[(*num_virtual_functions) - 1]) >> + (*num_virtual_functions)--; >> + >> + /* failure of realloc when shrinking is safe */ >> + ignore_value(VIR_REALLOC_N(*virtual_functions, *num_virtual_functions)); >> + > > This treats contiguous missing entries at the end differently than the missing > ones at the beginning, are the virtfns with larger indexes less important? :) > >> + /* if there are any NULLs left in the array, fail and log an error >> + * - the virtual function array cannot be sparse. >> + */ >> + for (i = 0; i < *num_virtual_functions; i++) { >> + if (!(*virtual_functions)[*num_virtual_functions]) { >> + virReportError(VIR_ERR_INTERNAL_ERROR, >> + _("Missing virtual function link for VF %zu " >> + "in physical function directory '%s'"), >> + i, sysfs_path); > > In other words, virPCIGetDeviceAddressFromSysfsLink failed. If we really need > to ignore entries that aren't a symlink with a PCI address, the array has to > be sparse and we need to fix all the users :( > >> + goto error; >> + } >> + } >> + >> + VIR_DEBUG("Number of virtual functions: %d", *num_virtual_functions); > > Jan > > > -- > libvir-list mailing list > libvir-list@xxxxxxxxxx > https://www.redhat.com/mailman/listinfo/libvir-list -- --------------------------------- Niilo Minkkinen niilo.minkkinen@xxxxxx “Brittiläisen tutkimuksen mukaan pyöräily pidentää elinajan odotetta peräti 20 kertaa enemmän kuin onnettomuusriskin kohoaminen sitä lyhentää! (BMA, British Medical Association. Cycling: towards health & safety. Oxford University Press, 1992). --------------------------------- -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list