On 11/07/2013 02:28 PM, Laine Stump wrote: > On 11/07/2013 02:08 PM, Martin Kletzander wrote: >> On Thu, Nov 07, 2013 at 01:10:39PM +0200, Laine Stump wrote: >>> This resolves: >>> >>> https://bugzilla.redhat.com/show_bug.cgi?id=1025397 >>> >>> When libvirt creates the list of an SRIOV Physical Function's (PF) >>> Virtual Functions (VF), it assumes that the order of "virtfn*" links >>> returned by readdir() from the PF's sysfs directory is already in the >>> correct order. Experience has shown that this is not always the case. >>> >>> This results in 1) incorrect assumptions made by consumers of the >>> output of the virt_functions list of virsh nodedev-dumpxml, and 2) >>> setting MAC address and vlan tag on the wrong VF (since libvirt uses >>> netlink to set mac address and vlan tag, netlink requires the VF#, and >>> the function virPCIGetVirtualFunctionIndex() returns the wrong index >>> due to the improperly ordered VF list). See the bugzilla report for an >>> example of this improper behavior. >>> >>> The solution provided by this patch is for virPCIGetVirtualFunctions >>> to first gather all the "virtfn*" names from the PFs sysfs directory, >>> then allocate a virtual_functions array large enough to hold all >>> entries, and finally to create a device entry for each virtfn* name >>> and place it into the virtual_functions array at the proper index >>> (based on interpreting the value following "virtfn" in the name). >>> >>> Checks are introduced to ensure that 1) the virtfn list given in sysfs >>> is not sparse (ie, there are no indexes larger than the length of the >>> list), and that no two virtfn* entries decode to the same index. >>> --- >>> >>> Difference from V1: Based on jtomko's review, truncate trailing NULLs >>> in virtual_functions array, and actually check for NULLs in the middle >>> (a sparse array) and generate an error in that case, as we already >>> claimed to do in the commit message. >>> >>> (Sorry for the lack of inter-diff, but the only changes from V1 are >>> the addition of the lines starting with "truncate any trailing nulls" >>> and continuing until the next VIR_DEBUG.) >>> >>> src/util/virpci.c | 115 ++++++++++++++++++++++++++++++++++++++++++------------ >>> 1 file changed, 90 insertions(+), 25 deletions(-) >>> >> Disclaimer: I don't want this to sound rude (just that it does to me >> every time I read it), this is just a question that popped in my mind. > Not at all. Alternatives are always welcome. > >> Why don't we do it differently? If we have a while loop with a number >> starting from 1 and access(, F_OK)'ing '"virtfn%d", number' in that >> loop, we'll have all those VFs sorted. And if that's not nice enough, >> we should be able to read the number of VFs from the device's >> "/config" (in offset 0x10 [1]) and do a nice, clean for loop over >> already known filenames. >> >> Or am I missing something? I just Took that from "drivers/pci/iov.c", >> so this may be "Just Wrong". > No, that would work too. I was just looking at it from the inside rather > than the outside, so was only thinking of how to fix the current code > rather than how to rewrite it, and never considered that. > > My one concern would be if someone actually pushed a patch into the > kernel similar to the one suggested by the original reporter of this > problem (which put leading 0's on all the numbers in the virtfn > filenames), but that seems unlikely, as it isn't scalable (what happens > when you get > 100 VFs?). > > I like your idea enough to redo the patch using it. It may be tomorrow > before I get it posted though. I've posted a patch based on this suggestion (2 patches actually, due to a prerequisite that changes the type of num_virtual_functions from unsigned int to size_t), and I think the results are very good - rather than adding new code, about 30 lines were removed. The result is simpler even than the original, and ends up with correct results. -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list