On Sat, Oct 08, 2011 at 09:10:02AM +0200, Benjamin Herrenschmidt wrote: > On Sat, 2011-10-08 at 11:09 +0800, Ram Pai wrote: > > BTW: Nish and myself explored the code some more. We find that the > > power > > architecture specific code does have the logic to check for resource > > conflicts. > > However it is missing the same logic for IOV BARs. > > > > arch/powerpc/kernel/pci-common.c:pcibios_allocate_resources() > > - for (idx = 0; idx <= PCI_ROM_RESOURCE; idx++) { > > + for (idx = 0; idx <= PCI_IOV_RESOURCE_END; idx++) { > > > > > > This along with some other small code changes will probably fix the > > problem at > > hand. Nish is verifying the changes. > > Beware that PCI_IOV_RESOURCE_END isn't defined without CONFIG_PCI_IOV. > > Note that the fact that we "miss" IOV here is yet another case of people > changing generic code & x86 together without bothering about other > architectures :-( > > In any case, the result here will simply be that the IOV resource will > be 0 based and will have IORESOURCE_UNSET set, which doesn't change the > problem afaik. It's cleaner to do that, but that all happens after > sriov_init() which has already done the damage. sriov_init() gets called first. It gathers the resource size and resource flags information. pci_bios_allocate_resource() uses that information to allocate resource and in case of power, verify the allocations done by the firmware. Since the code in pcibios_allocate_resources() does not validate the IOV BARs, the res->flags of those IOV BARS continue to stay enabled, which later leads to conflicts. The key here is to reset res->flags for any BAR when conflict is detected. > > We still also need to make sure pci_enable_device() doesn't rely on that > resource being properly allocated. hmm.. pci_enable_device()'s job is to enable all the resources used by the device. SRIOV BARs; after all, is a BAR belonging to the PF, not the VF. agree? > > Finally, keep in mind that the way we read that resource to begin with > is buggy (yet another piece of code written with only x86 in mind and > happily stuck into a generic place): > > We do a __pci_read_base() from the IOV code. This is broken on quite a > few non-x86 archs such as powerpc: > > For us, MMIO addresses from the CPU space aren't mapped 1:1 to > corresponding PCI addresses. They need to be translated. This is done at > probe time from a header quirk for the "normal" BARs. The generic code > doesn't do it for IOV. It should at the very least use > pcibios_bus_to_resource() to convert it. ok. will explore this. > > So even if the firmware -did- setup the IOV resource properly, the code > would get it wrong. > > Cheers, > Ben. -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html