On Tue, Mar 10, 2015 at 09:55:19PM -0500, Bjorn Helgaas wrote: >On Wed, Mar 04, 2015 at 11:01:24AM +0800, Wei Yang wrote: >> On Tue, Feb 24, 2015 at 03:00:37AM -0600, Bjorn Helgaas wrote: >> >On Tue, Feb 24, 2015 at 02:34:57AM -0600, Bjorn Helgaas wrote: >> >> From: Wei Yang <weiyang@xxxxxxxxxxxxxxxxxx> >> >> >> >> On PowerNV platform, resource position in M64 implies the PE# the resource >> >> belongs to. In some cases, adjustment of a resource is necessary to locate >> >> it to a correct position in M64. >> >> >> >> Add pnv_pci_vf_resource_shift() to shift the 'real' PF IOV BAR address >> >> according to an offset. >> >> >> >> [bhelgaas: rework loops, rework overlap check, index resource[] >> >> conventionally, remove pci_regs.h include, squashed with next patch] >> >> Signed-off-by: Wei Yang <weiyang@xxxxxxxxxxxxxxxxxx> >> >> Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> >> > >> >... >> > >> >> +#ifdef CONFIG_PCI_IOV >> >> +static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset) >> >> +{ >> >> + struct pci_dn *pdn = pci_get_pdn(dev); >> >> + int i; >> >> + struct resource *res, res2; >> >> + resource_size_t size; >> >> + u16 vf_num; >> >> + >> >> + if (!dev->is_physfn) >> >> + return -EINVAL; >> >> + >> >> + /* >> >> + * "offset" is in VFs. The M64 windows are sized so that when they >> >> + * are segmented, each segment is the same size as the IOV BAR. >> >> + * Each segment is in a separate PE, and the high order bits of the >> >> + * address are the PE number. Therefore, each VF's BAR is in a >> >> + * separate PE, and changing the IOV BAR start address changes the >> >> + * range of PEs the VFs are in. >> >> + */ >> >> + vf_num = pdn->vf_pes; >> >> + for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) { >> >> + res = &dev->resource[i + PCI_IOV_RESOURCES]; >> >> + if (!res->flags || !res->parent) >> >> + continue; >> >> + >> >> + if (!pnv_pci_is_mem_pref_64(res->flags)) >> >> + continue; >> >> + >> >> + /* >> >> + * The actual IOV BAR range is determined by the start address >> >> + * and the actual size for vf_num VFs BAR. This check is to >> >> + * make sure that after shifting, the range will not overlap >> >> + * with another device. >> >> + */ >> >> + size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES); >> >> + res2.flags = res->flags; >> >> + res2.start = res->start + (size * offset); >> >> + res2.end = res2.start + (size * vf_num) - 1; >> >> + >> >> + if (res2.end > res->end) { >> >> + dev_err(&dev->dev, "VF BAR%d: %pR would extend past %pR (trying to enable %d VFs shifted by %d)\n", >> >> + i, &res2, res, vf_num, offset); >> >> + return -EBUSY; >> >> + } >> >> + } >> >> + >> >> + for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) { >> >> + res = &dev->resource[i + PCI_IOV_RESOURCES]; >> >> + if (!res->flags || !res->parent) >> >> + continue; >> >> + >> >> + if (!pnv_pci_is_mem_pref_64(res->flags)) >> >> + continue; >> >> + >> >> + size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES); >> >> + res2 = *res; >> >> + res->start += size * offset; >> > >> >I'm still not happy about this fiddling with res->start. >> > >> >Increasing res->start means that in principle, the "size * offset" bytes >> >that we just removed from res are now available for allocation to somebody >> >else. I don't think we *will* give that space to anything else because of >> >the alignment restrictions you're enforcing, but "res" now doesn't >> >correctly describe the real resource map. >> > >> >Would you be able to just update the BAR here while leaving the struct >> >resource alone? In that case, it would look a little funny that lspci >> >would show a BAR value in the middle of the region in /proc/iomem, but >> >the /proc/iomem region would be more correct. >> >> Bjorn, >> >> I did some tests, while the result is not good. >> >> What I did is still write the shifted resource address to the device by >> pci_update_resource(), but I revert the res->start to the original one. If >> this step is not correct, please let me know. >> >> This can't work since after we revert the res->start, those VFs will be given >> resources from res->start instead of (res->start + offset * size). This is not >> what we expect. > >Hmm, yes, I suppose we'd have to have a hook in pci_bus_alloc_from_region() >or something. That's getting a little messy. I still don't like messing >with the resource after it's in the resource tree, but I don't have a >better idea right now. So let's just go with what you have. > Thanks :-) I would state this in the change log and add a comment in the code to note this down. Hope this would be a little helpful. >> >> + >> >> + dev_info(&dev->dev, "VF BAR%d: %pR shifted to %pR (enabling %d VFs shifted by %d)\n", >> >> + i, &res2, res, vf_num, offset); >> >> + pci_update_resource(dev, i + PCI_IOV_RESOURCES); >> >> + } >> >> + pdn->max_vfs -= offset; >> >> + return 0; >> >> +} >> >> +#endif /* CONFIG_PCI_IOV */ >> >> -- >> Richard Yang >> Help you, Help me >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-pci" in >> the body of a message to majordomo@xxxxxxxxxxxxxxx >> More majordomo info at http://vger.kernel.org/majordomo-info.html -- Richard Yang Help you, Help me -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html