On Wed, Mar 04, 2015 at 11:01:24AM +0800, Wei Yang wrote: > On Tue, Feb 24, 2015 at 03:00:37AM -0600, Bjorn Helgaas wrote: > >On Tue, Feb 24, 2015 at 02:34:57AM -0600, Bjorn Helgaas wrote: > >> From: Wei Yang <weiyang@xxxxxxxxxxxxxxxxxx> > >> > >> On PowerNV platform, resource position in M64 implies the PE# the resource > >> belongs to. In some cases, adjustment of a resource is necessary to locate > >> it to a correct position in M64. > >> > >> Add pnv_pci_vf_resource_shift() to shift the 'real' PF IOV BAR address > >> according to an offset. > >> > >> [bhelgaas: rework loops, rework overlap check, index resource[] > >> conventionally, remove pci_regs.h include, squashed with next patch] > >> Signed-off-by: Wei Yang <weiyang@xxxxxxxxxxxxxxxxxx> > >> Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > > > >... > > > >> +#ifdef CONFIG_PCI_IOV > >> +static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset) > >> +{ > >> + struct pci_dn *pdn = pci_get_pdn(dev); > >> + int i; > >> + struct resource *res, res2; > >> + resource_size_t size; > >> + u16 vf_num; > >> + > >> + if (!dev->is_physfn) > >> + return -EINVAL; > >> + > >> + /* > >> + * "offset" is in VFs. The M64 windows are sized so that when they > >> + * are segmented, each segment is the same size as the IOV BAR. > >> + * Each segment is in a separate PE, and the high order bits of the > >> + * address are the PE number. Therefore, each VF's BAR is in a > >> + * separate PE, and changing the IOV BAR start address changes the > >> + * range of PEs the VFs are in. > >> + */ > >> + vf_num = pdn->vf_pes; > >> + for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) { > >> + res = &dev->resource[i + PCI_IOV_RESOURCES]; > >> + if (!res->flags || !res->parent) > >> + continue; > >> + > >> + if (!pnv_pci_is_mem_pref_64(res->flags)) > >> + continue; > >> + > >> + /* > >> + * The actual IOV BAR range is determined by the start address > >> + * and the actual size for vf_num VFs BAR. This check is to > >> + * make sure that after shifting, the range will not overlap > >> + * with another device. > >> + */ > >> + size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES); > >> + res2.flags = res->flags; > >> + res2.start = res->start + (size * offset); > >> + res2.end = res2.start + (size * vf_num) - 1; > >> + > >> + if (res2.end > res->end) { > >> + dev_err(&dev->dev, "VF BAR%d: %pR would extend past %pR (trying to enable %d VFs shifted by %d)\n", > >> + i, &res2, res, vf_num, offset); > >> + return -EBUSY; > >> + } > >> + } > >> + > >> + for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) { > >> + res = &dev->resource[i + PCI_IOV_RESOURCES]; > >> + if (!res->flags || !res->parent) > >> + continue; > >> + > >> + if (!pnv_pci_is_mem_pref_64(res->flags)) > >> + continue; > >> + > >> + size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES); > >> + res2 = *res; > >> + res->start += size * offset; > > > >I'm still not happy about this fiddling with res->start. > > > >Increasing res->start means that in principle, the "size * offset" bytes > >that we just removed from res are now available for allocation to somebody > >else. I don't think we *will* give that space to anything else because of > >the alignment restrictions you're enforcing, but "res" now doesn't > >correctly describe the real resource map. > > > >Would you be able to just update the BAR here while leaving the struct > >resource alone? In that case, it would look a little funny that lspci > >would show a BAR value in the middle of the region in /proc/iomem, but > >the /proc/iomem region would be more correct. > > Bjorn, > > I did some tests, while the result is not good. > > What I did is still write the shifted resource address to the device by > pci_update_resource(), but I revert the res->start to the original one. If > this step is not correct, please let me know. > > This can't work since after we revert the res->start, those VFs will be given > resources from res->start instead of (res->start + offset * size). This is not > what we expect. Hmm, yes, I suppose we'd have to have a hook in pci_bus_alloc_from_region() or something. That's getting a little messy. I still don't like messing with the resource after it's in the resource tree, but I don't have a better idea right now. So let's just go with what you have. > >> + > >> + dev_info(&dev->dev, "VF BAR%d: %pR shifted to %pR (enabling %d VFs shifted by %d)\n", > >> + i, &res2, res, vf_num, offset); > >> + pci_update_resource(dev, i + PCI_IOV_RESOURCES); > >> + } > >> + pdn->max_vfs -= offset; > >> + return 0; > >> +} > >> +#endif /* CONFIG_PCI_IOV */ > > -- > Richard Yang > Help you, Help me > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html