On Fri, Oct 30, 2015 at 08:40:52AM -0700, Alexander Duyck wrote: >On 10/29/2015 08:48 PM, Wei Yang wrote: >>On Thu, Oct 29, 2015 at 05:22:54PM -0500, Bjorn Helgaas wrote: >>>ines: 115 >>> >>>From: Alexander Duyck <aduyck@xxxxxxxxxxxx> >>> >>>The enumeration path should leave NumVFs set to zero. But after >>>4449f079722c ("PCI: Calculate maximum number of buses required for VFs"), >>>we call virtfn_max_buses() in the enumeration path, which changes NumVFs. >>>This NumVFs change is visible via lspci and sysfs until a driver enables >>>SR-IOV. >>> >>>Iterate from TotalVFs down to zero so NumVFs is zero when we're finished >>>computing the maximum number of buses. Validate offset and stride in >>>the loop, so we can test it at every possible NumVFs setting. Rename >>>virtfn_max_buses() to compute_max_vf_buses() to hint that it does have a >>>side effect of updating iov->max_VF_buses. >>> >>>[bhelgaas: changelog, rename, allow numVF==1 && stride==0, rework loop, >>>reverse sense of error path] >>>Fixes: 4449f079722c ("PCI: Calculate maximum number of buses required for VFs") >>>Based-on-patch-by: Ethan Zhao <ethan.zhao@xxxxxxxxxx> >>>Signed-off-by: Alexander Duyck <aduyck@xxxxxxxxxxxx> >>>Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> >>>--- >>>drivers/pci/iov.c | 41 ++++++++++++++++++++++------------------- >>>1 file changed, 22 insertions(+), 19 deletions(-) >>> >>>diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c >>>index 0cdb2d1..1b1acc2 100644 >>>--- a/drivers/pci/iov.c >>>+++ b/drivers/pci/iov.c >>>@@ -54,24 +54,29 @@ static inline void pci_iov_set_numvfs(struct pci_dev *dev, int nr_virtfn) >>> * The PF consumes one bus number. NumVFs, First VF Offset, and VF Stride >>> * determine how many additional bus numbers will be consumed by VFs. >>> * >>>- * Iterate over all valid NumVFs and calculate the maximum number of bus >>>- * numbers that could ever be required. >>>+ * Iterate over all valid NumVFs, validate offset and stride, and calculate >>>+ * the maximum number of bus numbers that could ever be required. >>> */ >>>-static inline u8 virtfn_max_buses(struct pci_dev *dev) >>>+static int compute_max_vf_buses(struct pci_dev *dev) >>>{ >>> struct pci_sriov *iov = dev->sriov; >>>- int nr_virtfn; >>>- u8 max = 0; >>>- int busnr; >>>+ int nr_virtfn, busnr, rc = 0; >>> >>>- for (nr_virtfn = 1; nr_virtfn <= iov->total_VFs; nr_virtfn++) { >>>+ for (nr_virtfn = iov->total_VFs; nr_virtfn; nr_virtfn--) { >>Hmm... I agree to iterate the nr_virtfn from total_VFs to 0, which keep the >>original logic, before my patch. >> >> >>> pci_iov_set_numvfs(dev, nr_virtfn); >>>+ if (!iov->offset || (nr_virtfn > 1 && !iov->stride)) { >> ^^^ >> >>Should be this? >> if (!iov->offset || (iov->total_VFs > 1 && !iov->stride)) > >The problem is the spec says stride and offset can change depending on the >value of NumVFs. We end up testing all values from TotalVFs to 1. The spec >states that stride is unused if NumVFs is 1, not TotalVFs. > Yes, I checked the SPEC again, and found this change is correct. >> >>>+ rc = -EIO; >>>+ goto out; >>>+ } >>>+ >>> busnr = pci_iov_virtfn_bus(dev, nr_virtfn - 1); >>>- if (busnr > max) >>>- max = busnr; >>>+ if (busnr > iov->max_VF_buses) >>>+ iov->max_VF_buses = busnr; >>> } >>> >>>- return max; >>>+out: >>>+ pci_iov_set_numvfs(dev, 0); >>>+ return rc; >>>} >>> >>>static struct pci_bus *virtfn_add_bus(struct pci_bus *bus, int busnr) >>>@@ -384,7 +389,7 @@ static int sriov_init(struct pci_dev *dev, int pos) >>> int rc; >>> int nres; >>> u32 pgsz; >>>- u16 ctrl, total, offset, stride; >>>+ u16 ctrl, total; >>> struct pci_sriov *iov; >>> struct resource *res; >>> struct pci_dev *pdev; >>>@@ -410,11 +415,6 @@ static int sriov_init(struct pci_dev *dev, int pos) >>> >>>found: >>> pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, ctrl); >>>- pci_write_config_word(dev, pos + PCI_SRIOV_NUM_VF, 0); >>>- pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &offset); >>>- pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &stride); >>>- if (!offset || (total > 1 && !stride)) >>>- return -EIO; >>> >>Original code will return when it found this condition, so that we don't need >>to bother to do those initialization and then fall back. >> >>So I suggest to keep it here. > >The problem is this code isn't valid as per the spec. The offset and stride >are considered unused when numvfs is 0. That is why this has to be dropped. > >>> pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &total); >>> if (!total) >>>@@ -456,8 +456,6 @@ found: >>> iov->nres = nres; >>> iov->ctrl = ctrl; >>> iov->total_VFs = total; >>>- iov->offset = offset; >>>- iov->stride = stride; >>> iov->pgsz = pgsz; >>> iov->self = dev; >>> pci_read_config_dword(dev, pos + PCI_SRIOV_CAP, &iov->cap); >>>@@ -474,10 +472,15 @@ found: >>> >>> dev->sriov = iov; >>> dev->is_physfn = 1; >>>- iov->max_VF_buses = virtfn_max_buses(dev); >>>+ rc = compute_max_vf_buses(dev); >>>+ if (rc) >>>+ goto fail_max_buses; >>> >>> return 0; >>> >>>+fail_max_buses: >>>+ dev->sriov = NULL; >>The dev->sriov is allocated with kzalloc(), seems we forget to release it? > >No, if you check at the end of the function we release it via a kfree(iov). > Right. >>>+ dev->is_physfn = 0; >>>failed: >>> for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) { >>> res = &dev->resource[i + PCI_IOV_R -- Richard Yang Help you, Help me -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html