On Wed, May 29, 2019 at 04:04:27PM -0700, Raj, Ashok wrote:
> On Wed, May 29, 2019 at 05:57:14PM -0500, Bjorn Helgaas wrote:
> > On Mon, May 06, 2019 at 10:20:03AM -0700, sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx wrote:
> > > From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
> > >
> > > When IOMMU tries to enable PRI for VF device in
> > > iommu_enable_dev_iotlb(), it always fails because PRI support for PCIe
> > > VF device is currently broken in PCIE driver. Current implementation
> > > expects the given PCIe device (PF & VF) to implement PRI capability
> > > before enabling the PRI support. But this assumption is incorrect. As
> > > per PCIe spec r4.0, sec 9.3.7.11, all VFs associated with PF can only
> > > use the Page Request Interface (PRI) of the PF and not implement it.
> > > Hence we need to create exception for handling the PRI support for PCIe
> > > VF device.
> > >
> > > Since PRI is shared between PF/VF devices, following rules should apply.
> > >
> > > 1. Enable PRI in VF only if its already enabled in PF.
> > > 2. When enabling/disabling PRI for VF, instead of configuring the
> > >    registers just increase/decrease the usage count (pri_ref_cnt) of PF.
> > > 3. Disable PRI in PF only if pr_ref_cnt is zero.
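For reference, the three rules quoted above can be modelled in user space.
This is only a sketch under stated assumptions, not the kernel code:
struct model_dev, model_enable_pri() and model_disable_pri() are
hypothetical names, C11 atomics stand in for the kernel's atomic_t, and the
PF path skips the config-space writes entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; only what the rules need. */
struct model_dev {
	struct model_dev *physfn;	/* NULL for a PF, the parent PF for a VF */
	bool pri_enabled;
	atomic_int pri_ref_cnt;		/* only meaningful on the PF */
};

/* Rules 1 and 2: a VF enables PRI by taking a reference on an enabled PF. */
int model_enable_pri(struct model_dev *dev)
{
	if (dev->pri_enabled)
		return -EBUSY;

	if (dev->physfn) {
		struct model_dev *pf = dev->physfn;

		if (!pf->pri_enabled)
			return -EINVAL;

		atomic_fetch_add(&pf->pri_ref_cnt, 1);
		dev->pri_enabled = true;
		return 0;
	}

	/* PF: a real implementation would program the PRI capability here. */
	dev->pri_enabled = true;
	return 0;
}

/* Rules 2 and 3: a VF drops its reference; the PF refuses while VFs hold one. */
int model_disable_pri(struct model_dev *dev)
{
	if (!dev->pri_enabled)
		return -EINVAL;

	if (dev->physfn) {
		atomic_fetch_sub(&dev->physfn->pri_ref_cnt, 1);
		dev->pri_enabled = false;
		return 0;
	}

	if (atomic_load(&dev->pri_ref_cnt) != 0)
		return -EBUSY;

	dev->pri_enabled = false;
	return 0;
}
```

In this model a VF enable fails with -EINVAL until the PF has been enabled,
and a PF disable fails with -EBUSY while any VF still holds a reference.
Note that the pf->pri_enabled check and the increment are not one atomic
step, so the sketch does nothing to serialize a VF enable against a
concurrent PF disable.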
> >
> > s/pr_ref_cnt/pri_ref_cnt/
> >
> > > Cc: Ashok Raj <ashok.raj@xxxxxxxxx>
> > > Cc: Keith Busch <keith.busch@xxxxxxxxx>
> > > Suggested-by: Ashok Raj <ashok.raj@xxxxxxxxx>
> > > Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
> > > ---
> > >  drivers/pci/ats.c   | 53 +++++++++++++++++++++++++++++++++++++++++++--
> > >  include/linux/pci.h |  1 +
> > >  2 files changed, 52 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
> > > index 97c08146534a..5582e5d83a3f 100644
> > > --- a/drivers/pci/ats.c
> > > +++ b/drivers/pci/ats.c
> > > @@ -181,12 +181,39 @@ int pci_enable_pri(struct pci_dev *pdev, u32 reqs)
> > >  	u16 control, status;
> > >  	u32 max_requests;
> > >  	int pos;
> > > +	struct pci_dev *pf;
> > >
> > >  	if (WARN_ON(pdev->pri_enabled))
> > >  		return -EBUSY;
> > >
> > >  	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI);
> > > -	if (!pos)
> > > +
> > > +	if (pdev->is_virtfn) {
> > > +		/*
> > > +		 * Per PCIe r4.0, sec 9.3.7.11, VF must not implement PRI
> > > +		 * Capability.
> > > +		 */
> > > +		if (pos) {
> > > +			dev_err(&pdev->dev, "VF must not implement PRI");
> > > +			return -EINVAL;
> > > +		}
> >
> > This seems gratuitous.  It finds implementation errors, but since we
> > correctly use the PF here anyway, it doesn't *need* to prevent PRI on
> > the VF from working.
> >
> > I think you should just have:
> >
> >   if (pdev->is_virtfn) {
> >     pf = pci_physfn(pdev);
> >     if (!pf->pri_enabled)
> >       return -EINVAL;
>
> This would be incorrect. Since if we never did any bind_mm to the PF
> PRI would not have been enabled. Currently this is done in the IOMMU
> driver, and not in the device driver.

This is functionally the same as the original patch, only omitting
the "VF must not implement PRI" check.

> I suppose we should enable PF capability if its not enabled. Same
> comment would be applicable for PASID as well.

Operating on a device other than the one the driver owns opens the
issue of mutual exclusion and races, so would require careful
scrutiny.  Are PRI/PASID things that could be *always* enabled for the
PF at enumeration-time, or do we have to wait until a driver claims
the VF?  If the latter, are there coordination issues between drivers
of different VFs?

> >     pdev->pri_enabled = 1;
> >     atomic_inc(&pf->pri_ref_cnt);
> >   }
> >
> >   pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI);
> >   if (!pos)
> >     return -EINVAL;
> >
> > > +		pf = pci_physfn(pdev);
> > > +
> > > +		/* If VF config does not match with PF, return error */
> > > +		if (!pf->pri_enabled)
> > > +			return -EINVAL;
> > > +
> > > +		pdev->pri_reqs_alloc = pf->pri_reqs_alloc;
> >
> > Is there any point in setting vf->pri_reqs_alloc?  I don't think it's
> > used for anything since pri_reqs_alloc is only used to write the PF
> > capability, and we only do that for the PF.
> >
> > > +		pdev->pri_enabled = 1;
> > > +
> > > +		/* Increment PF PRI refcount */
> >
> > Superfluous comment, since that's exactly what the code says.  It's
> > always good when the code is so clear that it doesn't require comments :)
> >
> > > +		atomic_inc(&pf->pri_ref_cnt);
> > > +
> > > +		return 0;
> > > +	}
> > > +
> > > +	if (pdev->is_physfn && !pos)
> > >  		return -EINVAL;
> > >
> > >  	pci_read_config_word(pdev, pos + PCI_PRI_STATUS, &status);
> > > @@ -202,7 +229,6 @@ int pci_enable_pri(struct pci_dev *pdev, u32 reqs)
> > >  	pci_write_config_word(pdev, pos + PCI_PRI_CTRL, control);
> > >
> > >  	pdev->pri_enabled = 1;
> > > -
> > >  	return 0;