On Fri, Oct 21, 2016 at 11:50:34AM -0500, Bjorn Helgaas wrote: >Hi Gavin, > >On Fri, Sep 30, 2016 at 09:47:50AM +1000, Gavin Shan wrote: >> pci_update_resource() might be called to update (shift) IOV BARs >> in PPC PowerNV specific pcibios_sriov_enable() when enabling PF's >> SRIOV capability. At that point, the PF have been functional if >> the SRIOV is enabled through sysfs entry "sriov_numvfs". The PF's >> memory decoding (0x2 in PCI_COMMAND) shouldn't be disabled when >> updating its IOV BARs with pci_update_resource(). Otherwise, we >> receives EEH error caused by MMIO access to PF's memory BARs during >> the window when PF's memory decoding is disabled. > >The fact that you get EEH errors is irrelevant. We can't write code >simply to avoid errors -- we have to write code to make the system >work correctly. > >I do not want to add a "mmio_force_on" parameter to >pci_update_resource(). That puts the burden on the caller to >understand this subtle issue. If the caller passes mmio_force_on=1 >when it shouldn't, things will almost always work, but once in a blue >moon a half-updated BAR will conflict with some other device in the >system, and we'll have an unreproducible, undebuggable crash. > Bjorn, thanks for your comments. Yes, the EEH error was caused by MMIO access to PF's normal BARs, not VF BARs. Yeah, I also had the conern that adding parameter to pci_update_resource() increases the visible complexity to the caller of the function. >What you do need is an explanation of why it's safe to non-atomically >update a VF BARx in the SR-IOV capability. I think this probably >involves the VF MSE bit, and you probably have to either disable VFs >completely or clear the MSE bit while you're updating the VF BARx. We >should be able to do this inside pci_update_resource() without >changing the interface. > Yes, It's what PATCH[1/2] does: (PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE) are set after VF BARs are updated with pci_update_resource() in this PPC specific scenario. There are other two situations where the IOV BARs are updated: PCI resource resizing and allocation during system booting or hot adding PF. The VF shouldn't be enabled in both cases when updating IOV BARs. I think you suggest to add check in pci_update_resource(): Don't disable PF's memory decoding when updating VF BARs. I will send updated revision if it's what you want. /* * The PF might be functional when updating its IOV BARs. So PF's * memory decoding shouldn't be disabled when updating its IOV BARs. */ disable = (res->flags & IORESOURCE_MEM_64) && !dev->mmio_always_on; #ifdef CONFIG_PCI_IOV disable &= !(resno >= PCI_IOV_RESOURCES && resno <= PCI_IOV_RESOURCE_END); #endif if (disable) { pci_read_config_word(dev, PCI_COMMAND, &cmd); pci_write_config_word(dev, PCI_COMMAND, cmd & ~PCI_COMMAND_MEMORY); } Thanks, Gavin >> sriov_numvfs_store >> pdev->driver->sriov_configure >> mlx5_core_sriov_configure >> pci_enable_sriov >> sriov_enable >> pcibios_sriov_enable >> pnv_pci_sriov_enable >> pnv_pci_vf_resource_shift >> pci_update_resource >> >> This doesn't change the PF's memory decoding in the path by introducing >> additional parameter (@mmio_force_on) to pci_update_resource() in >> the above path. > >> >> Reported-by: Carol Soto <clsoto@xxxxxxxxxx> >> Signed-off-by: Gavin Shan <gwshan@xxxxxxxxxxxxxxxxxx> >> Tested-by: Carol Soto <clsoto@xxxxxxxxxx> >> --- >> arch/powerpc/platforms/powernv/pci-ioda.c | 2 +- >> drivers/pci/iov.c | 2 +- >> drivers/pci/pci.c | 2 +- >> drivers/pci/setup-res.c | 9 +++++---- >> include/linux/pci.h | 2 +- >> 5 files changed, 9 insertions(+), 8 deletions(-) >> >> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c >> index 38a5c65..f4ccc5b 100644 >> --- a/arch/powerpc/platforms/powernv/pci-ioda.c >> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c >> @@ -1006,7 +1006,7 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset) >> dev_info(&dev->dev, "VF BAR%d: %pR shifted to %pR (%sabling %d VFs shifted by %d)\n", >> i, &res2, res, (offset > 0) ? "En" : "Dis", >> num_vfs, offset); >> - pci_update_resource(dev, i + PCI_IOV_RESOURCES); >> + pci_update_resource(dev, i + PCI_IOV_RESOURCES, true); >> } >> return 0; >> } >> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c >> index f1343f0..db31966 100644 >> --- a/drivers/pci/iov.c >> +++ b/drivers/pci/iov.c >> @@ -511,7 +511,7 @@ static void sriov_restore_state(struct pci_dev *dev) >> return; >> >> for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++) >> - pci_update_resource(dev, i); >> + pci_update_resource(dev, i, false); >> >> pci_write_config_dword(dev, iov->pos + PCI_SRIOV_SYS_PGSIZE, iov->pgsz); >> pci_iov_set_numvfs(dev, iov->num_VFs); >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index aab9d51..87a33c0 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -545,7 +545,7 @@ static void pci_restore_bars(struct pci_dev *dev) >> return; >> >> for (i = 0; i < PCI_BRIDGE_RESOURCES; i++) >> - pci_update_resource(dev, i); >> + pci_update_resource(dev, i, false); >> } >> >> static const struct pci_platform_pm_ops *pci_platform_pm; >> diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c >> index 66c4d8f..e8a50ff 100644 >> --- a/drivers/pci/setup-res.c >> +++ b/drivers/pci/setup-res.c >> @@ -26,7 +26,7 @@ >> #include "pci.h" >> >> >> -void pci_update_resource(struct pci_dev *dev, int resno) >> +void pci_update_resource(struct pci_dev *dev, int resno, bool mmio_force_on) >> { >> struct pci_bus_region region; >> bool disable; >> @@ -81,7 +81,8 @@ void pci_update_resource(struct pci_dev *dev, int resno) >> * disable decoding so that a half-updated BAR won't conflict >> * with another device. >> */ >> - disable = (res->flags & IORESOURCE_MEM_64) && !dev->mmio_always_on; >> + disable = (res->flags & IORESOURCE_MEM_64) && >> + !mmio_force_on && !dev->mmio_always_on; >> if (disable) { >> pci_read_config_word(dev, PCI_COMMAND, &cmd); >> pci_write_config_word(dev, PCI_COMMAND, >> @@ -310,7 +311,7 @@ int pci_assign_resource(struct pci_dev *dev, int resno) >> res->flags &= ~IORESOURCE_STARTALIGN; >> dev_info(&dev->dev, "BAR %d: assigned %pR\n", resno, res); >> if (resno < PCI_BRIDGE_RESOURCES) >> - pci_update_resource(dev, resno); >> + pci_update_resource(dev, resno, false); >> >> return 0; >> } >> @@ -350,7 +351,7 @@ int pci_reassign_resource(struct pci_dev *dev, int resno, resource_size_t addsiz >> dev_info(&dev->dev, "BAR %d: reassigned %pR (expanded by %#llx)\n", >> resno, res, (unsigned long long) addsize); >> if (resno < PCI_BRIDGE_RESOURCES) >> - pci_update_resource(dev, resno); >> + pci_update_resource(dev, resno, false); >> >> return 0; >> } >> diff --git a/include/linux/pci.h b/include/linux/pci.h >> index 0ab8359..99231d1 100644 >> --- a/include/linux/pci.h >> +++ b/include/linux/pci.h >> @@ -1039,7 +1039,7 @@ int pci_try_reset_bus(struct pci_bus *bus); >> void pci_reset_secondary_bus(struct pci_dev *dev); >> void pcibios_reset_secondary_bus(struct pci_dev *dev); >> void pci_reset_bridge_secondary_bus(struct pci_dev *dev); >> -void pci_update_resource(struct pci_dev *dev, int resno); >> +void pci_update_resource(struct pci_dev *dev, int resno, bool mmio_force_on); >> int __must_check pci_assign_resource(struct pci_dev *dev, int i); >> int __must_check pci_reassign_resource(struct pci_dev *dev, int i, resource_size_t add_size, resource_size_t align); >> int pci_select_bars(struct pci_dev *dev, unsigned long flags); >> -- >> 2.1.0 >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-pci" in >> the body of a message to majordomo@xxxxxxxxxxxxxxx >> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html