On Tue, Jun 29, 2021 at 04:55:18PM +0800, Huacai Chen wrote: > Use separate remove()/shutdown() callback, and don't disable PCI device > during shutdown. This can avoid some poweroff/reboot failures. To clarify, I think you mean "leave bus mastering enabled for PCIe ports during poweroff/reboot." Maybe even better would be "leave DMA/MSI/MSI-X enabled for the PCIe port and downstream devices." "Disable PCI device" is a little ambiguous because pci_enable_device() enables memory and I/O decoding for the device BARs, but pci_disable_device() turns off bus mastering and has nothing to do with the BARs. They are unfortunately not symmetric. > The poweroff/reboot failures could easily be reproduced on Loongson > platforms. I think this is not a Loongson-specific problem, instead, is > a problem related to some specific PCI hosts. On some x86 platforms, > radeon/amdgpu devices can cause the same problem [1][2], and commit > faefba95c9e8ca3a ("drm/amdgpu: just suspend the hw on pci shutdown") > can resolve it. This faefba95c9e8ca3a information isn't really connected here because we don't know exactly what the problem is, and the faefba95c9e8ca3a commit log doesn't say anything about DMA/MSI/MSI-X or bus mastering. > As Tiezhu said, this occasionally shutdown or reboot failure is due to > clear PCI_COMMAND_MASTER on the device in do_pci_disable_device() [3]. > > static void do_pci_disable_device(struct pci_dev *dev) > { > u16 pci_command; > > pci_read_config_word(dev, PCI_COMMAND, &pci_command); > if (pci_command & PCI_COMMAND_MASTER) { > pci_command &= ~PCI_COMMAND_MASTER; > pci_write_config_word(dev, PCI_COMMAND, pci_command); > } > > pcibios_disable_device(dev); > } > > When remove "pci_command &= ~PCI_COMMAND_MASTER;", it can work well when > shutdown or reboot. The root cause on Loongson platform is that CPU is > still accessing PCIe devices while poweroff/reboot, and if we disable > the Bus Master Bit at this time, the PCIe controller doesn't forward > requests to downstream devices, and also doesn't send TIMEOUT to CPU, > which causes CPU wait forever (hardware deadlock). Sorry, I don't think this is a root cause. Let me try to dissect this because I don't follow this argument yet. Per PCIe r5.0, sec 7.5.1.1.3: Bus Master Enable - Controls the ability of a Function to issue Memory and I/O Read/Write Requests, and the ability of a Port to forward Memory and I/O Read/Write Requests in the Upstream direction. In both endpoints and bridges, Bus Master Enable controls DMA, MSI, and MSI-X initiated by a PCI *endpoint*. It has nothing to do with CPU accesses to the device, so I don't understand the claim that on Loongson, the CPU is still accessing PCIe devices during poweroff/reboot. You seem to say the CPU hangs because Bus Mastering is disabled. I don't see how that can happen because Bus Mastering isn't involved at all in CPU MMIO transactions. Or maybe this is more LS7A breakage, e.g., the Root Ports don't forward Completions for CPU MMIO reads when Bus Mastering is disabled? Obviously that would be completely broken per spec, which says "This bit does not affect forwarding of Completions in either the Upstream or Downstream direction." In the poweroff/reboot path, we call pci_device_shutdown(), which calls the driver's .shutdown() method if it has one. Most drivers do *not* have one, so they have no idea that poweroff/reboot is happening and have no reason to stop DMA. The portdrv driver *does* have a .shutdown() method, which currently disables Bus Mastering, which means DMA/MSI/MSI-X from downstream devices no longer works, even if their drivers haven't done anything. The point of *this* patch is to leave Bus Mastering *enabled* during poweroff/reboot. That raises the question of WHY poweroff/reboot fails when DMA/MSI/MSI-X are disabled. That's the justification I'd like to see in the commit log. Not just "this makes things work," but "here is the problem, and this is how this patch solves it." > This behavior is a PCIe protocol violation, and will be fixed in new > revisions of hardware (add timeout mechanism for CPU read request, > whether or not Bus Master bit is cleared). There's some confusion here. A CPU read request is an MMIO transaction that should not be affected by Bus Master Enable. > Radeon driver is more difficult than amdgpu due to its confusing symbol > names, and I have maintained an out-of-tree patch for a long time [4]. I poked around a little but couldn't find any posting of this radeon patch. I assume you're pushing it upstream somewhere? Is there resistance? Is there any conversation there that linux-pci should be involved in? Regardless, this doesn't seem relevant for this commit log. > Recently, we found more and more devices can cause the same problem, and > it is very difficult to modify all problematic drivers as radeon/amdgpu > does (the .shutdown callback should make sure there is no DMA activity). Can you include some examples of these other drivers that cause problems? My guess is that these drivers lack .shutdown() methods, so they don't know a reboot or poweroff is in progress? > So, I think modify the PCIe port driver is a simple and effective way. > Because there is no poweroff/reboot problems before cc27b735ad3a75574a6a > ("PCI/portdrv: Turn off PCIe services during shutdown"). And as early > discussed, kexec can still work fine after this patch [5]. > > [1] https://bugs.freedesktop.org/show_bug.cgi?id=97980 > [2] https://bugs.freedesktop.org/show_bug.cgi?id=98638 > [3] https://lore.kernel.org/patchwork/patch/1305067/ > [4] https://github.com/chenhuacai/linux/commit/8da06f9b669831829416a3e9f4d1c57f217a42f0 > [5] http://patchwork.ozlabs.org/project/linux-pci/patch/1600680138-10949-1-git-send-email-chenhc@xxxxxxxxxx/ > > Cc: Sinan Kaya <okaya@xxxxxxxxxx> > Signed-off-by: Huacai Chen <chenhuacai@xxxxxxxxxxx> > Signed-off-by: Tiezhu Yang <yangtiezhu@xxxxxxxxxxx> > --- > drivers/pci/pcie/portdrv.h | 2 +- > drivers/pci/pcie/portdrv_core.c | 6 ++++-- > drivers/pci/pcie/portdrv_pci.c | 15 +++++++++++++-- > 3 files changed, 18 insertions(+), 5 deletions(-) > > diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h > index 2ff5724b8f13..358d7281f6e8 100644 > --- a/drivers/pci/pcie/portdrv.h > +++ b/drivers/pci/pcie/portdrv.h > @@ -117,7 +117,7 @@ int pcie_port_device_resume(struct device *dev); > int pcie_port_device_runtime_suspend(struct device *dev); > int pcie_port_device_runtime_resume(struct device *dev); > #endif > -void pcie_port_device_remove(struct pci_dev *dev); > +void pcie_port_device_remove(struct pci_dev *dev, bool disable); > int __must_check pcie_port_bus_register(void); > void pcie_port_bus_unregister(void); > > diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c > index e1fed6649c41..98c0a99a41d6 100644 > --- a/drivers/pci/pcie/portdrv_core.c > +++ b/drivers/pci/pcie/portdrv_core.c > @@ -484,11 +484,13 @@ EXPORT_SYMBOL_GPL(pcie_port_find_device); > * Remove PCI Express port service devices associated with given port and > * disable MSI-X or MSI for the port. > */ > -void pcie_port_device_remove(struct pci_dev *dev) > +void pcie_port_device_remove(struct pci_dev *dev, bool disable) > { > device_for_each_child(&dev->dev, NULL, remove_iter); > pci_free_irq_vectors(dev); > - pci_disable_device(dev); > + > + if (disable) > + pci_disable_device(dev); > } > > /** > diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c > index c7ff1eea225a..562fbf3c1ea9 100644 > --- a/drivers/pci/pcie/portdrv_pci.c > +++ b/drivers/pci/pcie/portdrv_pci.c > @@ -147,7 +147,18 @@ static void pcie_portdrv_remove(struct pci_dev *dev) > pm_runtime_dont_use_autosuspend(&dev->dev); > } > > - pcie_port_device_remove(dev); > + pcie_port_device_remove(dev, true); > +} > + > +static void pcie_portdrv_shutdown(struct pci_dev *dev) > +{ > + if (pci_bridge_d3_possible(dev)) { > + pm_runtime_forbid(&dev->dev); > + pm_runtime_get_noresume(&dev->dev); > + pm_runtime_dont_use_autosuspend(&dev->dev); > + } > + > + pcie_port_device_remove(dev, false); > } > > static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev, > @@ -219,7 +230,7 @@ static struct pci_driver pcie_portdriver = { > > .probe = pcie_portdrv_probe, > .remove = pcie_portdrv_remove, > - .shutdown = pcie_portdrv_remove, > + .shutdown = pcie_portdrv_shutdown, > > .err_handler = &pcie_portdrv_err_handler, > > -- > 2.27.0 >