Re: [PATCH] PCI: Reprogram bridge prefetch registers on resume

Hello Daniel,

On 07.09.18 at 07:36, Daniel Drake wrote:
On 38+ Intel-based Asus products, the nvidia GPU becomes unusable
after S3 suspend/resume. The affected products include multiple
generations of nvidia GPUs and Intel SoCs. After resume, nouveau logs
many errors such as:

     fifo: fault 00 [READ] at 0000005555555000 engine 00 [GR] client 04 [HUB/FE] reason 4a [] on channel -1 [007fa91000 unknown]
     DRM: failed to idle channel 0 [DRM]

Similarly, the nvidia proprietary driver also fails after resume
(black screen, 100% CPU usage in Xorg process). We shipped a sample
to Nvidia for diagnosis, and their response indicated that it's a
problem with the parent PCI bridge (on the Intel SoC), not the GPU.

Runtime suspend/resume works fine; only S3 suspend is affected.

We found a workaround: on resume, rewrite the Intel PCI bridge
'Prefetchable Base Upper 32 Bits' register (PCI_PREF_BASE_UPPER32). In
the cases that I checked, this register has value 0 and we just have to
rewrite that value.
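
For readers unfamiliar with this register, here is a minimal sketch (not
part of the patch) of how a bridge's prefetchable window base and limit are
assembled from the type 1 config header. The offsets are the ones defined in
include/uapi/linux/pci_regs.h. The helper names (rd16, rd32,
pref_window_base, pref_window_limit) are made up for illustration, and the
code works on a plain byte buffer rather than a real device.

```c
#include <stdbool.h>
#include <stdint.h>

/* Type 1 (PCI-to-PCI bridge) config header offsets. */
#define PCI_PREF_MEMORY_BASE     0x24
#define PCI_PREF_MEMORY_LIMIT    0x26
#define PCI_PREF_BASE_UPPER32    0x28
#define PCI_PREF_LIMIT_UPPER32   0x2c
#define PCI_PREF_RANGE_TYPE_MASK 0x0f
#define PCI_PREF_RANGE_TYPE_64   0x01

/* Hypothetical helpers: little-endian reads from a config space copy. */
static uint16_t rd16(const uint8_t *cfg, unsigned int off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static uint32_t rd32(const uint8_t *cfg, unsigned int off)
{
	return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* The low nibble of the base register says whether the window is 64-bit. */
static bool pref_window_is_64bit(const uint8_t *cfg)
{
	uint16_t lo = rd16(cfg, PCI_PREF_MEMORY_BASE);

	return (lo & PCI_PREF_RANGE_TYPE_MASK) == PCI_PREF_RANGE_TYPE_64;
}

/* Base: bits 31:20 come from the 16-bit register, bits 63:32 from UPPER32. */
static uint64_t pref_window_base(const uint8_t *cfg)
{
	uint64_t base = (uint64_t)(rd16(cfg, PCI_PREF_MEMORY_BASE) & 0xfff0) << 16;

	if (pref_window_is_64bit(cfg))
		base |= (uint64_t)rd32(cfg, PCI_PREF_BASE_UPPER32) << 32;
	return base;
}

/* Limit: same layout, with the low 20 bits implied to be all ones. */
static uint64_t pref_window_limit(const uint8_t *cfg)
{
	uint64_t limit = ((uint64_t)(rd16(cfg, PCI_PREF_MEMORY_LIMIT) & 0xfff0) << 16) |
			 0xfffff;

	if (pref_window_is_64bit(cfg))
		limit |= (uint64_t)rd32(cfg, PCI_PREF_LIMIT_UPPER32) << 32;
	return limit;
}
```

A PCI_PREF_BASE_UPPER32 value of 0, as seen on the affected machines, simply
means the window starts below 4 GiB.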

It's very strange that rewriting the exact same register value
makes a difference, but it definitely makes the issue go away.
It's not just acting as some kind of memory barrier, because rewriting
other bridge registers does not work around the issue. There's something
magic in this particular register. We have confirmed this on all
the affected models we have on hand (X542UQ, UX533FD, X530UN, V272UN).

Additionally, this workaround solves an issue where r8169 MSI-X
interrupts were broken after S3 suspend/resume on Asus X441UAR. This
issue was recently worked around in commit 7bb05b85bc2d ("r8169:
don't use MSI-X on RTL8106e"). It also fixes the same issue on
RTL8168evl/8111evl on an Aimfor-tech laptop that we had not yet
patched. I suspect it will also fix the issue that was worked around in
commit 7c53a722459c ("r8169: don't use MSI-X on RTL8168g").

Thomas Martitz reports that this workaround also solves an issue where
the AMD Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive
after S3 suspend/resume.


I can confirm that this exact patch also helps on my HP Zbook. Thanks for your work on this; resume has been a real pain until now.




  drivers/pci/pci-driver.c | 14 ++++++++++++++
  drivers/pci/setup-bus.c  |  2 +-
  include/linux/pci.h      |  1 +
  3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index bef17c3fca67..034f816570ad 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -524,6 +524,20 @@ static void pci_pm_default_resume_early(struct pci_dev *pci_dev)
  	pci_power_up(pci_dev);
  	pci_restore_state(pci_dev);
  	pci_pme_restore(pci_dev);
+
+	/*
+	 * Redo the PCI bridge prefetch register setup.
+	 *
+	 * This works around an Intel PCI bridge issue seen on Asus and HP
+	 * laptops, where the GPU is not usable after S3 resume.
+	 * Even though PCI bridge register contents appear to be intact
+	 * at resume time, rewriting the value of PREF_BASE_UPPER32 is
+	 * required to make the GPU work.
+	 * Windows 10 also reprograms these registers during S3 resume.
+	 */
+	if (pci_dev->class == PCI_CLASS_BRIDGE_PCI << 8)
+		pci_setup_bridge_mmio_pref(pci_dev);
+
  	pci_fixup_device(pci_fixup_resume_early, pci_dev);
  }
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 79b1824e83b4..cb88288d2a69 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -630,7 +630,7 @@ static void pci_setup_bridge_mmio(struct pci_dev *bridge)
  	pci_write_config_dword(bridge, PCI_MEMORY_BASE, l);
  }
-static void pci_setup_bridge_mmio_pref(struct pci_dev *bridge)
+void pci_setup_bridge_mmio_pref(struct pci_dev *bridge)
  {
  	struct resource *res;
  	struct pci_bus_region region;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index e72ca8dd6241..b15828fc26a4 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -934,6 +934,7 @@ struct pci_dev *pci_scan_single_device(struct pci_bus *bus, int devfn);
  void pci_device_add(struct pci_dev *dev, struct pci_bus *bus);
  unsigned int pci_scan_child_bus(struct pci_bus *bus);
  void pci_bus_add_device(struct pci_dev *dev);
+void pci_setup_bridge_mmio_pref(struct pci_dev *bridge);
  void pci_read_bridge_bases(struct pci_bus *child);
  struct resource *pci_find_parent_resource(const struct pci_dev *dev,
  					  struct resource *res);
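
A note on the class test in the first hunk, as a standalone sketch rather
than kernel code: pci_dev->class holds three bytes (base class, subclass,
programming interface), so comparing against PCI_CLASS_BRIDGE_PCI << 8
matches only bridges whose programming interface byte is 0x00. A
subtractive-decode PCI-to-PCI bridge (programming interface 0x01) would not
match. The function names below are hypothetical, and whether such bridges
need the workaround is not established in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_CLASS_BRIDGE_PCI 0x0604	/* base class 0x06, subclass 0x04 */

/* Same comparison as in the patch: programming interface must be 0x00. */
static bool is_pci_bridge_exact(uint32_t class)
{
	return class == (PCI_CLASS_BRIDGE_PCI << 8);
}

/* Alternative that ignores the programming interface byte. */
static bool is_pci_bridge_any_progif(uint32_t class)
{
	return (class >> 8) == PCI_CLASS_BRIDGE_PCI;
}
```

With these helpers, a class value of 0x060400 satisfies both checks, while
0x060401 satisfies only the second.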


