Re: [PATCH v3] drm/radeon: Fix EEH during kexec

On 10/30/19 5:35 AM, Michael Ellerman wrote:
Hi Kyle,

Kyle Mahlkuch <kmahlkuc@xxxxxxxxxxxxxxxxxx> writes:
From: Kyle Mahlkuch <kmahlkuc@xxxxxxxxxxxxxxxxxx>

During kexec some adapters hit an EEH since they are not properly
shut down in the radeon_pci_shutdown() function. Adding
radeon_suspend_kms() fixes this issue.
Enabled only on PPC because this patch causes issues on some other
boards.
Which adapters hit the issues?

And do we know why they're not shut down correctly in
radeon_pci_shutdown()? That seems like the root cause, no?
Hi Michael,
This is hit by the Caicos (edwards2) adapter that I have on ppc. It is not hit
on the Cedar (FirePro) adapter, though I haven't tested that one recently. I'm
not able to test any other adapters. As for why, I'm unsure. During
initialization after the kexec we hit an EEH. There could be another point in
the shutdown/startup process where something doesn't get reset correctly.
I'm open to other ideas if you have any.

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 9e55076..4528f4d 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -379,11 +379,25 @@ static int radeon_pci_probe(struct pci_dev *pdev,
 static void
 radeon_pci_shutdown(struct pci_dev *pdev)
 {
+#ifdef CONFIG_PPC64
+	struct drm_device *ddev = pci_get_drvdata(pdev);
+#endif
This local serves no real purpose and could be avoided, which would also
avoid this ifdef.

 	/* if we are running in a VM, make sure the device
 	 * torn down properly on reboot/shutdown
 	 */
 	if (radeon_device_is_virtual())
 		radeon_pci_remove(pdev);
+
+#ifdef CONFIG_PPC64
+	/* Some adapters need to be suspended before a
AFAIK drm uses normal kernel comment style, so this should be:

	/*
	 * Some adapters need to be suspended before a
+	 * shutdown occurs in order to prevent an error
+	 * during kexec.
+	 * Make this power specific because it breaks
+	 * some non-power boards.
+	 */
+	radeon_suspend_kms(ddev, true, true, false);
ie, instead do:

	radeon_suspend_kms(pci_get_drvdata(pdev), true, true, false);
I agree; this is a cleaner way to write this patch. I'll update the comment as
well. Thanks for the help.

+#endif
 }
 
 static int radeon_pmops_suspend(struct device *dev)
-- 
1.8.3.1
cheers
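
For reference, here is a rough sketch of how radeon_pci_shutdown() might look
with both review suggestions applied (no local variable, kernel-style block
comment). This is not the final patch. The struct definitions, the
mock_is_virtual toggle, and the stub bodies are hypothetical userspace
stand-ins added only so the fragment compiles outside the kernel tree; only
the function names and the radeon_suspend_kms() argument list come from the
diff above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types (illustration only). */
struct drm_device { int suspended; };
struct pci_dev { void *driver_data; int removed; };

/* Stub: the real helper returns the driver's private data for the device. */
static void *pci_get_drvdata(struct pci_dev *pdev)
{
	return pdev->driver_data;
}

/* Hypothetical toggle so the VM path can be exercised in a test. */
static bool mock_is_virtual;

static bool radeon_device_is_virtual(void)
{
	return mock_is_virtual;
}

/* Stub: records that the remove path was taken. */
static void radeon_pci_remove(struct pci_dev *pdev)
{
	pdev->removed = 1;
}

/* Stub: records that suspend was requested; the flags are ignored here. */
static int radeon_suspend_kms(struct drm_device *dev, bool suspend,
			      bool fbcon, bool freeze)
{
	(void)suspend;
	(void)fbcon;
	(void)freeze;
	if (dev)
		dev->suspended = 1;
	return 0;
}

static void
radeon_pci_shutdown(struct pci_dev *pdev)
{
	/* if we are running in a VM, make sure the device
	 * torn down properly on reboot/shutdown
	 */
	if (radeon_device_is_virtual())
		radeon_pci_remove(pdev);

#ifdef CONFIG_PPC64
	/*
	 * Some adapters need to be suspended before a
	 * shutdown occurs in order to prevent an error
	 * during kexec.
	 * Make this power specific because it breaks
	 * some non-power boards.
	 */
	radeon_suspend_kms(pci_get_drvdata(pdev), true, true, false);
#endif
}
```

Building with -DCONFIG_PPC64 exercises the suspend call; without it the
function reduces to the existing VM-only teardown, which matches the intent
of keeping the change Power-specific.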
