[PATCH] drm/amdgpu: Move reset domain locking in DPC handler

Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx> · Wed, 13 Apr 2022 15:31:00 -0400

Lock the reset domain unconditionally, because we unlock it
unconditionally on resume.
This fixes a mutex deadlock when handling FATAL and non-FATAL
PCI errors one after another.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
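
Note (illustration only, not part of the change): the imbalance
described in the commit message can be sketched with a toy model.
Every name below (toy_domain, toy_lock, toy_error_detected_old and so
on) is hypothetical and does not exist in amdgpu. The "lock" is a plain
counter standing in for the reset domain lock, so the sketch shows only
the lock/unlock mismatch, not the real deadlock mechanism. It assumes,
as the commit message states, that resume unlocks unconditionally.

```c
/* Hypothetical stand-in for the reset domain lock: it only counts holders. */
struct toy_domain {
	int held;
};

/* Hypothetical mirror of the two channel states the handler switches on. */
enum toy_state { TOY_IO_NORMAL, TOY_IO_FROZEN };

static void toy_lock(struct toy_domain *d)   { d->held++; }
static void toy_unlock(struct toy_domain *d) { d->held--; }

/* Before: the lock is taken only on the fatal (frozen) path. */
static void toy_error_detected_old(struct toy_domain *d, enum toy_state s)
{
	if (s == TOY_IO_FROZEN)
		toy_lock(d);
}

/* After: the lock is taken before the switch, whatever the state. */
static void toy_error_detected_new(struct toy_domain *d, enum toy_state s)
{
	(void)s;
	toy_lock(d);
}

/* Resume unlocks unconditionally, as the commit message describes. */
static void toy_resume(struct toy_domain *d)
{
	toy_unlock(d);
}

/* Run one detect/resume cycle and return the leftover hold count. */
static int toy_cycle(void (*detect)(struct toy_domain *, enum toy_state),
		     enum toy_state s)
{
	struct toy_domain d = { 0 };

	detect(&d, s);
	toy_resume(&d);
	return d.held;
}
```

In this model, toy_cycle(toy_error_detected_old, TOY_IO_NORMAL) leaves
the count at -1 (an unlock with no matching lock), while the "new"
variant leaves it at 0 for both states.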

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1cc488a767d8..c65f25e3a0fc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5531,18 +5531,18 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
 
 	adev->pci_channel_state = state;
 
+	/*
+	 * Locking adev->reset_domain->sem will prevent any external access
+	 * to GPU during PCI error recovery
+	 */
+	amdgpu_device_lock_reset_domain(adev->reset_domain);
+	amdgpu_device_set_mp1_state(adev);
+
 	switch (state) {
 	case pci_channel_io_normal:
 		return PCI_ERS_RESULT_CAN_RECOVER;
 	/* Fatal error, prepare for slot reset */
 	case pci_channel_io_frozen:
-		/*
-		 * Locking adev->reset_domain->sem will prevent any external access
-		 * to GPU during PCI error recovery
-		 */
-		amdgpu_device_lock_reset_domain(adev->reset_domain);
-		amdgpu_device_set_mp1_state(adev);
-
 		/*
 		 * Block any work scheduling as we do for regular GPU reset
 		 * for the duration of the recovery
-- 
2.25.1