Re: [PATCH v3 7/7] drm/amdgpu: Stop any pending reset if another in progress.

On 2022-05-31 at 11:31, Felix Kuehling wrote:
On 2022-05-25 at 15:04, Andrey Grodzovsky wrote:
We skip reset requests if another one is already in progress.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 27 ++++++++++++++++++++++
  1 file changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 424571e46cf5..e1f7ee604ea4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5054,6 +5054,27 @@ static void amdgpu_device_recheck_guilty_jobs(
      }
  }
+static inline void amdggpu_device_stop_pedning_resets(struct amdgpu_device* adev)

Typo: pedning -> pending


+{
+    struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
+
+#if defined(CONFIG_DEBUG_FS)
+    if (!amdgpu_sriov_vf(adev))
+        cancel_work(&adev->reset_work);
+#endif
+
+    if (adev->kfd.dev)
+        cancel_work(&adev->kfd.reset_work);

Do you also need to cancel resets from other GPUs in the same hive?

Never mind. I see this is called in a loop over the GPUs in amdgpu_device_gpu_recover.

Other than the typo, this patch is

Reviewed-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>



Regards,
  Felix


+
+    if (amdgpu_sriov_vf(adev))
+        cancel_work(&adev->virt.flr_work);
+
+    if (con && adev->ras_enabled)
+        cancel_work(&con->recovery_work);
+
+}
+
+
  /**
   * amdgpu_device_gpu_recover - reset the asic and recover scheduler
   *
@@ -5209,6 +5230,12 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
                    r, adev_to_drm(tmp_adev)->unique);
              tmp_adev->asic_reset_res = r;
          }
+
+        /*
+         * Drop all pending non scheduler resets. Scheduler resets
+         * were already dropped during drm_sched_stop
+         */
+        amdggpu_device_stop_pedning_resets(tmp_adev);
      }
        tmp_vram_lost_counter = atomic_read(&((adev)->vram_lost_counter));
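
For readers following the hive question above, here is a minimal userspace sketch of the pattern the patch relies on: each device owns several reset work items, and the recovery path walks every device in the hive and drops whatever is still queued. This is not the driver code. `fake_work`, `fake_device`, `fake_cancel_work` and both `stop_pending_resets*` helpers are hypothetical names made up for illustration, and `fake_cancel_work` only models the assumed contract of the kernel's `cancel_work()` (return true if the item was pending, do not wait for a handler that is already running).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel work item: it only tracks "pending". */
struct fake_work {
    bool pending;
};

/*
 * Assumed model of cancel_work(): clear the pending state and report
 * whether anything was actually dropped. It never waits for a handler.
 */
static bool fake_cancel_work(struct fake_work *w)
{
    bool was_pending = w->pending;

    w->pending = false;
    return was_pending;
}

/* Hypothetical device with the four non-scheduler reset sources from the patch. */
struct fake_device {
    struct fake_work debugfs_reset;
    struct fake_work kfd_reset;
    struct fake_work flr;
    struct fake_work ras_recovery;
};

/* Drop every queued reset on one device; returns how many were dropped. */
static int stop_pending_resets(struct fake_device *dev)
{
    int dropped = 0;

    dropped += fake_cancel_work(&dev->debugfs_reset);
    dropped += fake_cancel_work(&dev->kfd_reset);
    dropped += fake_cancel_work(&dev->flr);
    dropped += fake_cancel_work(&dev->ras_recovery);
    return dropped;
}

/* Called once per recovery: loop over all devices in the hive. */
static int stop_pending_resets_in_hive(struct fake_device *devs, size_t n)
{
    int dropped = 0;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        dropped += stop_pending_resets(&devs[i]);
    return dropped;
}
```

Because the per-device helper is invoked from the loop over the hive, a reset queued on any GPU in the hive is dropped by the one recovery that is already running, which is why no separate cross-device cancellation is needed in the helper itself.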


