Re: [PATCH 2/2] drm/amdgpu: set job guilty if reset skipped

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>

Andrey

On 1/14/21 8:37 AM, Horace Chen wrote:
If two jobs on two different rings time out within a very
short period, the reset for the second job is skipped
because a reset is already in progress.

That does not mean the second job is not guilty: it also
timed out and may be a bad job. So before bailing out of the
reset, we need to increase the karma for this job too.

Signed-off-by: Horace Chen <horace.chen@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
  1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a28e138ac72c..d1112e29c8b4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4572,6 +4572,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
  		if (atomic_cmpxchg(&hive->in_reset, 0, 1) != 0) {
  			DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress",
  				job ? job->base.id : -1, hive->hive_id);
+			if (job)
+				drm_sched_increase_karma(&job->base);
  			amdgpu_put_xgmi_hive(hive);
  			return 0;
  		}
@@ -4596,6 +4598,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
  			dev_info(adev->dev, "Bailing on TDR for s_job:%llx, as another already in progress",
  					job ? job->base.id : -1);
  			r = 0;
+			if (job)
+				drm_sched_increase_karma(&job->base);
  			goto skip_recovery;
  		}
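
For illustration only, the behaviour this patch aims for can be sketched
in userspace C. This is not driver code: struct mock_job,
mock_increase_karma(), mock_gpu_recover() and mock_finish_reset() are
hypothetical stand-ins, a C11 atomic_int replaces the in_reset flag
guarded by atomic_cmpxchg(), and the sketch assumes the normal recovery
path also raises karma for the job that triggered it.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduler job; not struct drm_sched_job. */
struct mock_job {
	unsigned long long id;
	int karma;	/* incremented each time the job is blamed for a hang */
};

/* Mimics the in_reset flag: 0 = idle, 1 = a reset is in progress. */
static atomic_int in_reset;

/* Hypothetical counterpart of drm_sched_increase_karma(). */
static void mock_increase_karma(struct mock_job *job)
{
	job->karma++;
}

/*
 * Returns 1 if this caller claimed the reset, 0 if it bailed out because
 * another reset was already running. In both cases a non-NULL job is
 * marked guilty, which is the point of the patch.
 */
static int mock_gpu_recover(struct mock_job *job)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&in_reset, &expected, 1)) {
		/* Bail-out path: the job still timed out, so blame it too. */
		if (job)
			mock_increase_karma(job);
		return 0;
	}

	/* Normal path (assumption): the triggering job is blamed here. */
	if (job)
		mock_increase_karma(job);

	/* The actual reset work would run here while in_reset stays 1. */
	return 1;
}

/* Clears the flag once the simulated reset is complete. */
static void mock_finish_reset(void)
{
	atomic_store(&in_reset, 0);
}
```

Calling mock_gpu_recover() for a second job before mock_finish_reset()
takes the bail-out path, yet that job's karma still goes up by one.
Without the added call it would leave the function unmarked.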
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


