Re: [PATCH 2/2] drm/amdgpu: set job guilty if reset skipped

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx>

Andrey

On 1/19/21 7:22 AM, Horace Chen wrote:
If two jobs on two different rings time out within a very short
period, the reset for the second job is skipped because a reset
is already in progress.

That does not mean the second job is not guilty: it also timed
out and may be a bad job. So before bailing out of the reset, we
need to increase the karma for this job too.

Signed-off-by: Horace Chen <horace.chen@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++++
  1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 9574da3abc32..1d6ff9fe37de 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4574,6 +4574,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
  			DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress",
  				job ? job->base.id : -1, hive->hive_id);
  			amdgpu_put_xgmi_hive(hive);
+			if (job)
+				drm_sched_increase_karma(&job->base);
  			return 0;
  		}
  		mutex_lock(&hive->hive_lock);
@@ -4617,6 +4619,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
  					job ? job->base.id : -1);
  		r = 0;
  		/* even we skipped this reset, still need to set the job to guilty */
+		if (job)
+			drm_sched_increase_karma(&job->base);
  		goto skip_recovery;
  	}
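
For readers following the thread, the behaviour the patch aims for can be sketched as a small user-space model. This is only an illustration under stated assumptions: `toy_job`, `toy_device`, `toy_increase_karma` and `toy_gpu_recover` are hypothetical stand-ins, not the kernel's types or functions, and the single `in_reset` flag stands in for both bail-out paths touched by the diff. The model also assumes the normal recovery path blames the job, which the diff itself does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not the kernel's types. */
struct toy_job {
	int id;
	int karma;	/* bumped each time the job is blamed for a timeout */
};

struct toy_device {
	bool in_reset;	/* true while a recovery is already in progress */
};

/* Toy analogue of blaming a job; the real helper lives in the DRM scheduler. */
static void toy_increase_karma(struct toy_job *job)
{
	job->karma++;
}

/*
 * Returns true if this call started a reset, false if it bailed out
 * because another reset was already in progress.
 */
static bool toy_gpu_recover(struct toy_device *dev, struct toy_job *job)
{
	assert(dev != NULL);

	if (dev->in_reset) {
		/* Reset skipped, but the timed-out job is still blamed. */
		if (job)
			toy_increase_karma(job);
		return false;
	}

	/* The flag is left set so a second caller sees the reset as ongoing. */
	dev->in_reset = true;

	/* Assumed here: the normal recovery path also blames the job. */
	if (job)
		toy_increase_karma(job);

	return true;
}
```

In this model, calling `toy_gpu_recover()` for a first job and then for a second job while `in_reset` is still set leaves both jobs with a karma of 1. Without the `if (job)` branch in the bail-out path, the second job would keep a karma of 0 even though it also timed out.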
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


