On 10/28/2024 8:11 PM, Alex Deucher wrote:
> Ping?
>
> On Fri, Oct 18, 2024 at 11:47 AM Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
>>
>> Ping?
>>
>> On Tue, Oct 15, 2024 at 2:28 PM Alex Deucher <alexander.deucher@xxxxxxx> wrote:
>>>
>>> Add messages to make it clear when a per ring reset
>>> happens. This is helpful for debugging and aligns with
>>> other reset methods.
>>>
>>> Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> index 102742f1faa2..2d60552a13ac 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> @@ -137,6 +137,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>  		/* attempt a per ring reset */
>>>  		if (amdgpu_gpu_recovery &&
>>>  		    ring->funcs->reset) {
>>> +			dev_err(adev->dev, "Starting %s ring reset\n", s_job->sched->name);

Is dev_err intentional, or is dev_info good enough?

Also, I suggest adding the ring name to the fail/pass messages.

Thanks,
Lijo

>>>  			/* stop the scheduler, but don't mess with the
>>>  			 * bad job yet because if ring reset fails
>>>  			 * we'll fall back to full GPU reset.
>>> @@ -150,8 +151,10 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
>>>  				amdgpu_fence_driver_force_completion(ring);
>>>  				if (amdgpu_ring_sched_ready(ring))
>>>  					drm_sched_start(&ring->sched);
>>> +				dev_err(adev->dev, "Ring reset success\n");
>>>  				goto exit;
>>>  			}
>>> +			dev_err(adev->dev, "Ring reset failure\n");
>>>  		}
>>>
>>>  		if (amdgpu_device_should_recover_gpu(ring->adev)) {
>>> --
>>> 2.46.2
>>>
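
For illustration only, not part of the posted patch: Lijo's suggestion might look roughly like the sketch below, reusing the same s_job->sched->name that the new "Starting %s ring reset" message already prints (both s_job and ring are in scope in amdgpu_job_timedout):

	/* hypothetical follow-up: include the ring name in the
	 * pass/fail messages, matching the start message above
	 */
	dev_err(adev->dev, "%s ring reset succeeded\n", s_job->sched->name);
	...
	dev_err(adev->dev, "%s ring reset failed\n", s_job->sched->name);

Whether dev_err or dev_info is the right severity is a separate call for the maintainers; the sketch keeps dev_err only to stay consistent with the start message in the patch.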