From: Joshua Ashton <joshua@xxxxxxxxx>

commit 829798c789f567ef6ba4b084c15b7b5f3bd98d51 upstream.

As we discussed before[1], soft recovery should be
forwarded to userspace, or we can get into a really
bad state where apps will keep submitting hanging
command buffers cascading us to a hard reset.

1: https://lore.kernel.org/all/bf23d5ed-9a6b-43e7-84ee-8cbfd0d60f18@xxxxxxxxx/
Signed-off-by: Joshua Ashton <joshua@xxxxxxxxx>
Reviewed-by: Marek Olšák <marek.olsak@xxxxxxx>
Signed-off-by: Christian König <christian.koenig@xxxxxxx>
Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
(cherry picked from commit 434967aadbbbe3ad9103cc29e9a327de20fdba01)
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -258,9 +258,8 @@ amdgpu_job_prepare_job(struct drm_sched_
 	struct dma_fence *fence = NULL;
 	int r;
 
-	/* Ignore soft recovered fences here */
 	r = drm_sched_entity_error(s_entity);
-	if (r && r != -ENODATA)
+	if (r)
 		goto error;
 
 	if (!fence && job->gang_submit)


Patches currently in stable-queue which might be from joshua@xxxxxxxxx are

queue-6.6/drm-amdgpu-forward-soft-recovery-errors-to-userspace.patch
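
For illustration only, the behavioural change in the hunk above can be
modelled in plain userspace C. This is a sketch, not kernel code: the
helper names job_rejected_before(), job_rejected_after() and
compare_checks() are hypothetical, and it assumes (as the removed comment
suggests) that -ENODATA is the error an entity carries after a soft
recovery. -ECANCELED is used only as an arbitrary example of some other
error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the pre-patch check: -ENODATA was filtered out. */
static bool job_rejected_before(int r)
{
	return r && r != -ENODATA;
}

/* Hypothetical model of the post-patch check: any entity error rejects the job. */
static bool job_rejected_after(int r)
{
	return r != 0;
}

/* Compare both checks for the three interesting cases. */
static void compare_checks(void)
{
	/* No error: the job proceeds in both versions. */
	assert(!job_rejected_before(0));
	assert(!job_rejected_after(0));

	/* -ENODATA: ignored before the patch, takes the error path after it. */
	assert(!job_rejected_before(-ENODATA));
	assert(job_rejected_after(-ENODATA));

	/* Any other error (example value only): rejected in both versions. */
	assert(job_rejected_before(-ECANCELED));
	assert(job_rejected_after(-ECANCELED));
}
```

The only case whose outcome changes is -ENODATA, which matches the commit
message: a soft-recovered entity no longer has its next job silently
accepted.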