This is a note to let you know that I've just added the patch titled drm/amdgpu: Don't query CE and UE errors to the 5.4-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: drm-amdgpu-don-t-query-ce-and-ue-errors.patch and it can be found in the queue-5.4 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From dce3d8e1d070900e0feeb06787a319ff9379212c Mon Sep 17 00:00:00 2001 From: Luben Tuikov <luben.tuikov@xxxxxxx> Date: Wed, 12 May 2021 12:33:23 -0400 Subject: drm/amdgpu: Don't query CE and UE errors MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Luben Tuikov <luben.tuikov@xxxxxxx> commit dce3d8e1d070900e0feeb06787a319ff9379212c upstream. On QUERY2 IOCTL don't query counts of correctable and uncorrectable errors, since when RAS is enabled and supported on Vega20 server boards, this takes insurmountably long time, in O(n^3), which slows the system down to the point of it being unusable when we have GUI up. Fixes: ae363a212b14 ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2") Cc: Alexander Deucher <Alexander.Deucher@xxxxxxx> Cc: stable@xxxxxxxxxxxxxxx Signed-off-by: Luben Tuikov <luben.tuikov@xxxxxxx> Reviewed-by: Alexander Deucher <Alexander.Deucher@xxxxxxx> Reviewed-by: Christian König <christian.koenig@xxxxxxx> Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 16 ---------------- 1 file changed, 16 deletions(-) --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -351,7 +351,6 @@ static int amdgpu_ctx_query2(struct amdg { struct amdgpu_ctx *ctx; struct amdgpu_ctx_mgr *mgr; - unsigned long ras_counter; if (!fpriv) return -EINVAL; @@ -376,21 +375,6 @@ static int amdgpu_ctx_query2(struct amdg if (atomic_read(&ctx->guilty)) out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_GUILTY; - /*query ue count*/ - ras_counter = amdgpu_ras_query_error_count(adev, false); - /*ras counter is monotonic increasing*/ - if (ras_counter != ctx->ras_counter_ue) { - out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_UE; - ctx->ras_counter_ue = ras_counter; - } - - /*query ce count*/ - ras_counter = amdgpu_ras_query_error_count(adev, true); - if (ras_counter != ctx->ras_counter_ce) { - out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_CE; - ctx->ras_counter_ce = ras_counter; - } - mutex_unlock(&mgr->lock); return 0; } Patches currently in stable-queue which might be from luben.tuikov@xxxxxxx are queue-5.4/drm-amdgpu-don-t-query-ce-and-ue-errors.patch