[AMD Public Use]

> -----Original Message-----
> From: Tuikov, Luben <Luben.Tuikov@xxxxxxx>
> Sent: Wednesday, May 12, 2021 1:03 PM
> To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Tuikov, Luben <Luben.Tuikov@xxxxxxx>; Deucher, Alexander
> <Alexander.Deucher@xxxxxxx>; stable@xxxxxxxxxxxxxxx
> Subject: [PATCH 1/2] drm/amdgpu: Don't query CE and UE errors
>
> On QUERY2 IOCTL don't query counts of correctable and uncorrectable
> errors, since when RAS is enabled and supported on Vega20 server boards,
> this takes insurmountably long time, in O(n^3), which slows the system down
> to the point of it being unusable when we have GUI up.
>
> Fixes: ae363a212b14 ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2")
> Cc: Alexander Deucher <Alexander.Deucher@xxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Luben Tuikov <luben.tuikov@xxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 26 ++++++++++++-------------
>  1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index 01fe60fedcbe..d481a33f4eaf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -363,19 +363,19 @@ static int amdgpu_ctx_query2(struct amdgpu_device *adev,
>  		out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_GUILTY;
>
>  	/*query ue count*/
> -	ras_counter = amdgpu_ras_query_error_count(adev, false);
> -	/*ras counter is monotonic increasing*/
> -	if (ras_counter != ctx->ras_counter_ue) {
> -		out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_UE;
> -		ctx->ras_counter_ue = ras_counter;
> -	}
> -
> -	/*query ce count*/
> -	ras_counter = amdgpu_ras_query_error_count(adev, true);
> -	if (ras_counter != ctx->ras_counter_ce) {
> -		out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_CE;
> -		ctx->ras_counter_ce = ras_counter;
> -	}
> +	/* ras_counter = amdgpu_ras_query_error_count(adev, false); */
> +	/* /\*ras counter is monotonic increasing*\/ */
> +	/* if (ras_counter != ctx->ras_counter_ue) { */
> +	/* 	out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_UE; */
> +	/* 	ctx->ras_counter_ue = ras_counter; */
> +	/* } */
> +
> +	/* /\*query ce count*\/ */
> +	/* ras_counter = amdgpu_ras_query_error_count(adev, true); */
> +	/* if (ras_counter != ctx->ras_counter_ce) { */
> +	/* 	out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_CE; */
> +	/* 	ctx->ras_counter_ce = ras_counter; */
> +	/* } */
>

Rather than commenting this out, just drop it in patch 1, and then re-add this in patch 2.

Alex

>  	mutex_unlock(&mgr->lock);
>  	return 0;
> --
> 2.31.1.527.g2d677e5b15