[AMD Official Use Only - General]

Sure, will revert related patch in the next version.

Regards,
Tao

> -----Original Message-----
> From: Zhang, Hawking <Hawking.Zhang@xxxxxxx>
> Sent: Wednesday, January 17, 2024 8:09 PM
> To: Zhou1, Tao <Tao.Zhou1@xxxxxxx>; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Zhou1, Tao <Tao.Zhou1@xxxxxxx>
> Subject: RE: [PATCH 2/2] update check condition of query for ras page retire
>
> [AMD Official Use Only - General]
>
> static ssize_t smu_v13_0_6_get_ecc_info(struct smu_context *smu,
>                         void *table)
> {
> -       /* Support ecc info by default */
> -       return 0;
> +       /* we use debug mode flag instead of this interface */
> +       return -EOPNOTSUPP;
> }
>
> Shall we just drop the callback implementation? smu_get_ecc_info will return
> -EOPNOTSUPP if the callback is not supported.
>
> Regards,
> Hawking
>
> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Tao Zhou
> Sent: Wednesday, January 17, 2024 17:15
> To: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Zhou1, Tao <Tao.Zhou1@xxxxxxx>
> Subject: [PATCH 2/2] update check condition of query for ras page retire
>
> Support page retirement handling in debug mode.
>
> Signed-off-by: Tao Zhou <tao.zhou1@xxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c              | 9 +++++++--
>  drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 4 ++--
>  2 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
> index 41139bac7643..6df32f0afd89 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c
> @@ -90,12 +90,16 @@ static void amdgpu_umc_handle_bad_pages(struct amdgpu_device *adev,
>  {
>         struct ras_err_data *err_data = (struct ras_err_data *)ras_error_status;
>         struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
> +       unsigned int error_query_mode;
>         int ret = 0;
>
> +       amdgpu_ras_get_error_query_mode(adev, &error_query_mode);
> +
>         mutex_lock(&con->page_retirement_lock);
>
>         ret = amdgpu_dpm_get_ecc_info(adev, (void *)&(con->umc_ecc));
> -       if (ret == -EOPNOTSUPP) {
> +       if (ret == -EOPNOTSUPP &&
> +           error_query_mode == AMDGPU_RAS_DIRECT_ERROR_QUERY) {
>                 if (adev->umc.ras && adev->umc.ras->ras_block.hw_ops &&
>                     adev->umc.ras->ras_block.hw_ops->query_ras_error_count)
>                     adev->umc.ras->ras_block.hw_ops->query_ras_error_count(adev, ras_error_status);
> @@ -119,7 +123,8 @@ static void amdgpu_umc_handle_bad_pages(struct amdgpu_device *adev,
>                          */
>                         adev->umc.ras->ras_block.hw_ops->query_ras_error_address(adev, ras_error_status);
>                 }
> -       } else if (!ret) {
> +       } else if (error_query_mode == AMDGPU_RAS_FIRMWARE_ERROR_QUERY ||
> +           (!ret && error_query_mode == AMDGPU_RAS_DIRECT_ERROR_QUERY)) {
>                 if (adev->umc.ras &&
>                     adev->umc.ras->ecc_info_query_ras_error_count)
>                     adev->umc.ras->ecc_info_query_ras_error_count(adev, ras_error_status);
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> index c560f4af214d..d86c9e7fc64b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
> @@ -2909,8 +2909,8 @@ static int smu_v13_0_6_select_xgmi_plpd_policy(struct smu_context *smu,
>  static ssize_t smu_v13_0_6_get_ecc_info(struct smu_context *smu,
>                         void *table)
>  {
> -       /* Support ecc info by default */
> -       return 0;
> +       /* we use debug mode flag instead of this interface */
> +       return -EOPNOTSUPP;
>  }
>
>  static const struct pptable_funcs smu_v13_0_6_ppt_funcs = {
> --
> 2.35.1
>
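
For anyone reading the thread later, the two points discussed above (a missing callback being reported as -EOPNOTSUPP, and the new branch condition in amdgpu_umc_handle_bad_pages) can be modeled outside the kernel. The following is only a rough userspace sketch, not driver code: the enums, struct ppt_funcs, get_ecc_info() and pick_query_path() are hypothetical stand-ins invented for illustration. Only the shape of the conditions is taken from the quoted diff.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's two error query modes. */
enum query_mode { DIRECT_ERROR_QUERY, FIRMWARE_ERROR_QUERY };

/* Hypothetical labels for the branches touched by the patch. */
enum query_path { PATH_NONE, PATH_HW_OPS, PATH_ECC_INFO };

/* Hypothetical callback table with one optional hook. */
struct ppt_funcs {
    int (*get_ecc_info)(void *table);
};

/*
 * Wrapper in the style described in the review comment: when the
 * callback is not provided, report -EOPNOTSUPP instead of calling it.
 */
static int get_ecc_info(const struct ppt_funcs *funcs, void *table)
{
    if (!funcs || !funcs->get_ecc_info)
        return -EOPNOTSUPP;
    return funcs->get_ecc_info(table);
}

/*
 * Mirrors the condition structure of the patched if / else-if chain:
 *  - no ECC table support and direct query mode -> hw_ops register path
 *  - firmware query mode, or ECC table fetched in direct mode
 *                                               -> ecc_info path
 *  - anything else falls through.
 */
static enum query_path pick_query_path(int ret, enum query_mode mode)
{
    if (ret == -EOPNOTSUPP && mode == DIRECT_ERROR_QUERY)
        return PATH_HW_OPS;
    if (mode == FIRMWARE_ERROR_QUERY ||
        (!ret && mode == DIRECT_ERROR_QUERY))
        return PATH_ECC_INFO;
    return PATH_NONE;
}
```

In this model, a table with no get_ecc_info hook behaves the same as the stub that returns -EOPNOTSUPP, which is the reason dropping the callback was suggested. Note also that in firmware query mode the ecc_info path is selected regardless of the return value.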