RE: [PATCH] drm/amdgpu: Fix usage of UMC fill record in RAS

[AMD Official Use Only - General]

Reviewed-by: Hawking Zhang <Hawking.Zhang@xxxxxxx>

Regards,
Hawking
-----Original Message-----
From: Tuikov, Luben <Luben.Tuikov@xxxxxxx>
Sent: Saturday, June 10, 2023 19:36
To: AMD Graphics <amd-gfx@xxxxxxxxxxxxxxxxxxxxx>
Cc: Tuikov, Luben <Luben.Tuikov@xxxxxxx>; Zhou1, Tao <Tao.Zhou1@xxxxxxx>; Zhang, Hawking <Hawking.Zhang@xxxxxxx>; Deucher, Alexander <Alexander.Deucher@xxxxxxx>
Subject: [PATCH] drm/amdgpu: Fix usage of UMC fill record in RAS

The commit listed in the Fixes tag below introduced a bug in amdgpu_ras.c::amdgpu_reserve_page_direct(). That commit added the new amdgpu_umc_fill_error_record(), which internally right-shifts the physical address (argument "uint64_t retired_page", a misleading name) by AMDGPU_GPU_PAGE_SHIFT. Thus, when amdgpu_reserve_page_direct() passes "address" to that new function, it should NOT right-shift it first: shifting twice erroneously yields a page address of 0 for the first
2^(2*AMDGPU_GPU_PAGE_SHIFT) memory addresses.

This commit fixes this bug.

Cc: Tao Zhou <tao.zhou1@xxxxxxx>
Cc: Hawking Zhang <Hawking.Zhang@xxxxxxx>
Cc: Alex Deucher <Alexander.Deucher@xxxxxxx>
Fixes: 400013b268cb66 ("drm/amdgpu: add umc_fill_error_record to make code more simple")
Signed-off-by: Luben Tuikov <luben.tuikov@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index 27a32933cbee3b..700eb180ea60fa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -171,8 +171,7 @@ static int amdgpu_reserve_page_direct(struct amdgpu_device *adev, uint64_t addre

        memset(&err_rec, 0x0, sizeof(struct eeprom_table_record));
        err_data.err_addr = &err_rec;
-       amdgpu_umc_fill_error_record(&err_data, address,
-                       (address >> AMDGPU_GPU_PAGE_SHIFT), 0, 0);
+       amdgpu_umc_fill_error_record(&err_data, address, address, 0, 0);

        if (amdgpu_bad_page_threshold != 0) {
                amdgpu_ras_add_bad_pages(adev, err_data.err_addr,

base-commit: 7eda4451177abbc8d2fab24f816a3243dd1808d8
prerequisite-patch-id: f2a3eadc5d7074225109701f1bb43b38bd6024fd
--
2.41.0
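
For illustration only, the double shift described in the commit message can be sketched in plain user-space C. Everything named "sketch_*" or "SKETCH_*" below is hypothetical and not part of the driver. The shift value of 12 is an assumption made for this example; the kernel headers define the real AMDGPU_GPU_PAGE_SHIFT. The stand-in models only the internal right shift that the commit message attributes to amdgpu_umc_fill_error_record(), not the rest of that function.

```c
#include <stdint.h>

/* Assumed value for this sketch only; the kernel defines the real
 * AMDGPU_GPU_PAGE_SHIFT in its own headers. */
#define SKETCH_GPU_PAGE_SHIFT 12

/* Hypothetical stand-in for the shift that, per the commit message,
 * amdgpu_umc_fill_error_record() applies to its retired_page argument. */
static uint64_t sketch_fill_record_page(uint64_t retired_page)
{
	return retired_page >> SKETCH_GPU_PAGE_SHIFT;
}

/* Pre-patch call pattern: the caller shifts, then the callee shifts again. */
static uint64_t sketch_buggy_page(uint64_t address)
{
	return sketch_fill_record_page(address >> SKETCH_GPU_PAGE_SHIFT);
}

/* Post-patch call pattern: the caller passes the raw address. */
static uint64_t sketch_fixed_page(uint64_t address)
{
	return sketch_fill_record_page(address);
}
```

Under these assumptions, sketch_buggy_page() returns 0 for every address below 2^24, while sketch_fixed_page() returns the expected page number (address >> 12).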




