RE: [PATCH 6.3 317/431] drm/amdgpu: Fix usage of UMC fill record in RAS

[AMD Official Use Only - General]

Reviewed-by: Tao Zhou <tao.zhou1@xxxxxxx>

> -----Original Message-----
> From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Sent: Sunday, July 9, 2023 7:14 PM
> To: stable@xxxxxxxxxxxxxxx
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>; patches@xxxxxxxxxxxxxxx;
> Zhou1, Tao <Tao.Zhou1@xxxxxxx>; Zhang, Hawking
> <Hawking.Zhang@xxxxxxx>; Deucher, Alexander
> <Alexander.Deucher@xxxxxxx>; Tuikov, Luben <Luben.Tuikov@xxxxxxx>;
> Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Sasha Levin
> <sashal@xxxxxxxxxx>
> Subject: [PATCH 6.3 317/431] drm/amdgpu: Fix usage of UMC fill record in RAS
>
> From: Luben Tuikov <luben.tuikov@xxxxxxx>
>
> [ Upstream commit 71344a718a9fda8c551cdc4381d354f9a9907f6f ]
>
> The commit listed in the Fixes tag below introduced a bug in
> amdgpu_ras.c::amdgpu_reserve_page_direct(). The new function
> amdgpu_umc_fill_error_record() internally right-shifts the physical address
> (argument "uint64_t retired_page", a misleading name) by
> AMDGPU_GPU_PAGE_SHIFT. Thus, when amdgpu_reserve_page_direct() passes
> "address" to that function, it must NOT right-shift it first: the double
> shift erroneously yields a page address of 0 for the first
> 2^(2*AMDGPU_GPU_PAGE_SHIFT) memory addresses.
>
> This commit fixes this bug.
>
> Cc: Tao Zhou <tao.zhou1@xxxxxxx>
> Cc: Hawking Zhang <Hawking.Zhang@xxxxxxx>
> Cc: Alex Deucher <Alexander.Deucher@xxxxxxx>
> Fixes: 400013b268cb ("drm/amdgpu: add umc_fill_error_record to make code more simple")
> Signed-off-by: Luben Tuikov <luben.tuikov@xxxxxxx>
> Link: https://lore.kernel.org/r/20230610113536.10621-1-luben.tuikov@xxxxxxx
> Reviewed-by: Hawking Zhang <Hawking.Zhang@xxxxxxx>
> Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 63dfcc98152d5..b3daca6372a90 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -170,8 +170,7 @@ static int amdgpu_reserve_page_direct(struct amdgpu_device *adev, uint64_t addre
>
>       memset(&err_rec, 0x0, sizeof(struct eeprom_table_record));
>       err_data.err_addr = &err_rec;
> -     amdgpu_umc_fill_error_record(&err_data, address,
> -                     (address >> AMDGPU_GPU_PAGE_SHIFT), 0, 0);
> +     amdgpu_umc_fill_error_record(&err_data, address, address, 0, 0);
>
>       if (amdgpu_bad_page_threshold != 0) {
>               amdgpu_ras_add_bad_pages(adev, err_data.err_addr,
> --
> 2.39.2
>
>
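
The double shift described in the commit message can be illustrated with a
small user-space sketch. This is not the driver code: GPU_PAGE_SHIFT is
assumed to be 12 here (standing in for AMDGPU_GPU_PAGE_SHIFT), and
fill_record_page(), page_before_fix() and page_after_fix() are hypothetical
helpers that only model where the shifts happen.

```c
#include <stdint.h>

/* Assumption: stands in for AMDGPU_GPU_PAGE_SHIFT (4 KiB GPU pages). */
#define GPU_PAGE_SHIFT 12

/*
 * Hypothetical model of the callee: like the function described in the
 * commit message, it takes a physical address (despite the parameter
 * name) and right-shifts it internally to obtain the page number.
 */
static uint64_t fill_record_page(uint64_t retired_page)
{
	return retired_page >> GPU_PAGE_SHIFT;
}

/* Caller before the fix: shifts once itself, then the callee shifts again. */
static uint64_t page_before_fix(uint64_t address)
{
	return fill_record_page(address >> GPU_PAGE_SHIFT);
}

/* Caller after the fix: passes the address through unshifted. */
static uint64_t page_after_fix(uint64_t address)
{
	return fill_record_page(address);
}
```

Under these assumptions, any address below 2^24 (that is,
2^(2*GPU_PAGE_SHIFT)) maps to page 0 in page_before_fix(), while
page_after_fix() returns the expected page number for the same address.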




