[AMD Official Use Only - General]

Reviewed-by: Tao Zhou <tao.zhou1@xxxxxxx>

> -----Original Message-----
> From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Sent: Sunday, July 9, 2023 7:14 PM
> To: stable@xxxxxxxxxxxxxxx
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>; patches@xxxxxxxxxxxxxxx;
> Zhou1, Tao <Tao.Zhou1@xxxxxxx>; Zhang, Hawking
> <Hawking.Zhang@xxxxxxx>; Deucher, Alexander
> <Alexander.Deucher@xxxxxxx>; Tuikov, Luben <Luben.Tuikov@xxxxxxx>;
> Deucher, Alexander <Alexander.Deucher@xxxxxxx>; Sasha Levin
> <sashal@xxxxxxxxxx>
> Subject: [PATCH 6.3 317/431] drm/amdgpu: Fix usage of UMC fill record in RAS
>
> From: Luben Tuikov <luben.tuikov@xxxxxxx>
>
> [ Upstream commit 71344a718a9fda8c551cdc4381d354f9a9907f6f ]
>
> The fixed commit listed in the Fixes tag below, introduced a bug in
> amdgpu_ras.c::amdgpu_reserve_page_direct(), in that when introducing the new
> amdgpu_umc_fill_error_record() and internally in that new function the physical
> address (argument "uint64_t retired_page"--wrong name) is right-shifted by
> AMDGPU_GPU_PAGE_SHIFT. Thus, in amdgpu_reserve_page_direct() when we
> pass "address" to that new function, we should NOT right-shift it, since this
> results, erroneously, in the page address to be 0 for first
> 2^(2*AMDGPU_GPU_PAGE_SHIFT) memory addresses.
>
> This commit fixes this bug.
>
> Cc: Tao Zhou <tao.zhou1@xxxxxxx>
> Cc: Hawking Zhang <Hawking.Zhang@xxxxxxx>
> Cc: Alex Deucher <Alexander.Deucher@xxxxxxx>
> Fixes: 400013b268cb ("drm/amdgpu: add umc_fill_error_record to make code more simple")
> Signed-off-by: Luben Tuikov <luben.tuikov@xxxxxxx>
> Link: https://lore.kernel.org/r/20230610113536.10621-1-luben.tuikov@xxxxxxx
> Reviewed-by: Hawking Zhang <Hawking.Zhang@xxxxxxx>
> Signed-off-by: Alex Deucher <alexander.deucher@xxxxxxx>
> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index 63dfcc98152d5..b3daca6372a90 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -170,8 +170,7 @@ static int amdgpu_reserve_page_direct(struct amdgpu_device *adev, uint64_t addre
>
>  	memset(&err_rec, 0x0, sizeof(struct eeprom_table_record));
>  	err_data.err_addr = &err_rec;
> -	amdgpu_umc_fill_error_record(&err_data, address,
> -			(address >> AMDGPU_GPU_PAGE_SHIFT), 0, 0);
> +	amdgpu_umc_fill_error_record(&err_data, address, address, 0, 0);
>
>  	if (amdgpu_bad_page_threshold != 0) {
>  		amdgpu_ras_add_bad_pages(adev, err_data.err_addr,
> --
> 2.39.2
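
For anyone following along, the double shift described in the commit message can be illustrated with a small standalone sketch. This is not driver code: fill_record_page(), page_before_fix() and page_after_fix() are hypothetical stand-ins, and AMDGPU_GPU_PAGE_SHIFT is restated locally as 12 (4 KiB GPU pages) as an assumption. Only the shift performed inside amdgpu_umc_fill_error_record() is modelled, as the commit message describes it.

```c
#include <stdint.h>

/* Assumption: restated locally; the driver defines its own value. */
#define AMDGPU_GPU_PAGE_SHIFT 12

/*
 * Hypothetical stand-in for the part of amdgpu_umc_fill_error_record()
 * that matters here: per the commit message, the "retired_page"
 * argument is really a byte address and is shifted inside the function.
 */
static inline uint64_t fill_record_page(uint64_t retired_page)
{
	return retired_page >> AMDGPU_GPU_PAGE_SHIFT;
}

/* Before the fix: the caller pre-shifts, so the address is shifted twice. */
static inline uint64_t page_before_fix(uint64_t address)
{
	return fill_record_page(address >> AMDGPU_GPU_PAGE_SHIFT);
}

/* After the fix: the caller passes the raw address, shifted exactly once. */
static inline uint64_t page_after_fix(uint64_t address)
{
	return fill_record_page(address);
}
```

Under these assumptions, page_before_fix(0x00ABC000) yields 0 while page_after_fix(0x00ABC000) yields 0xABC, and every address below 1 << (2 * AMDGPU_GPU_PAGE_SHIFT) collapses to page 0 on the pre-fix path, which matches the range given in the commit message.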