[AMD Official Use Only - General]

Could the attached patch help?

Evan

> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of ???
> Sent: Friday, November 18, 2022 5:25 PM
> To: Michel Dänzer <michel.daenzer@xxxxxxxxxxx>; Koenig, Christian
> <Christian.Koenig@xxxxxxx>; Deucher, Alexander
> <Alexander.Deucher@xxxxxxx>
> Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Pan, Xinhui <Xinhui.Pan@xxxxxxx>;
> linux-kernel@xxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>
>
> On 2022/11/18 17:18, Michel Dänzer wrote:
> > On 11/18/22 09:01, Christian König wrote:
> >> On 18.11.22 at 08:48, Zhenneng Li wrote:
> >>> During reboot tests on an arm64 platform, boot may fail, so add
> >>> this mb in smc.
> >>>
> >>> The error messages are as follows:
> >>> [ 6.996395][ 7] [ T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block <si_dpm> failed -22
> >>> [ 7.006919][ 7] [ T295] amdgpu 0000:04:00.0: amdgpu_device_ip_late_init failed
> >>> [ 7.014224][ 7] [ T295] amdgpu 0000:04:00.0: Fatal error during GPU init
> >> Memory barriers are not supposed to be sprinkled around like this; you
> >> need to give a detailed explanation of why this is necessary.
> >>
> >> Regards,
> >> Christian.
> >>
> >>> Signed-off-by: Zhenneng Li <lizhenneng@xxxxxxxxxx>
> >>> ---
> >>>  drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
> >>>  1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>> b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>> index 8f994ffa9cd1..c7656f22278d 100644
> >>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev)
> >>>  	u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
> >>>  	u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
> >>> +	mb();
> >>> +
> >>>  	if (!(rst & RST_REG) && !(clk & CK_DISABLE))
> >>>  		return true;
> > In particular, it makes no sense in this specific place, since it cannot
> > directly affect the values of rst & clk.
>
> I think so too.
>
> But when I run the reboot test on nine desktop machines, one or two of
> them may report this error after hundreds or thousands of reboots. At
> first I used msleep() instead of mb(); both methods work, but I don't
> know what the root cause is.
>
> I tried this method on another vendor's Oland card, and this error
> message was reported again.
>
> What could be the root cause?
>
> Test environment:
>
> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
>
> driver: amdgpu
>
> os: Ubuntu 20.04
>
> platform: arm64
>
> kernel: 5.4.18
>
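For reference, the observation in the quoted thread that msleep() hides the failure as well as mb() would be consistent with a timing problem rather than an ordering one, although the thread does not establish the root cause. A minimal user-space sketch of the alternative to sampling the registers once, namely a bounded poll, might look like the following. Everything here is an assumption for illustration: `struct fake_smc`, `fake_read_rst()`, `fake_read_clk()` and `smc_wait_running()` are hypothetical stand-ins (not driver functions), and the bit masks are placeholder values rather than the definitions from the driver headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit masks, assumed for this sketch only. */
#define RST_REG    (1u << 0)
#define CK_DISABLE (1u << 0)

/* Hypothetical simulated hardware: the SMC leaves reset after N reads. */
struct fake_smc {
    int reads_until_ready;
};

/* Stand-in for reading the reset control register. */
static uint32_t fake_read_rst(struct fake_smc *smc)
{
    if (smc->reads_until_ready > 0) {
        smc->reads_until_ready--;
        return RST_REG;          /* still held in reset */
    }
    return 0;                    /* reset released */
}

/* Stand-in for reading the clock control register. */
static uint32_t fake_read_clk(struct fake_smc *smc)
{
    (void)smc;
    return 0;                    /* clock not disabled */
}

/* Same condition as in the quoted hunk, on the simulated registers. */
static bool smc_is_running(struct fake_smc *smc)
{
    uint32_t rst = fake_read_rst(smc);
    uint32_t clk = fake_read_clk(smc);

    return !(rst & RST_REG) && !(clk & CK_DISABLE);
}

/*
 * Bounded poll: re-check the state up to max_tries times.
 * Returns the number of attempts used (>= 1), or -1 on timeout.
 * A kernel version would sleep or delay between attempts.
 */
static int smc_wait_running(struct fake_smc *smc, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (smc_is_running(smc))
            return i + 1;
    }
    return -1;
}
```

In this sketch a single call to smc_is_running() can return false simply because it was made too early, whereas the bounded poll succeeds once the simulated SMC leaves reset. Whether the real hardware behaves this way is exactly the open question in the thread.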
<<attachment: winmail.dat>>