[AMD Official Use Only - General]

Did you see that? It's a patch which I created with git-format-patch.
Anyway, I will paste the changes below.
I suspect we may need to wait for the SMU to start running.

diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
index 49c398ec0aaf..9f308a021b2d 100644
--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
@@ -6814,6 +6814,7 @@ static int si_dpm_enable(struct amdgpu_device *adev)
 	struct si_power_info *si_pi = si_get_pi(adev);
 	struct amdgpu_ps *boot_ps = adev->pm.dpm.boot_ps;
 	int ret;
+	int i;
 
 	if (amdgpu_si_is_smc_running(adev))
 		return -EINVAL;
@@ -6909,6 +6910,17 @@ static int si_dpm_enable(struct amdgpu_device *adev)
 	si_program_response_times(adev);
 	si_program_ds_registers(adev);
 	si_dpm_start_smc(adev);
+	/* Waiting for smc alive */
+	for (i = 0; i < adev->usec_timeout; i++) {
+		if (amdgpu_si_is_smc_running(adev))
+			break;
+		udelay(1);
+	}
+	if (i >= adev->usec_timeout) {
+		DRM_ERROR("Timedout on waiting for smu running\n");
+		return -EINVAL;
+	}
+
 	ret = si_notify_smc_display_change(adev, false);
 	if (ret) {
 		DRM_ERROR("si_notify_smc_display_change failed\n");

BR
Evan

> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>
> Sent: Thursday, November 24, 2022 6:06 PM
> To: Quan, Evan <Evan.Quan@xxxxxxx>; 李真能 <lizhenneng@xxxxxxxxxx>;
> Michel Dänzer <michel.daenzer@xxxxxxxxxxx>; Koenig, Christian
> <Christian.Koenig@xxxxxxx>; Deucher, Alexander
> <Alexander.Deucher@xxxxxxx>
> Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx; Pan, Xinhui <Xinhui.Pan@xxxxxxx>;
> linux-kernel@xxxxxxxxxxxxxxx; amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH] drm/amdgpu: add mb for si
>
> That's not a patch but some binary file?
>
> Christian.
>
> On 24.11.22 at 11:04, Quan, Evan wrote:
> > [AMD Official Use Only - General]
> >
> > Could the attached patch help?
> >
> > Evan
> >> -----Original Message-----
> >> From: amd-gfx <amd-gfx-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of ???
> >> Sent: Friday, November 18, 2022 5:25 PM
> >> To: Michel Dänzer <michel.daenzer@xxxxxxxxxxx>; Koenig, Christian
> >> <Christian.Koenig@xxxxxxx>; Deucher, Alexander
> >> <Alexander.Deucher@xxxxxxx>
> >> Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx; Pan, Xinhui <Xinhui.Pan@xxxxxxx>;
> >> linux-kernel@xxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx
> >> Subject: Re: [PATCH] drm/amdgpu: add mb for si
> >>
> >>
> >> On 2022/11/18 17:18, Michel Dänzer wrote:
> >>> On 11/18/22 09:01, Christian König wrote:
> >>>> On 18.11.22 at 08:48, Zhenneng Li wrote:
> >>>>> During reboot tests on an arm64 platform, boot may fail, so
> >>>>> add this mb in smc.
> >>>>>
> >>>>> The error messages are as follows:
> >>>>> [ 6.996395][ 7] [ T295] [drm:amdgpu_device_ip_late_init [amdgpu]] *ERROR* late_init of IP block <si_dpm> failed -22
> >>>>> [ 7.006919][ 7] [ T295] amdgpu 0000:04:00.0: amdgpu_device_ip_late_init failed
> >>>>> [ 7.014224][ 7] [ T295] amdgpu 0000:04:00.0: Fatal error during GPU init
> >>>> Memory barriers are not supposed to be sprinkled around like this,
> >>>> you need to give a detailed explanation why this is necessary.
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> Signed-off-by: Zhenneng Li <lizhenneng@xxxxxxxxxx>
> >>>>> ---
> >>>>>  drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c | 2 ++
> >>>>>  1 file changed, 2 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>> index 8f994ffa9cd1..c7656f22278d 100644
> >>>>> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_smc.c
> >>>>> @@ -155,6 +155,8 @@ bool amdgpu_si_is_smc_running(struct amdgpu_device *adev)
> >>>>>  	u32 rst = RREG32_SMC(SMC_SYSCON_RESET_CNTL);
> >>>>>  	u32 clk = RREG32_SMC(SMC_SYSCON_CLOCK_CNTL_0);
> >>>>> +	mb();
> >>>>> +
> >>>>>  	if (!(rst & RST_REG) && !(clk & CK_DISABLE))
> >>>>>  		return true;
> >>> In particular, it makes no sense in this specific place, since it
> >>> cannot directly affect the values of rst & clk.
> >>
> >> I think so too.
> >>
> >> But when I run the reboot test on nine desktop machines, this error
> >> may be reported on one or two machines after hundreds or thousands
> >> of reboots. At the beginning I used msleep() instead of mb(); both
> >> methods work, but I don't know what the root cause is.
> >>
> >> I used this method on another vendor's Oland card, and this error
> >> message was reported again.
> >>
> >> What could be the root cause?
> >>
> >> Test environment:
> >>
> >> graphics card: OLAND 0x1002:0x6611 0x1642:0x1869 0x87
> >>
> >> driver: amdgpu
> >>
> >> os: ubuntu 2004
> >>
> >> platform: arm64
> >>
> >> kernel: 5.4.18
> >>
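For illustration only, the bounded wait that the si_dpm_enable() change in this thread adds can be sketched in plain userspace C. Everything named here (wait_until_ready, ready_fn, delay_fn, struct fake_smc, fake_smc_running) is a hypothetical stand-in invented for this sketch, not a kernel or amdgpu API: the ready callback plays the role of amdgpu_si_is_smc_running(), the delay callback plays the role of udelay(1), and max_polls plays the role of adev->usec_timeout. The return value of -1 on timeout is likewise an assumption of the sketch; the pasted patch returns -EINVAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a status read such as amdgpu_si_is_smc_running(). */
typedef bool (*ready_fn)(void *ctx);

/* Hypothetical stand-in for a short delay such as udelay(1); may be NULL. */
typedef void (*delay_fn)(void *ctx);

/*
 * Poll ready() at most max_polls times, calling delay() between polls.
 * Returns 0 as soon as ready() reports true, or -1 if the poll budget
 * is exhausted first.
 */
static int wait_until_ready(ready_fn ready, delay_fn delay, void *ctx,
			    unsigned int max_polls)
{
	unsigned int i;

	assert(ready != NULL);

	for (i = 0; i < max_polls; i++) {
		if (ready(ctx))
			return 0;
		if (delay)
			delay(ctx);
	}

	return -1;
}

/* Simulated device that starts reporting "running" after a number of reads. */
struct fake_smc {
	unsigned int reads;       /* status reads performed so far */
	unsigned int ready_after; /* reads that must happen before it is ready */
};

static bool fake_smc_running(void *ctx)
{
	struct fake_smc *smc = ctx;

	return smc->reads++ >= smc->ready_after;
}
```

With a fake device that becomes ready after three reads, wait_until_ready(fake_smc_running, NULL, &smc, 10) succeeds on the fourth read; with one that needs 100 reads, the same call gives up after ten. A loop like this only bounds how long the caller waits and does not by itself explain why the SMC is late, which is the open question in the thread.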