On Wed, May 20, 2020 at 4:32 AM chen gong <curry.gong@xxxxxxx> wrote:
>
> [Problem description]
> 1. Boot the Picasso platform, launch the desktop, and do nothing (the APU enters the "gfxoff" state).
> 2. Log in to the platform remotely over SSH, then run:
>	sudo su -c "echo manual > /sys/class/drm/card0/device/power_dpm_force_performance_level"
>	sudo su -c "echo 2 > /sys/class/drm/card0/device/pp_dpm_sclk"  (fix SCLK to 1400MHz)
> 3. Move the mouse around in a window.
> 4. Symptom: the screen freezes.
>
> The tester switches the sclk level while glmark2 is running. The APU enters
> the "gfxoff" state intermittently during the glmark2 run. The system hangs
> if GFXCLK is fixed to 1400MHz while the APU is in the "gfxoff" state.
>
> [Debug]
> 1. Fix SCLK to X MHz
>	1400: screen frozen, screen black, then OS reboots.
>	1300: screen frozen.
>	1200: screen frozen, screen black.
>	1100: screen frozen, screen black, then OS reboots.
>	1000: screen frozen, screen black.
>	900:  screen frozen, screen black, then OS reboots.
>	800:  normal, issue disappears.
>	700:  normal, issue disappears.
> 2. SBIOS setting: AMD CBS --> SMU Debug Options --> SMU Debug --> "GFX DLDO Psm Margin Control":
>	50: normal, issue disappears.
>	45: normal, issue disappears.
>	40: normal, issue disappears.
>	35: normal, issue disappears.
>	30: screen black.
>	25: screen frozen, then blurred screen.
>	20: screen frozen.
>	15: screen black.
>	10: screen frozen.
>	5:  screen frozen, then blurred screen.
> 3. Disable the GFXOFF feature
>	normal, issue disappears.
>
> [Why]
> After a period of debugging with the Sys Eng team and the SMU team:
>
> The Sys Eng team said this is a voltage/frequency marginal issue, not a F/W or
> H/W bug. This experiment proves that the default targetPsm [for f=1400MHz]
> is not sufficient when GFXOFF is enabled on Picasso.
>
> The SMU team thinks it is an odd test condition to force sclk="1400MHz" when
> the GPU is in the "gfxoff" state and then wake up GFX. SCLK should be at the
> "lowest frequency" during gfxoff.
>
> [How]
> Disable gfxoff when setting manual mode.
>
> By the way, from the user's point of view, once the user switches to manual
> mode and forces the SCLK frequency, they do not want SCLK to be controlled by
> workload. "Switching to manual mode" becomes meaningless if the APU enters
> "gfxoff" due to lack of workload at this point.
>
> Tips: Same issue observed on Raven.
>
> Signed-off-by: chen gong <curry.gong@xxxxxxx>
> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
> index 4f8c1b8..602be63 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
> @@ -565,6 +565,7 @@ static int smu10_hwmgr_backend_fini(struct pp_hwmgr *hwmgr)
>  static int smu10_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
>  				enum amd_dpm_forced_level level)
>  {
> +	struct amdgpu_device *adev = hwmgr->adev;
>  	struct smu10_hwmgr *data = hwmgr->backend;
>  	uint32_t min_sclk = hwmgr->display_config->min_core_set_clock;
>  	uint32_t min_mclk = hwmgr->display_config->min_mem_set_clock/100;
> @@ -730,6 +731,11 @@ static int smu10_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
>  						NULL);
>  		break;
>  	case AMD_DPM_FORCED_LEVEL_MANUAL:
> +		if (adev->asic_type == CHIP_RAVEN){

Missing a space between ) and {

> +			if (adev->rev_id < 8)

Have you verified that raven2 variants don't need this?

> +				smu10_gfx_off_control(hwmgr, false);

Where are we re-enabling gfx off when we exit manual mode?
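
The re-enable concern can be illustrated outside the driver. The following is a standalone C model, not driver code and not a proposed fix: `fake_hwmgr`, `fake_gfx_off_control()` and `fake_force_dpm_level()` are hypothetical stand-ins, and the `needs_workaround` flag stands in for the asic_type/rev_id check in the patch. It also assumes that gfxoff should simply be switched back on when leaving manual mode, which may not hold if gfxoff was disabled for some other reason.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real amdgpu/powerplay types. */
enum fake_level {
	FAKE_LEVEL_AUTO,
	FAKE_LEVEL_LOW,
	FAKE_LEVEL_MANUAL,
};

struct fake_hwmgr {
	enum fake_level level;	/* level applied by the last call */
	bool gfxoff_enabled;	/* models what smu10_gfx_off_control() toggles */
	bool needs_workaround;	/* models the asic_type / rev_id check */
};

/* Models smu10_gfx_off_control(hwmgr, enable). */
static void fake_gfx_off_control(struct fake_hwmgr *hw, bool enable)
{
	hw->gfxoff_enabled = enable;
}

/* Models the level switch with symmetric gfxoff handling (assumed behavior). */
static int fake_force_dpm_level(struct fake_hwmgr *hw, enum fake_level next)
{
	assert(hw != NULL);

	if (hw->needs_workaround) {
		/* Leaving manual mode: hand gfxoff back to the SMU. */
		if (hw->level == FAKE_LEVEL_MANUAL && next != FAKE_LEVEL_MANUAL)
			fake_gfx_off_control(hw, true);

		/* Entering manual mode: keep GFX powered while SCLK is forced. */
		if (next == FAKE_LEVEL_MANUAL)
			fake_gfx_off_control(hw, false);
	}

	hw->level = next;
	return 0;
}
```

In this model, a switch from auto to manual clears `gfxoff_enabled`, and a later switch from manual to any other level sets it again. The patch as posted only covers the first half of that pair.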

Alex

> +		}
> +		break;
>  	case AMD_DPM_FORCED_LEVEL_PROFILE_EXIT:
>  	default:
>  		break;
> --
> 2.7.4
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx