On 01.06.2018 at 08:41, Huang Rui wrote:
> After defer the execution of gfx/compute ib tests. However, at that time, the
> gfx already go into "mid state" of gfxoff.
>
> PWR_MISC_CNTL_STATUS: PWR_GFXOFF_STATUS field (2:1 bits)
> 0 = GFXOFF.
> 1 = Transition out of GFXOFF state.
> 2 = Not in GFXOFF.
> 3 = Transition into GFXOFF.
>
> If hit the mid state (1 or 3), the doorbell writing interrupt cannot wake up the
> gfx back successfully. And the field value is 1 when we issue the ib test at
> that, so we got the hang. This is the root cause that we encountered the issue.
>
> Meanwhile, we cannot set clockgating of GFX after gfx is already in "off" state.
> So here we should move the gfx powergating and gfxoff enabling behavior at the
> end of initialization behind ib test and clockgating.

Mhm, that still looks like only a half-baked solution:

1. What prevents this bug from happening during "normal" IB submission from userspace?

2. Shouldn't we instead poll the PWR_MISC_CNTL_STATUS register to make sure we are not in any transition phase?

Regards,
Christian.
>
> Signed-off-by: Huang Rui <ray.huang at amd.com>
> Cc: Hawking Zhang <Hawking.Zhang at amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c        | 10 ++++++++++
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c             |  5 -----
>  drivers/gpu/drm/amd/powerplay/amd_powerplay.c     |  2 +-
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c |  4 ++--
>  4 files changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f509d32..e1c8806 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1723,6 +1723,16 @@ static int amdgpu_device_ip_late_set_cg_state(struct amdgpu_device *adev)
>  			}
>  		}
>  	}
> +
> +	if (adev->powerplay.pp_feature & PP_GFXOFF_MASK) {
> +		amdgpu_device_ip_set_powergating_state(adev,
> +						       AMD_IP_BLOCK_TYPE_GFX,
> +						       AMD_CG_STATE_GATE);
> +		amdgpu_device_ip_set_powergating_state(adev,
> +						       AMD_IP_BLOCK_TYPE_SMC,
> +						       AMD_CG_STATE_GATE);
> +	}
> +
>  	return 0;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> index 2c5e2a4..31ecc86 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> @@ -3358,11 +3358,6 @@ static int gfx_v9_0_late_init(void *handle)
>  	if (r)
>  		return r;
>
> -	r = amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_GFX,
> -						    AMD_PG_STATE_GATE);
> -	if (r)
> -		return r;
> -
>  	return 0;
>  }
>
> diff --git a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
> index b493369..d0e6e2d 100644
> --- a/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
> +++ b/drivers/gpu/drm/amd/powerplay/amd_powerplay.c
> @@ -245,7 +245,7 @@ static int pp_set_powergating_state(void *handle,
>  	}
>
>  	if (hwmgr->hwmgr_func->enable_per_cu_power_gating == NULL) {
> -		pr_info("%s was not implemented.\n", __func__);
> +		pr_debug("%s was not implemented.\n", __func__);
>  		return 0;
>  	}
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
> index 7712eb6..b72d089 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
> @@ -284,7 +284,7 @@ static int smu10_disable_gfx_off(struct pp_hwmgr *hwmgr)
>
>  static int smu10_disable_dpm_tasks(struct pp_hwmgr *hwmgr)
>  {
> -	return smu10_disable_gfx_off(hwmgr);
> +	return 0;
>  }
>
>  static int smu10_enable_gfx_off(struct pp_hwmgr *hwmgr)
> @@ -299,7 +299,7 @@ static int smu10_enable_gfx_off(struct pp_hwmgr *hwmgr)
>
>  static int smu10_enable_dpm_tasks(struct pp_hwmgr *hwmgr)
>  {
> -	return smu10_enable_gfx_off(hwmgr);
> +	return 0;
>  }
>
>  static int smu10_gfx_off_control(struct pp_hwmgr *hwmgr, bool enable)
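
To make the polling idea from point 2 of the reply concrete, here is a minimal sketch in plain user-space C. It is not amdgpu code. The field position (bits 2:1) and the four status values come from the quoted commit message. Everything else is an assumption made up for illustration: the read_reg_fn accessor, the fake_reg simulated register, the function names, and the retry bound. A real driver would read the register via MMIO and would insert a short delay between reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Field layout as described in the quoted commit message: bits 2:1. */
#define PWR_GFXOFF_STATUS_SHIFT 1
#define PWR_GFXOFF_STATUS_MASK  (0x3u << PWR_GFXOFF_STATUS_SHIFT)

/* Status values as listed in the quoted commit message. */
enum gfxoff_status {
	GFXOFF_IN_GFXOFF      = 0,
	GFXOFF_TRANSITION_OUT = 1,
	GFXOFF_NOT_IN_GFXOFF  = 2,
	GFXOFF_TRANSITION_IN  = 3,
};

/* Hypothetical accessor type: stands in for a real MMIO register read. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Extract the 2-bit status field from a raw register value. */
static enum gfxoff_status gfxoff_status_decode(uint32_t reg)
{
	return (enum gfxoff_status)((reg & PWR_GFXOFF_STATUS_MASK) >>
				    PWR_GFXOFF_STATUS_SHIFT);
}

/* True for the two "mid states" (1 and 3). */
static bool gfxoff_in_transition(enum gfxoff_status s)
{
	return s == GFXOFF_TRANSITION_OUT || s == GFXOFF_TRANSITION_IN;
}

/*
 * Read the register up to max_polls times until the field reports a
 * stable state (0 or 2). Returns 0 and stores the state in *out on
 * success, -1 if every read still showed a transition.
 */
static int gfxoff_wait_stable(read_reg_fn read_reg, void *ctx,
			      unsigned int max_polls, enum gfxoff_status *out)
{
	assert(read_reg != NULL);

	for (unsigned int i = 0; i < max_polls; i++) {
		enum gfxoff_status s = gfxoff_status_decode(read_reg(ctx));

		if (!gfxoff_in_transition(s)) {
			if (out)
				*out = s;
			return 0;
		}
		/* A real driver would wait briefly here before retrying. */
	}
	return -1;
}

/* Simulated register for illustration: replays a fixed value sequence. */
struct fake_reg {
	const uint32_t *values;
	size_t count;
	size_t pos;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_reg *r = ctx;
	size_t idx = r->pos < r->count ? r->pos : r->count - 1;

	if (r->pos < r->count)
		r->pos++;
	return r->values[idx]; /* repeats the last value once exhausted */
}
```

Note that this only addresses the second question. Waiting for a stable state does not by itself stop the hardware from entering a transition again right after the poll returns, which is the concern raised in the first question.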