[Public] > -----Original Message----- > From: Wentland, Harry <Harry.Wentland@xxxxxxx> > Sent: Wednesday, January 5, 2022 15:26 > To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>; amd- > gfx@xxxxxxxxxxxxxxxxxxxxx > Cc: Zhuo, Qingqing (Lillian) <Qingqing.Zhuo@xxxxxxx>; Scott Bruce > <smbruce@xxxxxxxxx>; Chris Hixon <linux-kernel-bugs@xxxxxxxxxxxxx>; > spasswolf@xxxxxx > Subject: Re: [PATCH v2] drm/amd/display: explicitly update clocks when DC is set > to D3 in s0i3 > > On 2022-01-05 12:06, Mario Limonciello wrote: > > The WA from commit 5965280abd30 ("drm/amd/display: Apply w/a for > > hard hang on HPD") causes a regression in s0ix where the system will > > fail to resume properly. This may be because an HPD was active the last > > time clocks were updated but clocks didn't get updated again during s0ix. > > > > So add an extra call to update clocks as part of the suspend routine: > > dm_suspend->dc_set_power_state->clk_mgr_optimize_pwr_state > > > > In case HPD is set during this time, also check if the call happened during > > suspend to allow overriding the WA. > > > > Cc: Qingqing Zhuo <qingqing.zhuo@xxxxxxx> > > Reported-by: Scott Bruce <smbruce@xxxxxxxxx> > > Reported-by: Chris Hixon <linux-kernel-bugs@xxxxxxxxxxxxx> > > Reported-by: spasswolf@xxxxxx > > BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215436 > > BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1821 > > BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1852 > > Fixes: 5965280abd30 ("drm/amd/display: Apply w/a for hard hang on HPD") > > Fixes: 1bd3bc745e7f ("drm/amd/display: Extend w/a for hard hang on HPD to > dcn20") > > Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx> > > --- > > changes from v1->v2: > > * Add fallthrough statement > > * Extend case to check if call was explicitly in s0ix since #1852 showed > hpd_state > > can be set at this time too > > * Adjust commit message and title > > * Add extra commit and bug fixed to metadata > > drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 5 ++++- > > drivers/gpu/drm/amd/display/dc/core/dc.c | 3 +++ > > 2 files changed, 7 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c > b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c > > index fbda42313bfe..fa5efe10b779 100644 > > --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c > > +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c > > @@ -23,6 +23,8 @@ > > * > > */ > > > > +#include "amdgpu.h" > > + > > #include "dccg.h" > > #include "clk_mgr_internal.h" > > > > @@ -126,6 +128,7 @@ static void rn_update_clocks(struct clk_mgr > *clk_mgr_base, > > bool safe_to_lower) > > { > > struct clk_mgr_internal *clk_mgr = > TO_CLK_MGR_INTERNAL(clk_mgr_base); > > + struct amdgpu_device *adev = clk_mgr_base->ctx->driver_context; > > DC code is shared on other OSes. driver_context won't be an instance > of amdgpu_device in those cases. > > If we need adev->in_s0ix we need to find a way to plumb it through > DC structs and interfaces. > OK. I'll see what that takes to do. I was hoping the original version would have sufficed but Chris (on CC) had reported that hpd_state was still active When this was called again during the suspend context. <snip> Jan 05 15:39:08.562580 kernel: kauditd_printk_skb: 35 callbacks suppressed Jan 05 15:43:54.575923 kernel: wlo1: deauthenticating from ZZ:ZZ:ZZ:ZZ:ZZ:ZZ by local choice (Reason: 3=DEAUTH_LEAVING) Jan 05 15:43:54.699223 kernel: amdgpu 0000:04:00.0: amdgpu: [rn_update_clocks] display count: 0, in_s0ix: 0, hpd_state: 0 Jan 05 15:43:55.922674 kernel: amdgpu 0000:04:00.0: amdgpu: [rn_update_clocks] display count: 1, in_s0ix: 0, hpd_state: 1 Jan 05 15:43:55.979221 kernel: rfkill: input handler enabled Jan 05 15:43:57.712689 kernel: PM: suspend entry (s2idle) Jan 05 15:44:07.302069 kernel: Filesystems sync: 0.008 seconds Jan 05 15:44:07.302256 kernel: Freezing user space processes ... (elapsed 0.001 seconds) done. Jan 05 15:44:07.302306 kernel: OOM killer disabled. Jan 05 15:44:07.302344 kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. Jan 05 15:44:07.302378 kernel: printk: Suspending console(s) (use no_console_suspend to debug) Jan 05 15:44:07.302416 kernel: amdgpu 0000:04:00.0: amdgpu: [rn_update_clocks] display count: 0, in_s0ix: 1, hpd_state: 1 </snip> So I suppose the other question to ask though - why is that hpd interrupt still active? Shouldn't it have been suspended by then? @Zhuo, Qingqing (Lillian) Maybe some other code is needed to suspend it on the way down? It was for spasswolf@xxxxxx (which is why I had the simpler version of the patch for v1). > Harry > > > struct dc_clocks *new_clocks = &context->bw_ctx.bw.dcn.clk; > > struct dc *dc = clk_mgr_base->ctx->dc; > > int display_count; > > @@ -157,7 +160,7 @@ static void rn_update_clocks(struct clk_mgr > *clk_mgr_base, > > } > > > > /* if we can go lower, go lower */ > > - if (display_count == 0 && !hpd_state) { > > + if (display_count == 0 && (adev->in_s0ix || !hpd_state)) > { > > > rn_vbios_smu_set_dcn_low_power_state(clk_mgr, > DCN_PWR_STATE_LOW_POWER); > > /* update power state */ > > clk_mgr_base->clks.pwr_state = > DCN_PWR_STATE_LOW_POWER; > > diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c > b/drivers/gpu/drm/amd/display/dc/core/dc.c > > index 91c4874473d6..8e0c820f6892 100644 > > --- a/drivers/gpu/drm/amd/display/dc/core/dc.c > > +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c > > @@ -3299,6 +3299,9 @@ void dc_set_power_state( > > } > > > > break; > > + case DC_ACPI_CM_POWER_STATE_D3: > > + clk_mgr_optimize_pwr_state(dc, dc->clk_mgr); > > + fallthrough; > > default: > > ASSERT(dc->current_state->stream_count == 0); > > /* Zero out the current context so that on resume we start with