RE: [PATCH v2] drm/amd/display: explicitly update clocks when DC is set to D3 in s0i3

"Limonciello, Mario" <Mario.Limonciello@xxxxxxx> · Thu, 6 Jan 2022 13:46:27 +0000

[Public]

> -----Original Message-----
> From: Thorsten Leemhuis <regressions@xxxxxxxxxxxxx>
> Sent: Thursday, January 6, 2022 05:39
> To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>; amd-
> gfx@xxxxxxxxxxxxxxxxxxxxx
> Cc: Zhuo, Qingqing (Lillian) <Qingqing.Zhuo@xxxxxxx>; Scott Bruce
> <smbruce@xxxxxxxxx>; Chris Hixon <linux-kernel-bugs@xxxxxxxxxxxxx>;
> spasswolf@xxxxxx
> Subject: Re: [PATCH v2] drm/amd/display: explicitly update clocks when DC is set
> to D3 in s0i3
> 
> Hi, this is your Linux kernel regression tracker speaking.
> 
> On 05.01.22 18:06, Mario Limonciello wrote:
> > The WA from commit 5965280abd30 ("drm/amd/display: Apply w/a for
> > hard hang on HPD") causes a regression in s0ix where the system will
> > fail to resume properly.  This may be because an HPD was active the last
> > time clocks were updated but clocks didn't get updated again during s0ix.
> >
> > So add an extra call to update clocks as part of the suspend routine:
> > dm_suspend->dc_set_power_state->clk_mgr_optimize_pwr_state
> >
> > In case HPD is set during this time, also check if the call happened during
> > suspend to allow overriding the WA.
> 
> Thx for this. Our of interest: is that something that should still go
> into v5.16?

Not in v2.  There were review comments that led to creating a V3.  Once V3 is
reviewed if there is time remaining, yes.  If not, then it will be a great candidate
for 5.16.1.

> 
> > Cc: Qingqing Zhuo <qingqing.zhuo@xxxxxxx>
> > Reported-by: Scott Bruce <smbruce@xxxxxxxxx>
> > Reported-by: Chris Hixon <linux-kernel-bugs@xxxxxxxxxxxxx>
> > Reported-by: spasswolf@xxxxxx
> > BugLink:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.
> kernel.org%2Fshow_bug.cgi%3Fid%3D215436&amp;data=04%7C01%7Cmario.li
> monciello%40amd.com%7C024b889dd8af440abbb008d9d1091f27%7C3dd8961f
> e4884e608e11a82d994e183d%7C0%7C0%7C637770660416166334%7CUnknow
> n%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haW
> wiLCJXVCI6Mn0%3D%7C3000&amp;sdata=rzNDgWba05cL5Z6CTvlblJS96R%2F1
> 8Vh7ku0S%2FQTRbHQ%3D&amp;reserved=0
> > BugLink:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.fr
> eedesktop.org%2Fdrm%2Famd%2F-
> %2Fissues%2F1821&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7
> C024b889dd8af440abbb008d9d1091f27%7C3dd8961fe4884e608e11a82d994e1
> 83d%7C0%7C0%7C637770660416166334%7CUnknown%7CTWFpbGZsb3d8eyJ
> WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C
> 3000&amp;sdata=TQW1vagoGW1DdNvsTVFKlA7gdJVvWhnxZU6BZPn3MH4%3D
> &amp;reserved=0
> > BugLink:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.fr
> eedesktop.org%2Fdrm%2Famd%2F-
> %2Fissues%2F1852&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7
> C024b889dd8af440abbb008d9d1091f27%7C3dd8961fe4884e608e11a82d994e1
> 83d%7C0%7C0%7C637770660416166334%7CUnknown%7CTWFpbGZsb3d8eyJ
> WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C
> 3000&amp;sdata=N4C5PRnke4%2Bpswb3oAMhNsPq3AMLK5VuJG7Ht%2F1n0jk
> %3D&amp;reserved=0
> 
> What is "BugLink"? It's not mention in the kernel's documentation, which
> tells people to use "Link:" for linking to bug reports:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.ke
> rnel.org%2Fdoc%2Fhtml%2Flatest%2Fprocess%2Fsubmitting-
> patches.html&amp;data=04%7C01%7Cmario.limonciello%40amd.com%7C024b
> 889dd8af440abbb008d9d1091f27%7C3dd8961fe4884e608e11a82d994e183d%7
> C0%7C0%7C637770660416166334%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC
> 4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&a
> mp;sdata=GntDRw3WuCUZ%2Fys9uDAltqDs2zsBR4qQm3KdDA3VBKo%3D&amp;
> reserved=0
> 
> That is not new, bug recently was made more explicit.
> 
> Hence, please consider changing them to "Link:", there are tools out
> there that repy on them (I'm running such a tool, but there might be
> others out there we are not aware of).

Thanks for enlightening me.  If I need to spin for v4 I'll fix it up on next submission,
otherwise I'll fix it up v3 manually before it goes into amd-staging-drm-next.

> 
> FWIW, in principe I agree that we need a seperate tag for this. But then
> we should get this discussed and blessed through the usual channels.
> That's why I recently proposed "Reported:", without much success so far:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.ker
> nel.org%2Flkml%2Fc6724c7fb534a46e095e6aef13d996ed9589e578.163904296
> 6.git.linux%40leemhuis.info%2F&amp;data=04%7C01%7Cmario.limonciello%40
> amd.com%7C024b889dd8af440abbb008d9d1091f27%7C3dd8961fe4884e608e1
> 1a82d994e183d%7C0%7C0%7C637770660416166334%7CUnknown%7CTWFpb
> GZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6M
> n0%3D%7C3000&amp;sdata=oBGtPqXfzBcn%2FqUrH7hDQtMIXwIKBY6xh3Qqd0
> xkRjg%3D&amp;reserved=0

At least some distributions kernel teams use BugLink (presumably from their own
tools).

> 
> Ciao, Thorsten
> 
> > Fixes: 5965280abd30 ("drm/amd/display: Apply w/a for hard hang on HPD")
> > Fixes: 1bd3bc745e7f ("drm/amd/display: Extend w/a for hard hang on HPD to
> dcn20")
> > Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx>
> > ---
> > changes from v1->v2:
> >  * Add fallthrough statement
> >  * Extend case to check if call was explicitly in s0ix since #1852 showed
> hpd_state
> >    can be set at this time too
> >  * Adjust commit message and title
> >  * Add extra commit and bug fixed to metadata
> >  drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 5 ++++-
> >  drivers/gpu/drm/amd/display/dc/core/dc.c                  | 3 +++
> >  2 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
> b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
> > index fbda42313bfe..fa5efe10b779 100644
> > --- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
> > +++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
> > @@ -23,6 +23,8 @@
> >   *
> >   */
> >
> > +#include "amdgpu.h"
> > +
> >  #include "dccg.h"
> >  #include "clk_mgr_internal.h"
> >
> > @@ -126,6 +128,7 @@ static void rn_update_clocks(struct clk_mgr
> *clk_mgr_base,
> >  			bool safe_to_lower)
> >  {
> >  	struct clk_mgr_internal *clk_mgr =
> TO_CLK_MGR_INTERNAL(clk_mgr_base);
> > +	struct amdgpu_device *adev = clk_mgr_base->ctx->driver_context;
> >  	struct dc_clocks *new_clocks = &context->bw_ctx.bw.dcn.clk;
> >  	struct dc *dc = clk_mgr_base->ctx->dc;
> >  	int display_count;
> > @@ -157,7 +160,7 @@ static void rn_update_clocks(struct clk_mgr
> *clk_mgr_base,
> >  			}
> >
> >  			/* if we can go lower, go lower */
> > -			if (display_count == 0 && !hpd_state) {
> > +			if (display_count == 0 && (adev->in_s0ix || !hpd_state))
> {
> >
> 	rn_vbios_smu_set_dcn_low_power_state(clk_mgr,
> DCN_PWR_STATE_LOW_POWER);
> >  				/* update power state */
> >  				clk_mgr_base->clks.pwr_state =
> DCN_PWR_STATE_LOW_POWER;
> > diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c
> b/drivers/gpu/drm/amd/display/dc/core/dc.c
> > index 91c4874473d6..8e0c820f6892 100644
> > --- a/drivers/gpu/drm/amd/display/dc/core/dc.c
> > +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
> > @@ -3299,6 +3299,9 @@ void dc_set_power_state(
> >  		}
> >
> >  		break;
> > +	case DC_ACPI_CM_POWER_STATE_D3:
> > +		clk_mgr_optimize_pwr_state(dc, dc->clk_mgr);
> > +		fallthrough;
> >  	default:
> >  		ASSERT(dc->current_state->stream_count == 0);
> >  		/* Zero out the current context so that on resume we start with