Hi Mario, On 12/6/22 00:02, Limonciello, Mario wrote: > [Public] > > > >> -----Original Message----- >> From: Hans de Goede <hdegoede@xxxxxxxxxx> >> Sent: Thursday, November 17, 2022 10:09 >> To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>; S-k, Shyam-sundar >> <Shyam-sundar.S-k@xxxxxxx> >> Cc: Mahapatra, Rajib <Rajib.Mahapatra@xxxxxxx>; Raul Rangel >> <rrangel@xxxxxxxxxxxx>; Mark Gross <markgross@xxxxxxxxxx>; platform- >> driver-x86@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx >> Subject: Re: [PATCH] platform/x86/amd: pmc: Add a workaround for an s0i3 >> issue on Cezanne >> >> Hi, >> >> On 11/17/22 17:06, Limonciello, Mario wrote: >>> [Public] >>> >>> >>> >>>> -----Original Message----- >>>> From: Hans de Goede <hdegoede@xxxxxxxxxx> >>>> Sent: Thursday, November 17, 2022 08:06 >>>> To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>; S-k, Shyam-sundar >>>> <Shyam-sundar.S-k@xxxxxxx> >>>> Cc: Mahapatra, Rajib <Rajib.Mahapatra@xxxxxxx>; Raul Rangel >>>> <rrangel@xxxxxxxxxxxx>; Mark Gross <markgross@xxxxxxxxxx>; >> platform- >>>> driver-x86@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx >>>> Subject: Re: [PATCH] platform/x86/amd: pmc: Add a workaround for an >> s0i3 >>>> issue on Cezanne >>>> >>>> Hi Mario, >>>> >>>> On 11/16/22 16:43, Mario Limonciello wrote: >>>>> Cezanne platforms under the right circumstances have a synchronization >>>>> problem where attempting to enter s2idle may fail if the x86 cores are >>>>> put into HLT before hardware resume from the previous attempt has >>>>> completed. >>>>> >>>>> To avoid this issue add a 10-20ms delay before entering s2idle another >>>>> time. This workaround will only be applied on interrupts that wake the >>>>> hardware but don't break the s2idle loop. >>>>> >>>>> Cc: "Mahapatra, Rajib" <Rajib.Mahapatra@xxxxxxx> >>>>> Cc: "Raul Rangel" <rrangel@xxxxxxxxxxxx> >>>>> Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx> >>>> >>>> Thank you for your patch, I've applied this patch to my review-hans >>>> branch: >>>> >> https://git.k %2F&data=05%7C01%7Cmario.limonciello%40amd.com%7Cb3c04b4449 >> 154cad4f8208dac8b61509%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C >> 0%7C638042981632459900%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLj >> AwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C% >> 7C%7C&sdata=3ZzdcI0BsHknBInf8V4MfrmNCkkc2U9ygYf4IP25LJ4%3D& >> amp;reserved=0 >> ernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Fpdx86%2Fplatform- >>>> drivers-x86.git%2Flog%2F%3Fh%3Dreview- >>>> >> hans&data=05%7C01%7Cmario.limonciello%40amd.com%7C674f8bf7a8 >>>> >> 114f83a3b408dac8a4d941%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C >>>> >> 0%7C638042907591739047%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLj >>>> >> AwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C% >>>> >> 7C%7C&sdata=XYwl%2FOvUFy%2Bgz9EY9oa35M%2BkLf%2Bud8PKXynQ >>>> FlrUdoE%3D&reserved=0 >>>> >>>> Please let me know if it important to get this as a fix into 6.1, >>>> I wasn't really planning on doing any more fixes pull-reqs for 6.1, >>>> but I can do one if necessary. >>>> >>> >>> AFAIK it's a corner case. I think it can wait until 6.2, but I think it should >> probably >>> be Cc to 6.1 stable (which has the ability to run code in the check()) phase. >> >> Ok, I have added a: >> >> Cc: stable@xxxxxxxxxxxxxxx # 6.1 >> >> to the commit msg. >> >> Regards, >> >> Hans > > Hi Hans, > > I just wanted to update you on this workaround. Previously it was believed to > only be a very specific set of circumstances that happened on chromebooks running > coreboot and an EC running cros_ec being utilized with unfortunate timing. > > However it turns out that it can be "relatively" easily reproduced on UEFI machines > as well though by suspending the laptop and then issuing anything that causes an > ACPI event that otherwise shouldn't break the s2idle loop (such as closing the lid or > unplugging the power adapter). > > What will happen is that the SOC enters the deepest state up until the time of that > ACPI event and then never enters again. The most common case this will break I think > is someone suspends the laptop in GNOME, closes the lid and then tosses it in their bag. > If you examine /sys/kernel/debug/amd_pmc/* you'll see that the duration of time in > deepest state matches the time between suspending in GNOME and closing the lid. > > I wanted to provide you that context to decide if this should still try to catch this in > a 6.1 final pull request or not. Had I known how widely this helped at that time I > would have advocated accordingly. Based on the above I have prepared a pull-req to Linus with just this single patch in it to get this added to 6.1 . I'll Cc you on the pull-req submission to Linus. Regards, Hans >>>>> --- >>>>> drivers/platform/x86/amd/pmc.c | 6 ++++++ >>>>> 1 file changed, 6 insertions(+) >>>>> >>>>> diff --git a/drivers/platform/x86/amd/pmc.c >>>> b/drivers/platform/x86/amd/pmc.c >>>>> index ef4ae977b8e0..439d282aafd1 100644 >>>>> --- a/drivers/platform/x86/amd/pmc.c >>>>> +++ b/drivers/platform/x86/amd/pmc.c >>>>> @@ -739,8 +739,14 @@ static void amd_pmc_s2idle_prepare(void) >>>>> static void amd_pmc_s2idle_check(void) >>>>> { >>>>> struct amd_pmc_dev *pdev = &pmc; >>>>> + struct smu_metrics table; >>>>> int rc; >>>>> >>>>> + /* CZN: Ensure that future s0i3 entry attempts at least 10ms passed >>>> */ >>>>> + if (pdev->cpu_id == AMD_CPU_ID_CZN && >>>> !get_metrics_table(pdev, &table) && >>>>> + table.s0i3_last_entry_status) >>>>> + usleep_range(10000, 20000); >>>>> + >>>>> /* Dump the IdleMask before we add to the STB */ >>>>> amd_pmc_idlemask_read(pdev, pdev->dev, NULL); >>>>> >>> >