On Mon, Feb 21, 2022 at 7:42 PM Limonciello, Mario <Mario.Limonciello@xxxxxxx> wrote:
>
> [AMD Official Use Only]
>
> > > > Attached is another patch to try, testing the hypothesis that the
> > > > observed crash is related to CPUs being in idle state that are too
> > > > deep for some reason during late suspend and early resume.
> > >
> > > I tried 3 test kernels:
> > > * 5.17-rc4 + Your second debugging patch
> > > * 5.17-rc4 + Your first debugging patch
> > > * 5.17-rc4 + A hack I wrote that pushed amd-pmc into "later" in the suspend
> > >   using a global symbol called after LPS0 instead of letting it run in noirq stage
> > >
> > > It works properly on all of those, tried about 5x time in each.
> > >
> > > Then I confirmed I could still crash it on 5.17-rc4 with my control kernel.
> >
> > I would do something like the attached patch, then (provided that it works).
>
> I got a variation of this to work. Let me clean it up some, do some more testing and I'll send
> it out to review.

OK

> Long term - are you opposed to drivers/acpi/x86/s2idle.c moving to drivers/platform/x86/?

It is tied to the code in sleep.c, so I'd rather not move it.

> I'd really like the stuff amd-pmc does to be a callback after lps0 (which is closer to how it works
> on Windows - it's the very last thing).

I see.

A notifier-based driver interface to be invoked from s2idle.c should work for that.

> I feel like keeping the stuff it does as noirq is generally fragile, and I want to avoid this kind
> of breakage.

Sure.
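
To make the notifier idea a bit more concrete, here is a rough userspace model of what such an interface could look like. This is only a sketch: every name in it (lps0_notifier, lps0_register_notifier(), lps0_call_notifiers(), the *_sketch functions, pmc_notify()) is hypothetical and is not taken from the kernel tree or from any posted patch. It only shows the shape of the idea: a driver registers a callback, and the s2idle code invokes the chain after the LPS0 entry work and before the LPS0 exit work.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum lps0_event {
	LPS0_ENTRY_DONE,	/* fired after the (modelled) LPS0 entry work */
	LPS0_EXIT_BEGIN,	/* fired before the (modelled) LPS0 exit work */
};

struct lps0_notifier {
	void (*notify)(enum lps0_event event, void *data);
	void *data;
	struct lps0_notifier *next;
};

static struct lps0_notifier *lps0_chain;

/* Append at the tail so callbacks run in registration order. */
static void lps0_register_notifier(struct lps0_notifier *nb)
{
	struct lps0_notifier **p = &lps0_chain;

	while (*p)
		p = &(*p)->next;
	nb->next = NULL;
	*p = nb;
}

/* Unlink nb from the chain; does nothing if it was never registered. */
static void lps0_unregister_notifier(struct lps0_notifier *nb)
{
	struct lps0_notifier **p = &lps0_chain;

	while (*p && *p != nb)
		p = &(*p)->next;
	if (*p)
		*p = nb->next;
}

/* Invoke every registered callback; returns how many were called. */
static int lps0_call_notifiers(enum lps0_event event)
{
	struct lps0_notifier *nb;
	int called = 0;

	for (nb = lps0_chain; nb; nb = nb->next) {
		nb->notify(event, nb->data);
		called++;
	}
	return called;
}

/* Stand-in for the suspend path in s2idle.c (assumed ordering). */
static void s2idle_prepare_late_sketch(void)
{
	/* ... the LPS0 entry work would run here ... */
	lps0_call_notifiers(LPS0_ENTRY_DONE);	/* drivers run last */
}

/* Stand-in for the resume path in s2idle.c (assumed ordering). */
static void s2idle_restore_early_sketch(void)
{
	lps0_call_notifiers(LPS0_EXIT_BEGIN);	/* drivers run first */
	/* ... the LPS0 exit work would run here ... */
}

/* Example driver side, standing in for what amd-pmc might do. */
struct pmc_state {
	int entered;
	int exited;
};

static void pmc_notify(enum lps0_event event, void *data)
{
	struct pmc_state *s = data;

	if (event == LPS0_ENTRY_DONE)
		s->entered++;
	else
		s->exited++;
}
```

A driver would fill in a struct lps0_notifier with pmc_notify() and its state, register it once at probe time, and unregister it on removal. A real in-kernel version would also need locking around the chain, which is left out here.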