On Fri, May 07, 2021 at 12:07:38AM +0200, Lukas Wunner wrote:
> On Thu, May 06, 2021 at 04:48:42PM -0500, Bjorn Helgaas wrote:
> > On Thu, May 06, 2021 at 08:38:20PM +0300, Konstantin Kharlamov wrote:
> > > On Macbook 2013 resuming from s2idle results in external monitor no
> > > longer being detected, and dmesg having errors like:
> > >
> > >     pcieport 0000:06:00.0: can't change power state from D3hot to D0 (config space inaccessible)
> > >
> > > and a stacktrace. The reason turned out that the hw that the quirk
> > > powers off does not get powered on back on resume.
> >
> > quirk_apple_poweroff_thunderbolt() was added in 2014 by 1df5172c5c25
> > ("PCI: Suspend/resume quirks for Apple thunderbolt").  It claims
> > "power is automatically restored before resume," so there must be
> > something special about s2idle that prevents the power-on.
>
> With s2idle, the machine isn't suspended via ACPI, so the AML code
> which powers the controller off isn't executed.  The dance to prepare
> the controller for power-off consequently isn't necessary but rather
> harmful.
>
> To get the same power savings as with ACPI suspend, the controller
> needs to be powered off via runtime suspend.  I posted patches for
> that back in 2016.  I'm using them on my laptop, they need some
> polishing and rebasing before I can repost them due to massive
> changes that have happened in the thunderbolt driver in the meantime.
> Without these patches, the controller sucks 1.5W of power in s2idle.
>
> > Obviously the *hardware* hasn't changed since 1df5172c5c25.  Is s2idle
> > something that wasn't tested back then, or is this problem connected
> > to an s2idle change since then?  Can we identify a commit that
> > introduced this problem?  That would help with backporting or stable
> > tags.
>
> Yes I believe the quirk predates the introduction of s2idle by a couple
> of years.

In an ideal world, we would know which commit introduced s2idle and
hence the possibility of hitting this bug, and we would add a Fixes:
tag for that commit so we could connect this fix with it.

Apart from that, what I don't like about this (and about the original
1df5172c5c25) is that there's no connection to a spec or to documented
behavior of the device or of suspend/resume.

For example, "With s2idle, the machine isn't suspended via ACPI, so
the AML code which powers the controller off isn't executed."  AFAICT
that isn't actually a required, documented property of s2idle, but
rather it reaches into the internal implementation.

The code comment "If suspend mode is s2idle, power won't get restored
on resume" is similar.  !pm_suspend_via_firmware() tells us that
platform firmware won't be invoked.  But the connection between *that*
and "power won't get restored" is unexplained.

> > > Signed-off-by: Konstantin Kharlamov <Hi-Angel@xxxxxxxxx>
>
> Reviewed-by: Lukas Wunner <lukas@xxxxxxxxx>

Thanks for looking at this!

Bjorn