Re: [PATCH v4] PCI/ASPM: Add back L1 PM Substate save and restore

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On Thu, Oct 05, 2023 at 12:22:26PM -0500, Bjorn Helgaas wrote:
> On Thu, Oct 05, 2023 at 05:57:52PM +0200, Thomas Witt wrote:
> > On 05/10/2023 17:30, Bjorn Helgaas wrote:
> > ...
> 
> > > Right, without the denylist, I expect Thomas' TUXEDO to fail, but I
> > > still hope we can figure out why.  If we just keep it on the denylist,
> > > that system will suffer from more power consumption than necessary,
> > > but only after suspend/resume.
> > > 
> > > A denylist seems like the absolute last resort.  In this case we don't
> > > know about anything *wrong* with those platforms; all we know is that
> > > our resume path doesn't work.  It's likely that it fails on other
> > > platforms we haven't heard about, too.
> > 
> > The best guess from Mika and David was a firmware issue, but I run the same
> > Firmware revision as Werner. I even reflashed the Firmware, but that did not
> > change anything:
> 
> Possibly a BIOS settings difference?
> 
> > Quoting David Box:
> > > I agree that we should pursue an exception for your system. This is
> > > looking like a firmware bug. One thing we did notice in the turbostat
> > > results is your IRTL (Interrupt Response Time Limit) values are bogus:
> > >
> > > cpu6: MSR_PKGC3_IRTL: 0x0000884e (valid, 79872 ns)
> > > cpu6: MSR_PKGC6_IRTL: 0x00008000 (valid, 0 ns)
> > > cpu6: MSR_PKGC7_IRTL: 0x00008000 (valid, 0 ns)
> > > cpu6: MSR_PKGC8_IRTL: 0x00008000 (valid, 0 ns)
> > > cpu6: MSR_PKGC9_IRTL: 0x00008000 (valid, 0 ns)
> > > cpu6: MSR_PKGC10_IRTL: 0x00008000 (valid, 0 ns)
> > >
> > > This is despite the PKGC configuration register showing that all
> > > states are enabled:
> > >
> > > cpu6: MSR_PKG_CST_CONFIG_CONTROL: 0x1e008008 (UNdemote-C3, UNdemote-C1,
> > demote-
> > C3, demote-C1, locked, pkg-cstate-limit=8 (unlimited))
> > >
> > > Firmware sets this.
> 
> I can't find this discussion, but if there's a firmware issue related
> to IRTL MSRs, I would want the workaround in intel-idle.c or whatever
> code deals with the MSRs, not in the ASPM code.

Unfortunately that discussion never ended up on the mailing list. But in
summary that particular system seems to run pretty hot (if I understand
correctly what David concluded). This is the reason Thomas has the
pm_sleep=deep in the command line and this is why the L1 SS restore then
causes the failure on resume. Without this it works fine but consumes
lot of energy in s2idle.

Can you suggest what we should do with this now?

We got report from Tasev Nikola on
https://bugzilla.kernel.org/show_bug.cgi?id=216782 that even if the Asus
system is removed from the denylist it works so that we can do. However,
with the Thomas' system I'm not sure. If we leave it like this:

static int aspm_l1ss_suspend_via_firmware(const struct dmi_system_id *not_used)
{
        return pm_suspend_via_firmware();
}

static const struct dmi_system_id aspm_l1ss_denylist[] = {
        {
                /*
                 * https://bugzilla.kernel.org/show_bug.cgi?id=216877
                 *
                 * This system needs to use suspend to mem instead of its 
                 * default (suspend to idle) to avoid draining the battery. 
                 * However, the BIOS gets confused if we try to restore the
                 * L1SS registers so avoid doing that if the user forced
                 * suspend to mem. The default suspend to idle on the other
                 * hand needs restoring L1SS to allow the CPU to enter low
                 * power states. This entry should handle both.
                 */
                .callback = aspm_l1ss_suspend_via_firmware,
                .ident = "TUXEDO InfinityBook S 14 v5",
                .matches = {
                        DMI_MATCH(DMI_BOARD_VENDOR, "TUXEDO"),
                        DMI_MATCH(DMI_BOARD_NAME, "L140CU"),
                },
        },
        { }
};

Then it should work with Thomas' system (defaults to deep), and TUXEDO
with default settings (defaults to s2idle), and with the rest of the
world (I hope at least, fingers crossed). Or you want to pursue a
solution without the denylist still? I'm out of ideas what could be
wrong except that when the pm_sleep=deep it means the BIOS is involved
in the suspend/restore of the devices and it may not expect the OS to
touch these registers or something along those lines.



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux