Re: [PATCH 0/3] On AMD platforms only offer s2idle w/ proper FW

On 1/11/2022 12:30, Deucher, Alexander wrote:
[Public]

-----Original Message-----
From: Rafael J. Wysocki <rafael@xxxxxxxxxx>
Sent: Tuesday, January 11, 2022 12:45 PM
To: Deucher, Alexander <Alexander.Deucher@xxxxxxx>
Cc: Rafael J. Wysocki <rafael@xxxxxxxxxx>; Limonciello, Mario
<Mario.Limonciello@xxxxxxx>; Rafael J . Wysocki <rjw@xxxxxxxxxxxxx>;
ACPI Devel Maling List <linux-acpi@xxxxxxxxxxxxxxx>; S-k, Shyam-sundar
<Shyam-sundar.S-k@xxxxxxx>; Natikar, Basavaraj
<Basavaraj.Natikar@xxxxxxx>; bjoern.daase@xxxxxxxxx
Subject: Re: [PATCH 0/3] On AMD platforms only offer s2idle w/ proper FW

On Tue, Jan 11, 2022 at 6:32 PM Deucher, Alexander
<Alexander.Deucher@xxxxxxx> wrote:

[AMD Official Use Only]

-----Original Message-----
From: Rafael J. Wysocki <rafael@xxxxxxxxxx>
Sent: Tuesday, January 11, 2022 12:06 PM
To: Limonciello, Mario <Mario.Limonciello@xxxxxxx>
Cc: Rafael J. Wysocki <rafael@xxxxxxxxxx>; Rafael J . Wysocki
<rjw@xxxxxxxxxxxxx>; ACPI Devel Maling List
<linux-acpi@xxxxxxxxxxxxxxx>; S-k, Shyam-sundar
<Shyam-sundar.S-k@xxxxxxx>; Natikar, Basavaraj
<Basavaraj.Natikar@xxxxxxx>; Deucher, Alexander
<Alexander.Deucher@xxxxxxx>; bjoern.daase@xxxxxxxxx
Subject: Re: [PATCH 0/3] On AMD platforms only offer s2idle w/
proper FW

On Tue, Jan 11, 2022 at 5:23 PM Limonciello, Mario
<mario.limonciello@xxxxxxx> wrote:

+Alex

On 1/11/2022 09:52, Rafael J. Wysocki wrote:
On Wed, Jan 5, 2022 at 8:39 PM Mario Limonciello
<mario.limonciello@xxxxxxx> wrote:

Currently the Linux kernel will offer s2idle regardless of
whether the FADT indicates the system should use it or, on x86,
whether the LPS0 ACPI device has been activated.

On some non-AMD platforms s2idle can be offered even without
proper firmware support.  The power consumption may be higher in
these instances but the system otherwise properly suspends and
resumes.

Well, the idea is that s2idle should not require FW support at all.


May I ask - why?  It's an intentional design decision?

Yes, it is.

It may not be possible to reach the minimum power level of the
platform without FW support, but that should not prevent s2idle
from being used.

On AMD platforms, however, when the FW has been configured not to
offer s2idle some different hardware initialization has
occurred such that the system won't properly resume.

That's rather unfortunate.

Can you please share some details on what's going on in those cases?

Technically, without FW support there should be no difference
between the platform state reachable via s2idle and the platform
state reachable via runtime idle.

During resume there are a number of page faults, and
during initialization the ring tests fail.  The graphics are
unusable at this time as a result.

The amdgpu code actually *does* distinguish between the 3
different cases of S3, S0ix, and runtime suspend.

But s2idle doesn't guarantee S0ix in any case.

The function "amdgpu_acpi_is_s0ix_active" causes different
codepaths to be used during the suspend routine.

Well, as I said, s2idle need not mean S0ix.

In this particular case the FADT doesn't set the low power idle
bit and that function returns false, meaning the S3 codepath is
taken but the hardware didn't go through a reset.

If there is a separate S3 code path, taking it when
pm_suspend_target_state == PM_SUSPEND_TO_IDLE is incorrect.

It *might* also be possible to solve this by mandating an ASIC
reset in such a case (we didn't try).

I'd rather do a PM-runtime path equivalent if the target sleep state
is PM_SUSPEND_TO_IDLE and there is no FW support for S0ix.
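The selection being suggested here can be sketched as a small userspace model. This is only an illustration under stated assumptions, not amdgpu code: the enums and the function pick_suspend_path() are hypothetical names invented for this sketch, and the fw_supports_s0ix flag stands in for whatever the driver would actually check (for example the FADT low power idle bit mentioned above).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's suspend targets. */
enum sleep_target {
    TARGET_S2IDLE,  /* models PM_SUSPEND_TO_IDLE */
    TARGET_S3       /* models suspend-to-RAM */
};

/* Hypothetical names for the three driver paths discussed in the thread. */
enum gpu_suspend_path {
    PATH_S3_REINIT,     /* full re-initialization at resume */
    PATH_S0IX,          /* firmware largely handles GPU state */
    PATH_RUNTIME_LIKE   /* PM-runtime equivalent, no FW help assumed */
};

/*
 * Pick a path from the sleep target and firmware support.
 * The S3 path is only ever chosen when the target really is S3,
 * so s2idle without firmware support never falls through to it.
 */
enum gpu_suspend_path pick_suspend_path(enum sleep_target target,
                                        bool fw_supports_s0ix)
{
    if (target == TARGET_S3)
        return PATH_S3_REINIT;
    if (fw_supports_s0ix)
        return PATH_S0IX;
    return PATH_RUNTIME_LIKE;
}
```

Whether a runtime-like path is usable on a given device is a separate question that this model does not answer.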

However it comes back to my first upleveled question - is this a
case we really want to support and encourage?  This type of bug
and combination of codepaths is not a case that is going to be well
tested.
This patch series will align the kernel behavior to only what AMD
validates.

But this does not follow the definition of s2idle and its documentation.

At least for devices integrated into the SoC, the power rails are controlled
by the firmware in the SoC.  For S3, the power rails are cut by the FW when
the platform enters S3.  For S0ix, the power rails are cut when all of the
devices on the rail are suspended and various conditions are met.  Also, in the
case of some devices, the device has to be in a very specific state for s0ix to
work properly.  The GPU is the big one here.  For S3, the entire GPU has to
be re-initialized at resume.  For S0ix, the GPU's state is largely handled by the
firmware and attempting to re-initialize it won't work unless you reset it.
Integrated AMD graphics don't support runtime power management, only
dGPUs do.  For integrated graphics the firmware dynamically controls the
power at runtime so there is no need to do anything special for runtime pm.
For dGPUs we support d3cold either via ACPI on platforms like all-in-ones
and laptops or via a driver initiated sequence for add-in-cards.

What does s2idle ultimately do when all devices have suspended?  Does it
enter S0ix or S3 at the end when it wants to ultimately suspend the platform,
or is the assumption that if all devices have suspended, then that is equivalent
to S0ix or S3?  For AMD platforms, either S3 or S0ix needs to be entered for
the platform to actually power down most of the power rails.  It's not clear to
me what we should do for s2idle.

s2idle will never enter S3, because that requires platform support and
generally is a different thing (e.g. some devices need special upfront
preparation for S3, wakeup may need to be configured in a special way, etc.).

It will attempt to enter S0ix if possible, but otherwise it will just put CPUs into
the deepest available idle state and stay there until an interrupt (or
equivalent) triggers.
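As a rough illustration of the behavior described above, the end state of s2idle can be modeled as follows. This is a sketch only: the type and function names are hypothetical, and the real kernel logic is considerably more involved. The one fixed point, per the description, is that S3 is never among the outcomes and the ACPI system state stays S0.

```c
#include <stdbool.h>

/* Hypothetical labels for where the platform ends up during s2idle. */
enum s2idle_residency {
    RESIDENCY_S0IX,          /* low power idle reached with FW support */
    RESIDENCY_CPU_DEEP_IDLE  /* CPUs parked in deepest available idle state */
};

struct s2idle_outcome {
    enum s2idle_residency residency;
    int acpi_system_state;   /* 0 means S0; there is no S3 outcome */
};

/* Model of the final state: try S0ix, otherwise just idle the CPUs. */
struct s2idle_outcome s2idle_outcome(bool s0ix_possible)
{
    struct s2idle_outcome out;

    out.residency = s0ix_possible ? RESIDENCY_S0IX
                                  : RESIDENCY_CPU_DEEP_IDLE;
    out.acpi_system_state = 0;  /* always S0 in this model */
    return out;
}
```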

Physically, at a device level, s2idle is more similar to runtime suspend than to
S3, but it uses system-wide suspend callbacks and it requires wakeup to be
disabled for the devices that are not allowed to wake up the system by user
space, that is, those for which device_may_wakeup() returns false (PM-runtime
assumes wakeup to always be enabled for the suspended devices that can signal
wakeup).
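The wakeup difference between the two cases can be put into a short userspace model. It is a sketch under assumptions: struct model_device and wakeup_armed() are hypothetical names, and the two booleans only approximate what the kernel tracks (hardware wakeup capability, and the user-space policy that device_may_wakeup() reflects).

```c
#include <stdbool.h>

/* Hypothetical model of the wakeup-related state of one device. */
struct model_device {
    bool can_signal_wakeup;  /* hardware is able to signal wakeup */
    bool may_wakeup;         /* user space allows system wakeup */
};

/*
 * Should wakeup be armed while the device is suspended?
 * System-wide suspend (s2idle included) honors the user-space
 * setting; runtime suspend arms wakeup whenever the device can
 * signal it.
 */
bool wakeup_armed(const struct model_device *dev, bool system_wide_suspend)
{
    if (!dev->can_signal_wakeup)
        return false;
    if (system_wide_suspend)
        return dev->may_wakeup;
    return true;
}
```

So the same device may have wakeup armed while runtime-suspended but disarmed during s2idle, even though its physical state is similar in both cases.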

In any case, the ACPI system state for s2idle is always S0.

Thanks.  In that case, it's kind of pointless on AMD platforms then since the power rails will never be turned off for most devices on the system.  Does it even make sense to expose it?  It just gives users a false sense of suspend and then they will probably complain that it uses too much power when in s2idle.

Alex

That's why I thought my patch series made sense.

I guess another (antisocial) response would be to return false when the suspend callback for amdgpu happens and dev_err mentioning that firmware support is needed for suspend.


