Re: [PM] bfcc1e67ff: kernel-selftests.breakpoints.step_after_suspend_test.fail

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/20/21 9:00 AM, Rafael J. Wysocki wrote:
> On Wed, Oct 20, 2021 at 5:34 PM Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:
>>
>>
>>
>> On 10/20/2021 6:49 AM, Rafael J. Wysocki wrote:
>>> On Tue, Oct 19, 2021 at 9:04 PM Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:
>>>>
>>>> On 10/19/21 11:53 AM, Rafael J. Wysocki wrote:
>>>>> On 10/15/2021 9:40 PM, Florian Fainelli wrote:
>>>>>> On 10/15/21 11:45 AM, Rafael J. Wysocki wrote:
>>>>>>> On 10/14/2021 11:55 PM, Florian Fainelli wrote:
>>>>>>>> On 10/14/21 12:23 PM, Rafael J. Wysocki wrote:
>>>>>>>>> On 10/14/2021 6:26 PM, Florian Fainelli wrote:
>>>>>>>>>> On 10/14/21 12:57 AM, kernel test robot wrote:
>>>>>>>>>>> Greeting,
>>>>>>>>>>>
>>>>>>>>>>> FYI, we noticed the following commit (built with gcc-9):
>>>>>>>>>>>
>>>>>>>>>>> commit: bfcc1e67ff1e4aa8bfe2ca57f99390fc284c799d ("PM: sleep: Do not
>>>>>>>>>>> assume that "mem" is always present")
>>>>>>>>>>> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git
>>>>>>>>>>> master
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> in testcase: kernel-selftests
>>>>>>>>>>> version: kernel-selftests-x86_64-c8c9111a-1_20210929
>>>>>>>>>>> with following parameters:
>>>>>>>>>>>
>>>>>>>>>>>        group: group-00
>>>>>>>>>>>        ucode: 0x11
>>>>>>>>>>>
>>>>>>>>>>> test-description: The kernel contains a set of "self tests" under
>>>>>>>>>>> the
>>>>>>>>>>> tools/testing/selftests/ directory. These are intended to be small
>>>>>>>>>>> unit tests to exercise individual code paths in the kernel.
>>>>>>>>>>> test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> on test machine: 288 threads 2 sockets Intel(R) Xeon Phi(TM) CPU
>>>>>>>>>>> 7295
>>>>>>>>>>> @ 1.50GHz with 80G memory
>>>>>>>>>>>
>>>>>>>>>>> caused below changes (please refer to attached dmesg/kmsg for entire
>>>>>>>>>>> log/backtrace):
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> If you fix the issue, kindly add following tag
>>>>>>>>>>> Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
>>>>>>>>>> Thanks for your report. Assuming that the code responsible for
>>>>>>>>>> registering the suspend operations is drivers/acpi/sleep.c for your
>>>>>>>>>> platform, and that acpi_sleep_suspend_setup() iterated over all
>>>>>>>>>> possible
>>>>>>>>>> sleep states, your platform must somehow be returning that
>>>>>>>>>> ACPI_STATE_S3
>>>>>>>>>> is not a supported state somehow?
>>>>>>>>>>
>>>>>>>>>> Rafael have you ever encountered something like that?
>>>>>>>>> Yes, there are systems with ACPI that don't support S3.
>>>>>>>> OK and do you know what happens when we enter suspend with "mem" in
>>>>>>>> those cases? Do we immediately return because ultimately the firmware
>>>>>>>> does not support ACPI S3?
>>>>>>> "mem" should not be present in the list of available strings then, so it
>>>>>>> should be rejected right away.
>>>>>> Well yes, that was the purpose of the patch I submitted, but assuming
>>>>>> that we did provide "mem" as one of the possible standby modes even
>>>>>> though that was wrong (before patch), and the test was trying to enter
>>>>>> ACPI S3 standby, what would have happened, would the ACPI firmware honor
>>>>>> the request but return an error, or would it actually enter ACPI S3?
>>>>>>
>>>>>> In any case, I will change the test to check that this is a supported
>>>>>> standby mode before trying it.
>>>>>
>>>>> Unfortunately, I will need to revert bfcc1e67ff1e4aa8bfe2, because it
>>>>> breaks user space compatibility and that's got caught properly by the test.
>>>>
>>>> Reverting my commit will break powerpc and other ARM/ARM64 platforms
>>>> where mem is not supported (via PSCI),
>>>
>>> It won't break anything, although the things that didn't work before
>>> will still not work after it.
>>>
>>> And "mem" is always supported even if there are no suspend_ops at all,
>>> in which case it becomes an alternative way to trigger s2idle.
>>>
>>> So, on the affected systems, what's there in /sys/power/?  Is
>>> mem_sleep present?  If so, what's in it?
>>
>> With 4.9 which is what I used initially:
>>
>> # cat /sys/power/state
>> freeze standby
>> # cat /sys/power/
>> pm_async           pm_print_times     pm_wakeup_irq      wakeup_count
>> pm_freeze_timeout  pm_test            state
>>
>> With a newer kernel without my patch:
>>
>> # cat /sys/power/state
>> freeze standby mem
>> # cat /sys/power/mem_sleep
>> s2idle shallow [deep]
> 
> OK, so the "deep" and "shallow" suspend variants appear to be
> supported.  What's the problem with advertising "mem" then?

s2idle and shallow are, but deep is not.

> 
>> # cat /sys/power/
>> mem_sleep          pm_freeze_timeout  pm_wakeup_irq      wakeup_count
>> pm_async           pm_print_times     state
>> pm_debug_messages  pm_test            suspend_stats/
>>
>>
>>>
>>>> I have a change pending for PSCI
>>>> that will actually check that SYSTEM_SUSPEND is supported before
>>>> unconditionally making use of it.
>>>>
>>>>>
>>>>> What happens is that "mem" is a "pointer" to a secondary list of
>>>>> possible states and that generally is "s2idle shallow deep" and if
>>>>> s2idle is the only available option, it will be just "s2idle".
>>>>>
>>>>> This list is there in /sys/power/mem_sleep.
>>>>>
>>>>> It was done this way, because some variants of user space expect "mem"
>>>>> to be always present and don't recognize "freeze" properly.
>>>>>
>>>>> Sorry for the confusion.
>>>>
>>>> So how do we all get our cookie here? Should we just slap an #ifndef
>>>> CONFIG_ACPI in order to allow platforms that do not have "mem" to not
>>>> have it?
>>>
>>> Certainly not.
>>>
>>> I've just hacked my test-bed system with ACPI so it does not register
>>> any suspend_ops at all and I have "freeze mem disk" in
>>> /sys/power/state and "s2idle" in /sys/power/mem_sleep.  Writing "mem"
>>> to /sys/power/state causes s2idle to be carried out.
>>>
>>> Since this is the expected behavior, I'm not sure what the problem is.
>>
>> The problem is advertising "mem" in /sys/power/state when the state is
>> not actually supported by the platform firmware here, whether that
>> translates into the form of s2idle or not. It is not supported, and it
>> should not be there IMHO.
> 
> Well, it is there, because some user space expects it to be there on
> systems supporting any kind of system-wide suspend, including s2idle.
> Like it or not.

But that was not the case before 406e79385f32 ("PM / sleep: System sleep
state selection interface rework") and clearly nobody complained about
that, did they?

> 
> If it is not there, the utilities in question assume that system-wide
> suspend is not supported at all.

What utilities do depend on that? That selftest that does not even check
that "mem" is actually present in /sys/power/state and just fails its
test if it is not, yes it's not great, but that can be fixed.

> 
>> I was late to the game in identifying that,
>> but the 4.9 kernel makes sense to me.
>>
>> Similarly, if you take arch/powerpc/sysdev/fsl_pmc.c only
>> PM_SUSPEND_STANDBY is valid, so advertising mem would be wrong if we
>> don't look at what ->valid tells us.
> 
> Again: "mem" appears in /sys/power/state if the system supports any
> kind of system-wide suspend (because of the expectations of user space
> mentioned above) and mem_sleep decides what it really means.
> 
> And this is documented too (see Documentation/admin-guide/pm/sleep-states.html).

The documentation just states that if the kernel supports *any* suspend
state, then /sys/power/state will be present and likewise for
/sys/power/mem_sleep, it does not say what the contents will be and that
"mem" would always be present in there.
-- 
Florian




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux