Re: please help with intermittent s2idle problem on AMD laptop

On 10/15/2024 04:02, Corey Hickey wrote:
> On 2024-10-13 23:58, Shyam Sundar S K wrote:
>>> As far as I can tell from the code, I need to load the amd_pmc
>>> module with enable_stb=1.
>>>
>>> lizard:~# rmmod amd_pmc
>>> lizard:~# modprobe amd_pmc enable_stb=1
>>>
>>> If I do that, though:
>>> 1. There is an error: 'amd_pmc AMDI0009:00: SMU cmd failed. err: 0xff'
>>
>> this is expected as the command is not supported on PMFW loaded on
>> your system.
>>
>> and..
>>
>> ret=-5 is expected on your system, because it does not support EFR
>> (Enhanced Firmware Reporting).
>>
>>> 2. There is a kernel WARNING (which I will paste in full below):
>>>      ioremap on RAM at 0x0000000000000000 - 0x0000000000ffffff
>>> 3. The expected files in debugfs do not appear.
>>>
>>
>> This is happening because, the ioremap() is happening for addr 0x0.
>> Ideally you should have got the physical address from the mailbox
>> command. But that does not seem to happen.
>>
>> I suspect that on your system, the STB is not enabled. Can you check
>> the following path to see if that helps?
>>
>> AMD CBS -> SMU Debug Options -> SMU Feature Config Limits -> STB To
>> DRAM Log <Enabled>
>>
>> If DRAM log is disabled, then that should be enabled to attempt to
>> take a stb log.
> 
> Unfortunately, the AMD CBS menu does not seem to be available in this
> laptop BIOS config. I can try checking with Framework support for that,
> but I don't know if I will have any success.
> 
>> No need to look at mp2_stb.c as it is meant for chromebook use-cases.
>> So, it will not take this path on your framework system.
> 
> Ah ok, thanks for clarifying.
> 
>> Note that I have looked at your debug patch, but it may not be in the
>> right direction.
>>
>> I would suggest:
>> - reload the amd_pmc driver with dyndbg
>> - Put the system to sleep "echo mem > /sys/power/state" and take the
>> dmesg logs
>> - get the dump of /sys/kernel/debug/amd_pmc/s0ix_stats and
>> /sys/kernel/debug/amd_pmc/smu_fw_info
> 
> I had not used dyndbg before. I found the documentation and ran this:
> 
> 
> lizard:~# rmmod amd_pmc
> lizard:~# modprobe amd_pmc dyndbg='file drivers/platform/x86/amd/pmc/*
> +p'
> 
> The result of this is:
> 
> lizard:~# grep amd_pmc /proc/dynamic_debug/control
> drivers/platform/x86/amd/pmc/mp2_stb.c:126
> [amd_pmc]amd_mp2_process_cmd =p "Invalid STB data\n"
> drivers/platform/x86/amd/pmc/mp2_stb.c:132
> [amd_pmc]amd_mp2_process_cmd =p "Unsupported length\n"
> drivers/platform/x86/amd/pmc/pmc.c:276
> [amd_pmc]amd_pmc_stb_debugfs_open_v2 =p "S2D force flush not
> supported: %d\n"
> drivers/platform/x86/amd/pmc/pmc.c:448
> [amd_pmc]amd_pmc_get_smu_version =p "SMU program %u version is
> %u.%u.%u\n"
> drivers/platform/x86/amd/pmc/pmc.c:609 [amd_pmc]amd_pmc_idlemask_read
> =p "SMU idlemask s0i3: 0x%x\n"
> drivers/platform/x86/amd/pmc/pmc.c:678 [amd_pmc]amd_pmc_dump_registers
> =p "AMD_%s_REGISTER_RESPONSE:%x\n"
> drivers/platform/x86/amd/pmc/pmc.c:681 [amd_pmc]amd_pmc_dump_registers
> =p "AMD_%s_REGISTER_ARGUMENT:%x\n"
> drivers/platform/x86/amd/pmc/pmc.c:684 [amd_pmc]amd_pmc_dump_registers
> =p "AMD_%s_REGISTER_MESSAGE:%x\n"
> drivers/platform/x86/amd/pmc/pmc.c:832 [amd_pmc]amd_pmc_verify_czn_rtc
> =p "alarm not enabled\n"
> drivers/platform/x86/amd/pmc/pmc.c:854 [amd_pmc]amd_pmc_verify_czn_rtc
> =p "wakeup timer programmed for %lld seconds\n"
> 
> ...so I think I got that right, but let me know if you meant something
> different.
> 
> I wasn't sure if you meant for me to run this with enable_stb=1 or not,
> so first I did this with the default of enable_stb omitted.
> 
> 
> lizard:~# echo 0 > /sys/devices/pnp0/00:01/rtc/rtc0/wakealarm && echo
> +5 > /sys/devices/pnp0/00:01/rtc/rtc0/wakealarm && echo mem >
> /sys/power/state
> lizard:~# dmesg -r > 2024-10-14/dmesg.default
> lizard:~# cp -p /sys/kernel/debug/amd_pmc/s0ix_stats
> 2024-10-14/s0ix_stats.default
> lizard:~# cp -p /sys/kernel/debug/amd_pmc/smu_fw_info
> 2024-10-14/smu_fw_info.default
> 
> In the dmesg, I don't see any further debug messages; I don't think any
> calls to dev_dbg() are being run (I could be wrong).
> 
> Here are the files captured from debugfs:
> 
> 
> lizard:~# tail -n +1 2024-10-14/s*
> ==> 2024-10-14/s0ix_stats.default <==
> === S0ix statistics ===
> S0ix Entry Time: 38004185743
> S0ix Exit Time: 38177312687
> Residency Time: 3606811

Here the "Residency Time" looks like a valid number; it indicates that
the system was in the low-power state for that duration.
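
As a quick sanity check, the residency can be recomputed from the entry
and exit counters. This is only a rough sketch: it assumes the counters
tick at 48 MHz (an assumption that happens to agree with the numbers in
this log), and the function name is made up for illustration.

```python
# Assumed tick rate of the S0ix entry/exit counters, in ticks per microsecond.
# This is an assumption; it is consistent with the values in the log above.
ASSUMED_TICKS_PER_US = 48


def residency_us(entry_ticks: int, exit_ticks: int,
                 ticks_per_us: int = ASSUMED_TICKS_PER_US) -> int:
    """Return the time between entry and exit in whole microseconds."""
    if exit_ticks < entry_ticks:
        raise ValueError("exit counter is smaller than entry counter")
    return (exit_ticks - entry_ticks) // ticks_per_us


# Values copied from s0ix_stats.default above.
computed = residency_us(38004185743, 38177312687)
print(computed)  # 3606811, the same as the reported "Residency Time"
```

If the computed value and the reported value disagree by a lot, that
would be worth a closer look; here they match.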

> 
> ==> 2024-10-14/smu_fw_info.default <==
> 
> === SMU Statistics ===
> Table Version: 3
> Hint Count: 1
> Last S0i3 Status: Success

Status "Success" reflects that the system did enter s2idle.

> Time (in us) to S0i3: 385104
> Time (in us) in S0i3: 3606811
> Time (in us) to resume from S0i3: 115504

And here, it resumed from s2idle. So I am not sure where the problem
is. Like Mario mentioned, you can keep only one SSD installed and try
again to see if you hit the s2idle problem.

But at least in the logs shared so far, everything seems to pass.
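
If you want to check many suspend cycles without reading each dump by
hand, something like the following could work. It is a minimal sketch,
not part of the driver: the helper names are hypothetical, and it only
assumes the "key: value" layout seen in the smu_fw_info dump quoted
above.

```python
def parse_stats(text: str) -> dict:
    """Parse 'key: value' lines into a dict; banners and blanks are skipped."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("===") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        stats[key.strip()] = value.strip()
    return stats


def entered_s0i3(stats: dict) -> bool:
    """True if the last cycle reports success and a non-zero time in S0i3."""
    ok = stats.get("Last S0i3 Status") == "Success"
    return ok and int(stats.get("Time (in us) in S0i3", "0")) > 0


# Sample copied from smu_fw_info.default above.
SAMPLE = """=== SMU Statistics ===
Table Version: 3
Hint Count: 1
Last S0i3 Status: Success
Time (in us) to S0i3: 385104
Time (in us) in S0i3: 3606811
Time (in us) to resume from S0i3: 115504
"""

print(entered_s0i3(parse_stats(SAMPLE)))
```

Running this over the dump captured after each resume would show which
cycles, if any, did not reach S0i3.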

> 
> === Active time (in us) ===
> DISPLAY  : 0
> VDD      : 0
> ACP      : 0
> VCN      : 0
> ISP      : 0
> DF       : 0
> USB3_0   : 0
> USB3_1   : 0
> USB3_3   : 0
> USB3_4   : 0
> USB4_0   : -4647714815446351872
> USB4_1   : 0
> MPM      : 8366
> JPEG     : 0
> IPU      : 0
> UMSCH    : 0
> 
> 
> The USB4_0 value stands out as anomalous; I don't know how significant
> that is.

The negative number could be a problem in the PMFW or the USB4 IP, it
seems. The numbers here would matter when the system is not entering
s2idle.
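
To see what the raw value looks like, the printed number can be
reinterpreted as an unsigned 64-bit quantity. This is just an
illustrative sketch; the helper name is hypothetical, and it assumes
the field is a 64-bit value that was printed as signed.

```python
def as_unsigned64(value: int) -> int:
    """Reinterpret a signed 64-bit integer as unsigned (two's complement)."""
    return value & 0xFFFFFFFFFFFFFFFF


usb4_0 = -4647714815446351872  # USB4_0 value copied from smu_fw_info above
raw = as_unsigned64(usb4_0)

print(hex(raw))               # 0xbf80000000000000
print(hex(raw >> 32))         # upper 32 bits: 0xbf800000
print(hex(raw & 0xFFFFFFFF))  # lower 32 bits: 0x0
```

Only the upper 32 bits are set and the lower 32 bits are zero, which
does not look like a plausible microsecond count.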

> 
> 
> I then retried with enable_stb=1
> 
> lizard:~# rmmod amd_pmc
> lizard:~# modprobe amd_pmc enable_stb=1 dyndbg='file
> drivers/platform/x86/amd/pmc/* +p'
> lizard:~# echo 0 > /sys/devices/pnp0/00:01/rtc/rtc0/wakealarm && echo
> +5 > /sys/devices/pnp0/00:01/rtc/rtc0/wakealarm && echo mem >
> /sys/power/state
> lizard:~# dmesg -r > 2024-10-14/dmesg.enable_stb
> 
> 
> This time, I get some debug lines from amd_pmc_dump_registers(). I am
> including my debug patch here--I think it gives a bit of context that
> I can understand better.
> 
> 
> <6>[ 1143.655752] amd_pmc_probe: 1
> <6>[ 1143.655763] amd_pmc_probe: 2
> <6>[ 1143.655764] amd_pmc_probe: 3
> <6>[ 1143.655773] amd_pmc_probe: 4
> <6>[ 1143.655796] amd_pmc_probe: 5
> <6>[ 1143.655797] amd_pmc_probe: 6
> <6>[ 1143.655798] amd_pmc_is_stb_supported cpu_id: 5352
> <7>[ 1143.684758] amd_pmc AMDI0009:00: AMD_S2D_REGISTER_RESPONSE:1
> <7>[ 1143.684768] amd_pmc AMDI0009:00: AMD_S2D_REGISTER_ARGUMENT:100000
> <7>[ 1143.684770] amd_pmc AMDI0009:00: AMD_S2D_REGISTER_MESSAGE:85
> <6>[ 1143.684772] amd_pmc_s2d_init size: 1048576
> <3>[ 1143.684873] amd_pmc AMDI0009:00: SMU cmd failed. err: 0xff
> <7>[ 1143.684886] amd_pmc AMDI0009:00: AMD_S2D_REGISTER_RESPONSE:ff
> <7>[ 1143.684894] amd_pmc AMDI0009:00: AMD_S2D_REGISTER_ARGUMENT:5
> <7>[ 1143.684901] amd_pmc AMDI0009:00: AMD_S2D_REGISTER_MESSAGE:85
> <6>[ 1143.684903] amd_pmc_s2d_init s2d_dram_size ret: -5
> <7>[ 1143.715734] amd_pmc AMDI0009:00: AMD_S2D_REGISTER_RESPONSE:1
> <7>[ 1143.715741] amd_pmc AMDI0009:00: AMD_S2D_REGISTER_ARGUMENT:0
> <7>[ 1143.715744] amd_pmc AMDI0009:00: AMD_S2D_REGISTER_MESSAGE:85
> <7>[ 1143.746780] amd_pmc AMDI0009:00: AMD_S2D_REGISTER_RESPONSE:1
> <7>[ 1143.746790] amd_pmc AMDI0009:00: AMD_S2D_REGISTER_ARGUMENT:0
> <7>[ 1143.746793] amd_pmc AMDI0009:00: AMD_S2D_REGISTER_MESSAGE:85
> <6>[ 1143.746795] amd_pmc_s2d_init p_a_l: 0 p_a_hi: 0 s_p_a: 0 sz:
> 16777216
> 

The high and low addresses are zero because STB is not enabled on your
system, so the S2D (Spill to DRAM) mailbox commands are expected to
fail. You will have to contact the Framework team to get the STB
feature enabled.
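
To connect this with the ioremap WARNING you saw: with both address
halves at zero and the size from your debug output, the mapped range
starts at physical address 0. A small sketch of the arithmetic follows.
The way the two halves are combined here, (hi << 32) | low, is my
assumption for illustration, and the function name is hypothetical.

```python
from typing import Tuple


def s2d_region(phys_addr_low: int, phys_addr_hi: int,
               size: int) -> Tuple[int, int]:
    """Return (first, last) physical address of the region to be mapped."""
    # Assumption: the 64-bit base is built from two 32-bit halves.
    base = (phys_addr_hi << 32) | phys_addr_low
    return base, base + size - 1


# p_a_l, p_a_hi and sz copied from the amd_pmc_s2d_init debug line above.
first, last = s2d_region(0, 0, 16777216)
print(f"0x{first:016x} - 0x{last:016x}")
# 0x0000000000000000 - 0x0000000000ffffff
```

That is the same range reported in the "ioremap on RAM" WARNING.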

Thanks,
Shyam



