I have just debugged a long-standing problem on my ThinkPad X1 Carbon
4th Gen, model 20FB002UMC: the fan randomly runs at 100% speed after
resume. After the resume, the fan slowly raises its speed to the
maximum. A reboot does not fix this; a shutdown and power-on does.
But after a boot and suspend, the problem randomly appears again.
The reproduction rate varies with unknown conditions from ~5% to ~99%.
Judging by the mailing lists, this is a common problem affecting many
users with different ThinkPad models and different kernels. It first
appeared at least 3 years ago.
I made sys+proc dumps of the working and the failed state, and narrowed
the problem down to a failed THM0 readout.
A resume that ends with 100% fan speed gives the following readout:
x1:~ # cat /proc/acpi/ibm/thermal
temperatures: -128 -128 0 0 0 0 0 0
A later resume that came back with a properly working fan gives:
x1:~ # cat /proc/acpi/ibm/thermal
temperatures: 44 -128 0 0 0 0 0 0
This implies that the ACPI automatic fan regulation reads a bad
temperature and raises the fan speed to the maximum.
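For anyone who wants to check for this state from a script, here is a
minimal sketch. It assumes that the first value after "temperatures:"
is THM0 and that -128 marks a failed readout, as in the two outputs
above. The function name thm0_is_bad is made up for illustration.

```shell
# Hypothetical helper: exit status 0 when the first temperature in a
# "temperatures: ..." line is -128 (assumed to mean a failed readout).
thm0_is_bad() {
    # $1: one line in the format printed by /proc/acpi/ibm/thermal
    set -- $1            # word-split: $1="temperatures:", $2=first value
    [ "$2" = "-128" ]
}

# Possible use after a resume:
#   thm0_is_bad "$(cat /proc/acpi/ibm/thermal)" && echo "THM0 readout failed"
```

Taking the line as an argument instead of reading the file inside the
function makes it easy to try against saved dumps as well.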
Comparing dmesg of the failed and the successful state, I see no
significant differences. The suspend/resume log is not always identical,
but it typically looks like this, regardless of whether fan regulation
works or fails:
x1:~ # dmesg -c | grep -i '\(acpi\|thermal\|ibm\|thinkpad\)'
[12198.794282] ACPI: EC: interrupt blocked
[12198.833773] ACPI: Preparing to enter system sleep state S3
[12198.839765] ACPI: EC: event blocked
[12198.839767] ACPI: EC: EC stopped
[12198.849325] ACPI: Low-level resume complete
[12198.849414] ACPI: EC: EC started
[12198.854035] ACPI: Waking up from system sleep state S3
[12198.866982] ACPI: EC: interrupt unblocked
[12198.936739] ACPI: EC: event unblocked
[12199.080732] thinkpad_acpi: docked into hotplug port replicator
If I compare the whole suspend/resume logs, there are small differences
in individual kernel messages, but again, I found no significant
difference. If I exclude USB and network devices, I see changes only here:
bad to good:
-IRQ 16: no longer affine to CPU3
IRQ 122: no longer affine to CPU3
-IRQ 124: no longer affine to CPU3
+IRQ 123: no longer affine to CPU3
But another bad-to-good diff looks different:
IRQ 122: no longer affine to CPU3
IRQ 123: no longer affine to CPU3
IRQ 131: no longer affine to CPU3
+IRQ 137: no longer affine to CPU3
So even this does not give any indication of the error source. (This
comparison is from the openSUSE Leap 15.3 kernel 5.3.18-59.5, as I have
not yet managed to get a successful fan resume on the recent
5.13~rc7-1.1.g0a4a430.)
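In case somebody wants to repeat the comparison: a rough sketch of one
way to diff two saved dmesg logs with the timestamps removed, so that
only message changes show up. The function names and the file arguments
are hypothetical; this is not necessarily how the diffs above were made.

```shell
# Remove the leading "[seconds.fraction] " stamp from dmesg lines on stdin.
strip_dmesg_stamps() {
    sed 's/^\[ *[0-9]*\.[0-9]*] //'
}

# Unified diff of two saved dmesg logs, ignoring timestamps.
# $1, $2: log files (hypothetical names, e.g. bad.log and good.log)
diff_resume_logs() {
    a=$(mktemp) && b=$(mktemp) || return 2
    strip_dmesg_stamps < "$1" > "$a"
    strip_dmesg_stamps < "$2" > "$b"
    diff -u "$a" "$b"
    rc=$?                 # 0 = same messages, 1 = messages differ
    rm -f "$a" "$b"
    return $rc
}
```

Filtering out USB and network lines would still have to be done
separately, for example with grep -v before the comparison.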
Everything looks like a race condition in ACPI on resume, but as far
as the logs show, there is no difference between the successful and the
failed state.
Does anybody have any ideas/patches/additional debug messages?
Reproducibility conditions:
Reproduced on 5.13~rc7-1.1.g0a4a430 at a rate of (maybe) 100%. Some
older kernels or builds have a lower reproduction rate, so I downgraded
to openSUSE Leap 15.3's kernel 5.3.18-59.5, which makes it possible to
capture a log of a successful resume.
The problem appears only with automatic fan speed regulation. If I turn
off automatic (ACPI-regulated) fan speed, the fan does exactly what is
expected; if I return to automatic mode, the fan raises its speed again.
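For reference, a small sketch of how switching between a fixed level
and automatic mode can be scripted through thinkpad_acpi's
/proc/acpi/ibm/fan file. This assumes the module is loaded with
fan_control=1 and that the script runs as root; the function name and
the FAN_PROC variable are made up here, and this is not necessarily the
method used for the test above.

```shell
# Target file; FAN_PROC is a hypothetical override, useful for dry runs.
FAN_PROC=${FAN_PROC:-/proc/acpi/ibm/fan}

# Set a fixed fan level (0..7) or hand control back with "auto".
set_fan_level() {
    case "$1" in
        auto|[0-7])
            echo "level $1" > "$FAN_PROC"
            ;;
        *)
            echo "usage: set_fan_level auto|0..7" >&2
            return 2
            ;;
    esac
}

# Possible use:
#   set_fan_level 2      # fixed low speed
#   set_fan_level auto   # back to automatic regulation
```

With a fixed level there is no automatic regulation at all, so the
temperatures should be watched while experimenting.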
The problem was never seen in Windows. But the problem appears even if
I set acpi_os_name and acpi_osi to Windows. (Note that I do not know
whether Windows uses the automatic fan speed regulation done by ACPI.)
The problem appears both with and without dock, with and without
peripherals attached.
Here is a complete boot log from 5.13~rc7-1.1.g0a4a430:
https://drive.google.com/file/d/1-Ijs9Z-fg6LQqjH6iHMpyuEyOWjtWrbs/view?usp=sharing
I have not yet managed to get a successful fan resume with this kernel.
--
Best Regards / S pozdravem,
Stanislav Brabec
software developer
---------------------------------------------------------------------
SUSE LINUX, s. r. o. e-mail: sbrabec@xxxxxxxx
Křižíkova 148/34 (Corso IIa) tel: +420 284 084 060
186 00 Praha 8-Karlín fax: +420 284 084 001
Czech Republic http://www.suse.cz/
PGP: 830B 40D5 9E05 35D8 5E27 6FA3 717C 209F A04F CD76