On Tue, 16 Jun 2020 17:55:01 +0200 Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:

> On 2020-06-16 10:13:27 [+0200], Stephen Berman wrote:
>> Yes, thanks, that did it. Trace attached.
>
> So TZ10 is a temperature sensor of some kind on your motherboard. In
> your v5.6 dmesg there is:
> | thermal LNXTHERM:00: registered as thermal_zone0
> | ACPI: Thermal Zone [TZ10] (17 C)
>
> So. In /sys/class/thermal/thermal_zone0/device/path you should also see
> TZ10. And /sys/class/thermal/thermal_zone0/temp should show the actual
> value.
> This comes from the "thermal" module.

Yes, TZ10 was in the thermal_zone0/device/path and the value in
thermal_zone0/temp was 16800.

> Looking at the trace, might query the temperature every second which
> somehow results in "Dispatching Notify on". I don't understand how it
> gets from reading of the temperature to the notify part, maybe it is
> part of the ACPI…
>
> However. Could you please make sure that the thermal module is not
> loaded at system startup? Adding
>     thermal.off=1
>
> to the kernel commandline should do the trick. And you should see
>     thermal control disabled
>
> in dmesg.

Confirmed. And the value in thermal_zone0/temp was now 33000.

> That means your thermal_zone0 with TZ10 does not show up in
> /sys and nothing should schedule the work-items. This in turn should
> allow you to shutdown your system without the delay.

It did!

> If this works, could you please try to load the module with tzp=300?
> If you add this
>     thermal.tzp=300
>
> to the kernel commandline then it should do the trick. You can verify it
> by
>     cat /sys/module/thermal/parameters/tzp
>
> This should change the polling interval from what ACPI says to 30secs.
> This should ensure that you don't have so many worker waiting. So you
> should also be able to shutdown the system.

Your assessment and predictions are right on the mark!
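
For reference, the raw numbers above can be turned into familiar units with a short sketch. It assumes that thermal_zone0/temp is reported in millidegrees Celsius and that tzp counts tenths of a second (consistent with tzp=300 giving 30 seconds). The function names are made up for illustration and are not part of any kernel tooling.

```python
from typing import Optional


def millidegrees_to_celsius(raw: int) -> float:
    """Convert a sysfs thermal reading (assumed millidegrees C) to degrees C."""
    return raw / 1000


def tzp_to_seconds(tzp: int) -> float:
    """Convert a thermal.tzp value (assumed tenths of a second) to seconds."""
    return tzp / 10


def read_int(path: str) -> Optional[int]:
    """Read a single integer from a sysfs-style file.

    Returns None if the file is missing, unreadable, or not an integer.
    """
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

Under those assumptions, 16800 reads as 16.8 C, 33000 as 33 C, and tzp=300 as a 30 second polling interval. Calling read_int("/sys/module/thermal/parameters/tzp") gives the current value, or None if the file is absent or unreadable.
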

I'm fine with the thermal.tzp=300 workaround, but it would be good to
find out why this problem started with commit 6d25be57, if my git
bisection was correct, or if it wasn't, then at least somewhere between
5.1.0 and 5.2.0. Or can you already deduce why? If not, I'd be more
than happy to continue applying any patches or trying any suggestions
you have, if you want to continue debugging this issue. In any case,
thanks for pursuing it to this point.

Steve Berman