Hi, Rafael,

The only concern I have is that, in thermal_cooling_device_update(), we
should handle the case where the cooling device is currently used by one
or more thermal zones, say, something like

	list_for_each_entry(pos, &cdev->thermal_instances, cdev_node) {
		/* e.g. what to do if tz1 set it to state 1 previously */
	}

I do not have a clear idea of what we should do here. But given that I
have confirmed that this patch series fixes the original problem, and
that ACPI passive cooling is unlikely to be triggered before the
CPUFREQ_CREATE_POLICY notification, we can probably address that problem
later.

Tested-by: Zhang Rui <rui.zhang@xxxxxxxxx>
Reviewed-by: Zhang Rui <rui.zhang@xxxxxxxxx>

thanks,
rui

On Mon, 2023-03-13 at 15:24 +0100, Rafael J. Wysocki wrote:
> Hi All,
>
> The first revision of this patch series was posted as
>
> https://lore.kernel.org/linux-pm/2148907.irdbgypaU6@kreacher/
>
> As reported by Rui in this thread:
>
> Link: https://lore.kernel.org/linux-pm/53ec1f06f61c984100868926f282647e57ecfb2d.camel@xxxxxxxxx/
>
> some recent changes in the thermal core cause the CPU cooling devices
> registered by the ACPI processor driver to become unusable in some
> cases and somewhat crippled in general.
>
> The problem is that the ACPI processor driver changes its
> ->get_max_state() callback return value depending on whether or not
> cpufreq is available and there is a cpufreq policy for a given CPU.
> However, the thermal core has always assumed that the return value of
> that callback will not change, which in fact is relied on by the
> cooling device statistics code. In particular, when the
> ->get_max_state() value grows, the memory buffer allocated for storing
> the statistics will be too small and corruption may ensue as a result.
>
> For this reason, the issue needs to be addressed in the ACPI processor
> driver and not in the thermal core, but the core needs to help somewhat
> too.
> Namely, it needs to provide a helper allowing an interested driver to
> update the max_state value for an already registered cooling device in
> certain situations, which will also cause the statistics to be rebuilt.
>
> This series implements the above; for details, please refer to the
> individual patch changelogs.
>
> Thanks!
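
For illustration only, here is a minimal userspace sketch of the idea discussed above: when max_state changes, the statistics buffer is rebuilt to match, and the instances already bound to the device are walked. Everything in it is hypothetical (the mock_* names, the singly linked list, the choice to clamp an out-of-range target); it is not the kernel's data layout, and clamping is just one possible answer to the open question about what to do with a state set earlier by a zone.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct mock_instance {
	unsigned long target;            /* state requested by one thermal zone */
	struct mock_instance *next;      /* next instance bound to the device */
};

struct mock_cdev {
	unsigned long max_state;         /* highest valid cooling state */
	unsigned long *time_in_state;    /* statistics, max_state + 1 entries */
	struct mock_instance *instances; /* zones currently using the device */
};

/* Allocate zeroed statistics sized for the given max_state. */
static int mock_cdev_init(struct mock_cdev *cdev, unsigned long max_state)
{
	cdev->time_in_state = calloc(max_state + 1, sizeof(*cdev->time_in_state));
	if (!cdev->time_in_state)
		return -1;
	cdev->max_state = max_state;
	cdev->instances = NULL;
	return 0;
}

/*
 * Rebuild the statistics from scratch for a new max_state, then walk the
 * bound instances. Assumption: a target above the new maximum is clamped.
 */
static int mock_cdev_update(struct mock_cdev *cdev, unsigned long new_max)
{
	unsigned long *stats;
	struct mock_instance *pos;

	if (new_max == cdev->max_state)
		return 0;

	/* Allocate the new buffer first so a failure leaves the old one intact. */
	stats = calloc(new_max + 1, sizeof(*stats));
	if (!stats)
		return -1;

	free(cdev->time_in_state);
	cdev->time_in_state = stats;
	cdev->max_state = new_max;

	for (pos = cdev->instances; pos; pos = pos->next)
		if (pos->target > new_max)
			pos->target = new_max;

	return 0;
}

/* Account one sample; the buffer always holds max_state + 1 entries. */
static void mock_cdev_account(struct mock_cdev *cdev, unsigned long state)
{
	assert(state <= cdev->max_state);
	cdev->time_in_state[state]++;
}
```

Without the rebuild in mock_cdev_update(), a larger max_state would make mock_cdev_account() write past the end of the old buffer, which is the kind of corruption the cover letter describes.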