This is a note to let you know that I've just added the patch titled thermal: core: Do not fail cdev registration because of invalid initial state to the 6.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: thermal-core-do-not-fail-cdev-registration-because-o.patch and it can be found in the queue-6.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit 81b1cc7ba4999d1a369184ed8a0c03a72b4f9372 Author: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> Date: Thu Jun 6 20:27:19 2024 +0200 thermal: core: Do not fail cdev registration because of invalid initial state [ Upstream commit 1af89dedc8a58006d8e385b1e0d2cd24df8a3b69 ] It is reported that commit 31a0fa0019b0 ("thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()") causes the ACPI fan driver to fail probing on some systems which turns out to be due to the _FST control method returning an invalid value until _FSL is first evaluated for the given fan. If this happens, the .get_cur_state() cooling device callback returns an error and __thermal_cooling_device_register() fails as uses that callback after commit 31a0fa0019b0. Arguably, _FST should not return an invalid value even if it is evaluated before _FSL, so this may be regarded as a platform firmware issue, but at the same time it is not a good enough reason for failing the cooling device registration where the initial cooling device state is only needed to initialize a thermal debug facility. Accordingly, modify __thermal_cooling_device_register() to avoid calling thermal_debug_cdev_add() instead of returning an error if the initial .get_cur_state() callback invocation fails. Fixes: 31a0fa0019b0 ("thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()") Closes: https://lore.kernel.org/linux-acpi/20240530153727.843378-1-laura.nao@xxxxxxxxxxxxx Reported-by: Laura Nao <laura.nao@xxxxxxxxxxxxx> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> Acked-by: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> Tested-by: Laura Nao <laura.nao@xxxxxxxxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index 38b7d02384d7c..258482036f1e9 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -936,9 +936,17 @@ __thermal_cooling_device_register(struct device_node *np, if (ret) goto out_cdev_type; + /* + * The cooling device's current state is only needed for debug + * initialization below, so a failure to get it does not cause + * the entire cooling device initialization to fail. However, + * the debug will not work for the device if its initial state + * cannot be determined and drivers are responsible for ensuring + * that this will not happen. + */ ret = cdev->ops->get_cur_state(cdev, ¤t_state); if (ret) - goto out_cdev_type; + current_state = ULONG_MAX; thermal_cooling_device_setup_sysfs(cdev); @@ -953,7 +961,8 @@ __thermal_cooling_device_register(struct device_node *np, return ERR_PTR(ret); } - thermal_debug_cdev_add(cdev, current_state); + if (current_state <= cdev->max_state) + thermal_debug_cdev_add(cdev, current_state); /* Add 'this' new cdev to the global cdev list */ mutex_lock(&thermal_list_lock);