Patch "mlxsw: core_thermal: Fix fan speed in maximum cooling state" has been added to the 6.2-stable tree

This is a note to let you know that I've just added the patch titled

    mlxsw: core_thermal: Fix fan speed in maximum cooling state

to the 6.2-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     mlxsw-core_thermal-fix-fan-speed-in-maximum-cooling-.patch
and it can be found in the queue-6.2 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit bdb027a7b89c2f6bfcc23bef62276c7d39bf370f
Author: Ido Schimmel <idosch@xxxxxxxxxx>
Date:   Fri Mar 17 16:32:59 2023 +0100

    mlxsw: core_thermal: Fix fan speed in maximum cooling state
    
    [ Upstream commit 6d206b1ea9f48433a96edec7028586db1d947911 ]
    
    The cooling levels array is supposed to prevent the system fans from
    being configured below a 20% duty cycle as otherwise some of them get
    stuck at 0 RPM.
    
    Due to an off-by-one error, the last element in the array was not
    initialized, causing it to be set to zero, which in turn led to fans
    being configured with a 0% duty cycle in maximum cooling state.
    
    Since commit 332fdf951df8 ("mlxsw: thermal: Fix out-of-bounds memory
    accesses") the contents of the array are static. Therefore, instead of
    fixing the initialization of the array, simply remove it and adjust
    thermal_cooling_device_ops::set_cur_state() so that the configured duty
    cycle is never set below 20%.
    
    Before:
    
     # cat /sys/class/thermal/thermal_zone0/cdev0/type
     mlxsw_fan
     # echo 10 > /sys/class/thermal/thermal_zone0/cdev0/cur_state
     # cat /sys/class/hwmon/hwmon0/name
     mlxsw
     # cat /sys/class/hwmon/hwmon0/pwm1
     0
    
    After:
    
     # cat /sys/class/thermal/thermal_zone0/cdev0/type
     mlxsw_fan
     # echo 10 > /sys/class/thermal/thermal_zone0/cdev0/cur_state
     # cat /sys/class/hwmon/hwmon0/name
     mlxsw
     # cat /sys/class/hwmon/hwmon0/pwm1
     255
    
    This bug was uncovered when the thermal subsystem repeatedly tried to
    configure the cooling devices to their maximum state due to another
    issue [1]. This resulted in the fans being stuck at 0 RPM, which
    eventually led to the system undergoing thermal shutdown.
    
    [1] https://lore.kernel.org/netdev/ZA3CFNhU4AbtsP4G@shredder/
    
    Fixes: a421ce088ac8 ("mlxsw: core: Extend cooling device with cooling levels")
    Signed-off-by: Ido Schimmel <idosch@xxxxxxxxxx>
    Reviewed-by: Vadim Pasternak <vadimp@xxxxxxxxxx>
    Signed-off-by: Petr Machata <petrm@xxxxxxxxxx>
    Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
index c5240d38c9dbd..09ed6e5fa6c34 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c
@@ -105,7 +105,6 @@ struct mlxsw_thermal {
 	struct thermal_zone_device *tzdev;
 	int polling_delay;
 	struct thermal_cooling_device *cdevs[MLXSW_MFCR_PWMS_MAX];
-	u8 cooling_levels[MLXSW_THERMAL_MAX_STATE + 1];
 	struct thermal_trip trips[MLXSW_THERMAL_NUM_TRIPS];
 	struct mlxsw_cooling_states cooling_states[MLXSW_THERMAL_NUM_TRIPS];
 	struct mlxsw_thermal_area line_cards[];
@@ -468,7 +467,7 @@ static int mlxsw_thermal_set_cur_state(struct thermal_cooling_device *cdev,
 		return idx;
 
 	/* Normalize the state to the valid speed range. */
-	state = thermal->cooling_levels[state];
+	state = max_t(unsigned long, MLXSW_THERMAL_MIN_STATE, state);
 	mlxsw_reg_mfsc_pack(mfsc_pl, idx, mlxsw_state_to_duty(state));
 	err = mlxsw_reg_write(thermal->core, MLXSW_REG(mfsc), mfsc_pl);
 	if (err) {
@@ -859,10 +858,6 @@ int mlxsw_thermal_init(struct mlxsw_core *core,
 		}
 	}
 
-	/* Initialize cooling levels per PWM state. */
-	for (i = 0; i < MLXSW_THERMAL_MAX_STATE; i++)
-		thermal->cooling_levels[i] = max(MLXSW_THERMAL_MIN_STATE, i);
-
 	thermal->polling_delay = bus_info->low_frequency ?
 				 MLXSW_THERMAL_SLOW_POLL_INT :
 				 MLXSW_THERMAL_POLL_INT;
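
The off-by-one described in the commit message can be reproduced in user space. The following is a minimal sketch, not driver code: the constants MAX_STATE, MIN_STATE and MAX_DUTY and all function names are hypothetical stand-ins, chosen to match the 20% floor and the 0 -> 255 pwm1 readings quoted above, and the round-to-nearest scaling in state_to_duty() is an assumption about how a state maps to a PWM value.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values (not quoted from the driver): states 0..10, floor of 2
 * (20% duty cycle), PWM range 0..255. */
#define MAX_STATE 10
#define MIN_STATE 2
#define MAX_DUTY  255

/* Old scheme: fill a lookup table with a loop that stops one element
 * short, so levels[MAX_STATE] keeps its zero-initialized value. */
void init_levels_buggy(unsigned char levels[MAX_STATE + 1])
{
	size_t i;

	for (i = 0; i < MAX_STATE; i++)
		levels[i] = i < MIN_STATE ? MIN_STATE : (unsigned char)i;
}

/* New scheme: no table, clamp the requested state at the point of use. */
unsigned long clamp_state(unsigned long state)
{
	return state < MIN_STATE ? MIN_STATE : state;
}

/* Assumed conversion: scale a state to 0..MAX_DUTY, rounding to nearest. */
unsigned int state_to_duty(unsigned long state)
{
	return (unsigned int)((state * MAX_DUTY + MAX_STATE / 2) / MAX_STATE);
}

/* Duty cycle as the table-based code would have computed it. */
unsigned int duty_before(unsigned long state)
{
	unsigned char levels[MAX_STATE + 1] = {0}; /* zeroed allocation */

	assert(state <= MAX_STATE);
	init_levels_buggy(levels);
	return state_to_duty(levels[state]);
}

/* Duty cycle with the clamp-based fix. */
unsigned int duty_after(unsigned long state)
{
	return state_to_duty(clamp_state(state));
}
```

Under these assumptions, duty_before(10) yields 0 and duty_after(10) yields 255, mirroring the "Before" and "After" pwm1 readings, while every state below the maximum gives the same result in both schemes.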


