Re: [PATCH v8 01/29] thermal/core: Add a generic thermal_zone_get_trip() function

Hi Ido,

On 12/03/2023 13:14, Ido Schimmel wrote:
> On Mon, Oct 03, 2022 at 11:25:34AM +0200, Daniel Lezcano wrote:
> > @@ -1252,9 +1319,10 @@ thermal_zone_device_register_with_trips(const char *type, struct thermal_trip *t
> >   		goto release_device;
> >
> >   	for (count = 0; count < num_trips; count++) {
> > -		if (tz->ops->get_trip_type(tz, count, &trip_type) ||
> > -		    tz->ops->get_trip_temp(tz, count, &trip_temp) ||
> > -		    !trip_temp)
> > +		struct thermal_trip trip;
> > +
> > +		result = thermal_zone_get_trip(tz, count, &trip);
> > +		if (result)
> >   			set_bit(count, &tz->trips_disabled);
> >   	}
>
> Daniel, this change makes it so that trip points with a temperature of
> zero are no longer disabled. This behavior was originally added in
> commit 81ad4276b505 ("Thermal: Ignore invalid trip points"). The mlxsw
> driver relies on this behavior - see mlxsw_thermal_module_trips_reset()
> - and with this change I see that the thermal subsystem tries to
> repeatedly set the state of the associated cooling devices to the
> maximum state. Other drivers might also be affected by this.
>
> Following patch solves the problem for me:
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 55679fd86505..b50931f84aaa 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -1309,7 +1309,7 @@ thermal_zone_device_register_with_trips(const char *type, struct thermal_trip *t
>                  struct thermal_trip trip;
>
>                  result = thermal_zone_get_trip(tz, count, &trip);
> -               if (result)
> +               if (result || !trip.temperature)
>                          set_bit(count, &tz->trips_disabled);
>          }
>
> Should I submit it or do you have a better idea?

Thanks for reporting this. I think the fix you are proposing is correct with respect to the previous behavior.

However, I disagree with commit 81ad4276b505, because it defines zero as an invalid trip point temperature. Some platforms have warming devices: when the temperature is too cold, e.g. 0°C, we enable the warming device in order to stay within the functioning temperature range.

Other devices can do the same with negative temperature values.

This feature is not yet upstream and the rework of the trip point should allow proper handling of cold trip points.
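
To illustrate the concern, here is a small user-space sketch contrasting the "zero means invalid" test with a test against an explicit sentinel. It is not kernel code and not something proposed in this thread. The helper names are hypothetical, and TEMP_INVALID is an assumption modeled on the kernel's THERMAL_TEMP_INVALID (-274000 millidegrees Celsius, i.e. below absolute zero).

```c
#include <stdbool.h>

/*
 * Assumption: a sentinel below absolute zero, in millidegrees Celsius,
 * modeled on the kernel's THERMAL_TEMP_INVALID. No real trip point can
 * have this temperature.
 */
#define TEMP_INVALID (-274000)

/* Behavior of commit 81ad4276b505: a trip at exactly 0 is treated as invalid. */
static bool trip_disabled_if_zero(int temp_mC)
{
	return temp_mC == 0;
}

/* Hypothetical alternative: only the explicit sentinel marks a trip invalid. */
static bool trip_disabled_if_sentinel(int temp_mC)
{
	return temp_mC == TEMP_INVALID;
}
```

With the zero test, a legitimate 0°C trip for a warming device is disabled, while a trip at -20°C is not, which is inconsistent. With a sentinel test, both cold trips stay enabled.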

If you can send the change to fix the regression that would be great.

But keep in mind that the driver is relying on internal thermal framework behavior. The trips_disabled bitmap exists only to work around a trip point description bug, and you should not rely on it. Likewise, you should not change the trip points on the fly after they are registered.

Actually, the mlxsw driver should just build a valid array of trip points, without the 0°C trip point, and pass it to thermal_zone_device_register_with_trips(). That would be a proper change that does not rely on a side effect of the 0°C workaround for the thermal trip bug.
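
As a rough sketch of that idea (user-space C, not the actual mlxsw code): filter the unused slots out before registration, so that only valid trips are ever handed to the core. The struct trip type, the enum, and build_valid_trips() are hypothetical stand-ins, not the kernel's struct thermal_trip or any existing helper.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's trip description. */
enum trip_type { TRIP_ACTIVE, TRIP_PASSIVE, TRIP_HOT, TRIP_CRITICAL };

struct trip {
	int temperature;	/* millidegrees Celsius */
	int hysteresis;		/* millidegrees Celsius */
	enum trip_type type;
};

/*
 * Copy into dst only the trips whose temperature differs from the
 * driver-private "unused slot" marker (0 in the case discussed here).
 * dst must have room for n entries. Returns the number of trips copied,
 * which is the count a driver would then pass at registration time.
 */
static size_t build_valid_trips(const struct trip *src, size_t n,
				struct trip *dst, int unused_temp)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (src[i].temperature == unused_temp)
			continue;	/* skip slots the driver never filled in */
		dst[count++] = src[i];
	}

	return count;
}
```

The point of the design is that the "0 means unused" convention stays private to the driver, so the core never has to guess whether a 0°C trip is real.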



--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog



