On Mon, Jul 1, 2024 at 5:13 PM Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> wrote: > > On 28/06/2024 15:56, Rafael J. Wysocki wrote: > > On Thu, Jun 27, 2024 at 10:55 AM Daniel Lezcano > > <daniel.lezcano@xxxxxxxxxx> wrote: > >> > >> Currently the thermal framework has 4 trip point types: > >> > >> - active : basically for fans (or anything requiring energy to cool > >> down) > >> > >> - passive : a performance limiter > >> > >> - hot : for a last action before reaching critical > >> > >> - critical : a without return threshold leading to a system shutdown > >> > >> A thermal zone monitors the temperature regarding these trip > >> points. The old way to do that is actively polling the temperature > >> which is very bad for embedded systems, especially mobile and it is > >> even worse today as we can have more than fifty thermal zones. The > >> modern way is to rely on the driver to send an interrupt when the trip > >> points are crossed, so the system can sleep while the temperature > >> monitoring is offloaded to a dedicated hardware. > >> > >> However, the thermal aspect is also managed from userspace to protect > >> the user, especially tracking down the skin temperature sensor. The > >> logic is more complex than what we found in the kernel because it > >> needs multiple sources indicating the thermal situation of the entire > >> system. > >> > >> For this reason it needs to setup trip points at different levels in > >> order to get informed about what is going on with some thermal zones > >> when running some specific application. > >> > >> For instance, the skin temperature must be limited to 43°C on a long > >> run but can go to 48°C for 10 minutes, or 60°C for 1 minute. > >> > >> The thermal engine must then rely on trip points to monitor those > >> temperatures. Unfortunately, today there is only 'active' and > >> 'passive' trip points which has a specific meaning for the kernel, not > >> the userspace. That leads to hacks in different platforms for mobile > >> and embedded systems where 'active' trip points are used to send > >> notification to the userspace. This is obviously not right because > >> these trip are handled by the kernel. > >> > >> This patch introduces the 'user' trip point type where its semantic is > >> simple: do nothing at the kernel level, just send a notification to > >> the user space. > >> > >> Signed-off-by: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> > >> --- > >> .../devicetree/bindings/thermal/thermal-zones.yaml | 1 + > >> drivers/thermal/thermal_core.c | 8 ++++++++ > >> drivers/thermal/thermal_of.c | 1 + > >> drivers/thermal/thermal_trace.h | 4 +++- > >> drivers/thermal/thermal_trip.c | 1 + > >> include/uapi/linux/thermal.h | 1 + > >> 6 files changed, 15 insertions(+), 1 deletion(-) > >> > >> diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml > >> index 68398e7e8655..cb9ea54a192e 100644 > >> --- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml > >> +++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml > >> @@ -153,6 +153,7 @@ patternProperties: > >> type: > >> $ref: /schemas/types.yaml#/definitions/string > >> enum: > >> + - user # enable user notification > >> - active # enable active cooling e.g. fans > >> - passive # enable passive cooling e.g. throttling cpu > >> - hot # send notification to driver > >> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c > >> index 2aa04c46a425..506f880d9aa9 100644 > >> --- a/drivers/thermal/thermal_core.c > >> +++ b/drivers/thermal/thermal_core.c > >> @@ -734,6 +734,14 @@ int thermal_bind_cdev_to_trip(struct thermal_zone_device *tz, > >> if (tz != pos1 || cdev != pos2) > >> return -EINVAL; > >> > >> + /* > >> + * It is not allowed to bind a cooling device with a trip > >> + * point user type because no mitigation should happen from > >> + * the kernel with these trip points > >> + */ > >> + if (trip->type == THERMAL_TRIP_USER) > >> + return -EINVAL; > > > > Maybe print a debug message when bailing out here? > > After thinking a bit about the message, it sounds to me that is a really > an error in the firmware if we end up binding an 'user' trip point. > > What about the following message: > > dev_err(tz->device, "Trying to bind the cooling device '%s' with an > 'user' trip point id=%d", cdev->type, trip->id); s/an// I think. Also I wouldn't use dev_err() as it indicates a kernel issue. Maybe dev_info(tz->device, FW_BUG ...)?