Re: [PATCH v1 12/17] platform/x86: acerhdf: Use the .should_bind() thermal zone callback

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Peter,

On Mon, Aug 12, 2024 at 6:15 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
>
> On Mon, Aug 12, 2024 at 4:56 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> >
> > On Tue, Aug 6, 2024 at 12:10 AM Peter Kästle <peter@xxxxxxxx> wrote:
> > >
> > > Hi Rafael,
> > >
> > > On 01.08.24 12:14, Rafael J. Wysocki wrote:
> > > > Hi Peter,
> > > >
> > > > On Wed, Jul 31, 2024 at 10:50 PM Peter Kästle <xypiie@xxxxxxxxx> wrote:
> > > >>
> > > >> Hi Rafael,
> > > >>
> > > >> On 30.07.24 20:33, Rafael J. Wysocki wrote:
> > > >>> From: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > > >>>
> > > >>> Make the acerhdf driver use the .should_bind() thermal zone
> > > >>> callback to provide the thermal core with the information on whether or
> > > >>> not to bind the given cooling device to the given trip point in the
> > > >>> given thermal zone.  If it returns 'true', the thermal core will bind
> > > >>> the cooling device to the trip and the corresponding unbinding will be
> > > >>> taken care of automatically by the core on the removal of the involved
> > > >>> thermal zone or cooling device.
> > > >>>
> > > >>> The previously existing acerhdf_bind() function bound cooling devices
> > > >>> to thermal trip point 0 only, so the new callback needs to return 'true'
> > > >>> for trip point 0.  However, it is straightforward to observe that trip
> > > >>> point 0 is an active trip point and the only other trip point in the
> > > >>> driver's thermal zone is a critical one, so it is sufficient to return
> > > >>> 'true' from that callback if the type of the given trip point is
> > > >>> THERMAL_TRIP_ACTIVE.
> > > >>>
> > > >>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > > >>
> > > >> Thanks for including me on the review.
> > > >> I'm working on it, but unfortunately the refactoring of the thermal layer
> > > >> around gov_bang_bang.c earlier this year broke acerhdf.
> > > >
> > > > Well, sorry about that.
> > >
> > > I already fixed the main problem, which caused full malfunction of acerhdf:
> > >
> > > The new functionality of .trip_crossed in the gov_bang_bang is missing an
> > > initial check, whether the temperature is below the fanoff temperature
> > > (trip_point.temperature - hysteresis) and by that it did not turn the
> > > fan off.
> >
> > So IIUC this is when the fan starts in the "on" state and the thermal
> > core is expected to turn it off when the zone temperature is not in
> > fact above the trip point low temperature.
> >
> > > This then caused that the system will never heat up as much to
> > > cross the upper temperature. As a consequence it could never cross the
> > > lower temperature to turn the fan off. -> Fan was locked always on.
> > > And that's obviously not what we want.
> >
> > Sure.
> >
> > > As I didn't find any API call, to ask the governor doing that for me, I
> > > added an "acerhdf_init_fan()" functionality into acerhdf init function right
> > > after registering the thermal zone (and on resume from suspend) which turns
> > > the fan off if the temperature is lower than the fanoff parameter.
> > > Probably not the nicest solution, but maybe the most pragmatic one without
> > > touching the thermal layer.
> >
> > Well, this issue may not be limited to the acerhdf case, so it may be
> > good to address it in the thermal core.  There is kind of a
> > chicken-and-egg situation in there, however, because the cooling
> > device state is determined by the governor which only runs when it is
> > called, but the core doesn't know that the governor should be invoked
> > when there are no trip point crossing events.
> >
> > It may just be a matter of adding an ->update_tz() callback to the
> > bang-bang governor, let me see.

This isn't the right approach because .update_tz() is called before
checking the zone temperature for the first time, but to remedy the
issue at hand, code needs to run when the zone temperature is known to
the thermal core.

This means that the Bang-bang governor needs a .manage() callback in
addition to the .trip_crossed() one, but that callback will only need
to check if the states of cooling devices bound to the trip points
below the zone temperature need to be adjusted, and just once.

So something like in the attached patch.
---
 drivers/thermal/gov_bang_bang.c |   64 ++++++++++++++++++++++++++++++----------
 1 file changed, 49 insertions(+), 15 deletions(-)

Index: linux-pm/drivers/thermal/gov_bang_bang.c
===================================================================
--- linux-pm.orig/drivers/thermal/gov_bang_bang.c
+++ linux-pm/drivers/thermal/gov_bang_bang.c
@@ -13,6 +13,28 @@
 
 #include "thermal_core.h"
 
+static void bang_bang_set_instance_target(struct thermal_instance *instance,
+					  unsigned int target)
+{
+	if (instance->target != 0 && instance->target != 1 &&
+	    instance->target != THERMAL_NO_TARGET)
+		pr_debug("Unexpected state %ld of thermal instance %s in bang-bang\n",
+			 instance->target, instance->name);
+
+	/*
+	 * Enable the fan when the trip is crossed on the way up and disable it
+	 * when the trip is crossed on the way down.
+	 */
+	instance->target = target;
+	instance->initialized = true;
+
+	dev_dbg(&instance->cdev->device, "target=%ld\n", instance->target);
+
+	mutex_lock(&instance->cdev->lock);
+	instance->cdev->updated = false; /* cdev needs update */
+	mutex_unlock(&instance->cdev->lock);
+}
+
 /**
  * bang_bang_control - controls devices associated with the given zone
  * @tz: thermal_zone_device
@@ -54,25 +76,36 @@ static void bang_bang_control(struct the
 		tz->temperature, trip->hysteresis);
 
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
-		if (instance->trip != trip)
-			continue;
+		if (instance->trip == trip)
+			bang_bang_set_instance_target(instance, crossed_up);
+	}
+}
 
-		if (instance->target != 0 && instance->target != 1 &&
-		    instance->target != THERMAL_NO_TARGET)
-			pr_debug("Unexpected state %ld of thermal instance %s in bang-bang\n",
-				 instance->target, instance->name);
-
-		/*
-		 * Enable the fan when the trip is crossed on the way up and
-		 * disable it when the trip is crossed on the way down.
-		 */
-		instance->target = crossed_up;
-
-		dev_dbg(&instance->cdev->device, "target=%ld\n", instance->target);
-
-		mutex_lock(&instance->cdev->lock);
-		instance->cdev->updated = false; /* cdev needs update */
-		mutex_unlock(&instance->cdev->lock);
+static void bang_bang_manage(struct thermal_zone_device *tz)
+{
+	const struct thermal_trip_desc *td;
+	struct thermal_instance *instance;
+
+	for_each_trip_desc(tz, td) {
+		const struct thermal_trip *trip = &td->trip;
+
+		if (tz->temperature >= td->threshold ||
+		    trip->temperature == THERMAL_TEMP_INVALID ||
+		    trip->type == THERMAL_TRIP_CRITICAL ||
+		    trip->type == THERMAL_TRIP_HOT)
+			continue;
+
+		/*
+		 * If the initial cooling device state is "on", but the zone
+		 * temperature is not above the trip point, the core will not
+		 * call bang_bang_control() until the zone temperature reaches
+		 * the trip point temperature which may be never.  In those
+		 * cases, set the initial state of the cooling device to 0.
+		 */
+		list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
+			if (!instance->initialized && instance->trip == trip)
+				bang_bang_set_instance_target(instance, 0);
+		}
 	}
 
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node)
@@ -82,5 +115,6 @@ static void bang_bang_control(struct the
 static struct thermal_governor thermal_gov_bang_bang = {
 	.name		= "bang_bang",
 	.trip_crossed	= bang_bang_control,
+	.manage		= bang_bang_manage,
 };
 THERMAL_GOVERNOR_DECLARE(thermal_gov_bang_bang);

[Index of Archives]     [Linux Kernel Development]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux