Re: [PATCH v2] thermal: tegra-bpmp: Handle offline zones

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/30/23 15:36, Daniel Lezcano wrote:
On 30/03/2023 12:06, Mikko Perttunen wrote:
On 3/30/23 13:03, Daniel Lezcano wrote:
On 30/03/2023 11:49, Mikko Perttunen wrote:
From: Mikko Perttunen <mperttunen@xxxxxxxxxx>

Thermal zones located in power domains may not be accessible when
the domain is powergated. In this situation, reading the temperature
will return -BPMP_EFAULT. When evaluating trips, BPMP will internally
use -256C as the temperature for offline zones.

For smooth operation, for offline zones, return -EAGAIN when reading
the temperature and allow registration of zones even if they are
offline during probe.

I think it makes more sense to check if the power domain associated with the device is powered up and if not return -EPROBE_DEFER.

The power domains in question are related to computer vision engines that only get powered on when in use, possibly never if the user doesn't run a computer vision workload on the system. We still want other thermal zones to be available.

Ok, I see the point.

I'm worried about the semantic of the errors returned, the translation from BPMP_EFAULT to EAGAIN and the assumption it is a disabled (may be forever) thermal zone.

What does the documentation say for the error msg.rx.ret == -BPMP_EFAULT?


The documentation says

Value          | Description
-------------- | -----------------------------------------
0              | Temperature query succeeded.
-#BPMP_EINVAL  | Invalid request parameters.
-#BPMP_ENOENT  | No driver registered for thermal zone.
-#BPMP_EFAULT  | Problem reading temperature measurement.

In practice, what BPMP_EFAULT means here is that the hardware has no indicated temperature for the zone, which really only happens if the power domain is powered off.

Mikko


Signed-off-by: Mikko Perttunen <mperttunen@xxxxxxxxxx>
---
v2:
* Adjusted commit message.
* Patch 2/2 dropped for now since it is more controversial,
   and this patch is more critical.

  drivers/thermal/tegra/tegra-bpmp-thermal.c | 9 ++++++++-
  1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/tegra/tegra-bpmp-thermal.c b/drivers/thermal/tegra/tegra-bpmp-thermal.c
index f5fd4018f72f..4ffc3bb3bf35 100644
--- a/drivers/thermal/tegra/tegra-bpmp-thermal.c
+++ b/drivers/thermal/tegra/tegra-bpmp-thermal.c
@@ -52,6 +52,8 @@ static int __tegra_bpmp_thermal_get_temp(struct tegra_bpmp_thermal_zone *zone,
      err = tegra_bpmp_transfer(zone->tegra->bpmp, &msg);
      if (err)
          return err;
+    if (msg.rx.ret == -BPMP_EFAULT)
+        return -EAGAIN;
      if (msg.rx.ret)
          return -EINVAL;
@@ -259,7 +261,12 @@ static int tegra_bpmp_thermal_probe(struct platform_device *pdev)
          zone->tegra = tegra;
          err = __tegra_bpmp_thermal_get_temp(zone, &temp);
-        if (err < 0) {
+
+        /*
+         * Sensors in powergated domains may temporarily fail to be read +         * (-EAGAIN), but will become accessible when the domain is powered on.
+         */
+        if (err < 0 && err != -EAGAIN) {
              devm_kfree(&pdev->dev, zone);
              continue;
          }







[Index of Archives]     [ARM Kernel]     [Linux ARM]     [Linux ARM MSM]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux