Re: [RFC PATCH] thermal: Schedule a backup thermal shutdown workqueue after a known period of time to tackle failed poweroff

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Nishanth,

On Monday 28 December 2015 11:11 PM, Nishanth Menon wrote:
On 12/20/2015 11:46 PM, Keerthy wrote:

+linux-pm.

Thanks for looping!


In few rare conditions like during boot up the orderly_poweroff
function might not be able to complete fully leaving the device
running at dangerously high temperatures. Hence adding a backup
workqueue to act after a known period of time and poweroff the device.



Suggested-by: Nishanth Menon <nm@xxxxxx>
Reported-by: Nishanth Menon <nm@xxxxxx>

The specific case I hit was a critical temperature event as soon as
the hwmon device was probed (the driver tmp102 was a kernel module).

Signed-off-by: Keerthy <j-keerthy@xxxxxx>
---
  drivers/thermal/Kconfig        |  9 +++++++++
  drivers/thermal/thermal_core.c | 26 ++++++++++++++++++++++++++
  2 files changed, 35 insertions(+)

diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 8cc4ac6..25584ee 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -92,6 +92,15 @@ config THERMAL_DEFAULT_GOV_POWER_ALLOCATOR

  endchoice

+config THERMAL_BKUP_SHUTDOWN_DELAY_MS
+        int "Thermal shutdown  backup delay in milli-seconds"
+        depends on THERMAL
+        default "5000"
+        ---help---
+        The number of milliseconds to delay after orderly_poweroff call.

Probably needs a rephrase.

Delay in milliseconds before the backup thermal shutdown is triggered.


+
+        Default: 5000 (5 seconds)
+
  config THERMAL_GOV_FAIR_SHARE
  	bool "Fair-share thermal governor"
  	help
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index d9e525c..b793857 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -61,6 +61,12 @@ static DEFINE_MUTEX(thermal_governor_lock);

  static struct thermal_governor *def_governor;

+#ifdef CONFIG_THERMAL_BKUP_SHUTDOWN_DELAY_MS
+#define BKUP_SHUTDOWN_DELAY CONFIG_THERMAL_BKUP_SHUTDOWN_DELAY_MS
+#else
+#define BKUP_SHUTDOWN_DELAY 5000
+#endif
+

I am not sure if this #ifdeffery is even needed.


Eduardo, Rui: If this is not the suggested technique, maybe you guys
could suggest how we could handle a case where userspace might be
hungup due to some reason and a case where a critical temperature
event in the middle of device probe was triggered?

Obviously, we'd like to take into consideration userspace latencies as
well- but that is very specific to fs being run.. not really a simple
problem, IMHO..

  static struct thermal_governor *__find_governor(const char *name)
  {
  	struct thermal_governor *pos;
@@ -423,6 +429,18 @@ static void handle_non_critical_trips(struct thermal_zone_device *tz,
  		       def_governor->throttle(tz, trip);
  }

+static void bkup_shutdown_func(struct work_struct *unused);
+static DECLARE_DELAYED_WORK(bkup_shutdown_work, bkup_shutdown_func);
+
+static void bkup_shutdown_func(struct work_struct *work)
+{
+	pr_warn("Orderly_poweroff has failed! Attempting kernel_power_off\n");
+	kernel_power_off();
+
+	pr_warn("kernel_power_off has failed! Attempting emergency_restart\n");
+	emergency_restart();

I think documentation is necessary that we are hoping for bootloader
to be able to detect and halt as needed -> risk here obviously is an
infinite reboot loop :( .

Agreed.



+}
+
  static void handle_critical_trips(struct thermal_zone_device *tz,
  				int trip, enum thermal_trip_type trip_type)
  {
@@ -443,6 +461,14 @@ static void handle_critical_trips(struct thermal_zone_device *tz,
  		dev_emerg(&tz->device,
  			  "critical temperature reached(%d C),shutting down\n",
  			  tz->temperature / 1000);
+		/**

needs to be kernel doc style?

+		 * This is a backup option to shutdown the
+		 * system in case orderly_poweroff
+		 * fails
Maybe adjust to 80char?

Okay.


+		 */
+		schedule_delayed_work(&bkup_shutdown_work,
+				      msecs_to_jiffies(BKUP_SHUTDOWN_DELAY));
+
  		orderly_poweroff(true);
  	}
  }




I will wait for Eduardo/Rui's inputs before posting the next version.

Best Regards,
Keerthy
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Arm (vger)]     [ARM Kernel]     [ARM MSM]     [Linux Tegra]     [Linux WPAN Networking]     [Linux Wireless Networking]     [Maemo Users]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux