[tip:x86/urgent] x86: mce: Fix thermal throttling message storm

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Commit-ID:  b417c9fd8690637f0c91479435ab3e2bf450c038
Gitweb:     http://git.kernel.org/tip/b417c9fd8690637f0c91479435ab3e2bf450c038
Author:     Ingo Molnar <mingo@xxxxxxx>
AuthorDate: Tue, 22 Sep 2009 15:50:24 +0200
Committer:  Ingo Molnar <mingo@xxxxxxx>
CommitDate: Tue, 22 Sep 2009 17:30:45 +0200

x86: mce: Fix thermal throttling message storm

If a system switches back and forth between hot and cold mode,
the MCE code will print a stream of critical kernel messages.

Extend the throttling code to properly notice this, by
only printing the first hot + cold transition and omitting
the rest up to CHECK_INTERVAL (5 minutes).

This way we'll only get a single incident of:

 [  102.356584] CPU0: Temperature above threshold, cpu clock throttled (total events = 1)
 [  102.357000] Disabling lock debugging due to kernel taint
 [  102.369223] CPU0: Temperature/speed normal

Every 5 minutes. The 'total events' count tells the number of cold/hot
transitions detected, should overheating occur after 5 minutes again:

[  402.357580] CPU0: Temperature above threshold, cpu clock throttled (total events = 24891)
[  402.358001] CPU0: Temperature/speed normal
[  450.704142] Machine check events logged

Cc: Hidetoshi Seto <seto.hidetoshi@xxxxxxxxxxxxxx>
Cc: Huang Ying <ying.huang@xxxxxxxxx>
Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>


---
 arch/x86/kernel/cpu/mcheck/therm_throt.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index db80b57..b3a1dba 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -42,6 +42,7 @@ struct thermal_state {
 
 	u64			next_check;
 	unsigned long		throttle_count;
+	unsigned long		last_throttle_count;
 };
 
 static DEFINE_PER_CPU(struct thermal_state, thermal_state);
@@ -120,11 +121,12 @@ static int therm_throt_process(bool is_throttled)
 	if (is_throttled)
 		state->throttle_count++;
 
-	if (!(was_throttled ^ is_throttled) &&
-			time_before64(now, state->next_check))
+	if (time_before64(now, state->next_check) &&
+			state->throttle_count != state->last_throttle_count)
 		return 0;
 
 	state->next_check = now + CHECK_INTERVAL;
+	state->last_throttle_count = state->throttle_count;
 
 	/* if we just entered the thermal event */
 	if (is_throttled) {
--
To unsubscribe from this list: send the line "unsubscribe linux-tip-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Stable Commits]     [Linux Stable Kernel]     [Linux Kernel]     [Linux USB Devel]     [Linux Video &Media]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux