Re: [PATCH] x86: Downgrade clock throttling thermal event critical error

Quoting Tvrtko Ursulin (2018-10-10 12:59:59)
> 
> On 09/10/2018 12:37, Chris Wilson wrote:
> > Under CI testing, it is common for the cpus to overheat with the
> > continuous workloads and end up being throttled. As the cpus still
> > function, throttling is less a critical error meriting urgent action
> > than an expected yet significant condition (pr_notice).
> > 
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Petri Latvala <petri.latvala@xxxxxxxxx>
> > ---
> >   arch/x86/kernel/cpu/mcheck/therm_throt.c | 8 ++++----
> >   1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
> > index 2da67b70ba98..bc57b5988589 100644
> > --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
> > +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
> > @@ -184,10 +184,10 @@ static void therm_throt_process(bool new_event, int event, int level)
> >       /* if we just entered the thermal event */
> >       if (new_event) {
> >               if (event == THERMAL_THROTTLING_EVENT)
> > -                     pr_crit("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n",
> > -                             this_cpu,
> > -                             level == CORE_LEVEL ? "Core" : "Package",
> > -                             state->count);
> > +                     pr_notice("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n",
> > +                               this_cpu,
> > +                               level == CORE_LEVEL ? "Core" : "Package",
> > +                               state->count);
> >               return;
> >       }
> >       if (old_event) {
> > 
> 
> It even sounds like it wouldn't be far-fetched to argue that these days notice is
> the correct log level for thermal throttling. Unless there are more 
> sources of throttling messages. TBC when I get back to my Skull Canyon. 
> That one certainly logs something like this shortly after invoking make -j8.

I was thinking of tarting up the language to say that most processors
nowadays can easily exceed their Thermal Design Point and are built with
that in mind. The caveat is making sure that the shutdown limit is still
reported as a critical event; iirc that comes as an MCE.
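
The split described above (throttling at notice, shutdown limit still at
critical) can be sketched in userspace C. This is only an illustration and
not the kernel code: the enum thermal_event values, both helper functions
and the shutdown message text are hypothetical names made up for the
example. The numeric levels mirror the kernel's LOGLEVEL_CRIT (2) and
LOGLEVEL_NOTICE (5).

```c
#include <stdio.h>

/* Numeric severities mirroring the kernel's LOGLEVEL_CRIT / LOGLEVEL_NOTICE. */
enum loglevel { LOGLEVEL_CRIT = 2, LOGLEVEL_NOTICE = 5 };

/* Hypothetical event kinds, for illustration only. */
enum thermal_event {
	THERMAL_EVENT_THROTTLE,       /* clock throttled, CPU still functions */
	THERMAL_EVENT_SHUTDOWN_LIMIT, /* shutdown limit reached */
};

/* Pick the severity: only the shutdown limit remains critical. */
static int thermal_event_loglevel(enum thermal_event ev)
{
	if (ev == THERMAL_EVENT_SHUTDOWN_LIMIT)
		return LOGLEVEL_CRIT;
	return LOGLEVEL_NOTICE;
}

/*
 * Format a message with a "<level>" prefix, similar in spirit to how
 * printk encodes severity. The shutdown wording is an assumption.
 */
static int format_thermal_msg(char *buf, size_t len, enum thermal_event ev,
			      int cpu, unsigned long count)
{
	const char *what = (ev == THERMAL_EVENT_SHUTDOWN_LIMIT)
		? "temperature at shutdown limit"
		: "temperature above threshold, cpu clock throttled";

	return snprintf(buf, len, "<%d>CPU%d: %s (total events = %lu)",
			thermal_event_loglevel(ev), cpu, what, count);
}
```

With this shape, a log consumer that filters on severity (a CI harness,
say) would no longer flag ordinary throttling, while a shutdown-limit
event would still surface as critical.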
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx