On Tue, Apr 12, 2016 at 3:15 PM, Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
> On Tue, Apr 12, 2016 at 2:41 PM, CS DBA <cs_dba@xxxxxxxxxxxxxxxxxxx> wrote:
>> Hi all;
>>
>> Daily I see anywhere from 10 - 50 or more of these alerts via abrt. I'm
>> running Fedora 23 (KDE Spin) fully up to date on a Lenovo X1 carbon 3rd gen.
>> In all the alerts I see that the trip temp was exceeded, and then dropped
>> below the trip temp within 1 second.
>> Below is a sample of the output.
>>
>> Should I be concerned? If not can I disable the popup alert? If so,
>> recommendations? should I consider cleaning & replacing the cpu thermal
>> compound?
>>
>> Thanks in advance
>>
>> The kernel log indicates that hardware errors were detected.
>> System log may have more information.
>> The last 20 mcelog lines of system log are:
>> ==========================================
>> Apr 12 14:33:19 F23-host mcelog: Please check your system cooling.
>> Performance will be impacted
>> Apr 12 14:33:19 F23-host mcelog: STATUS 8812080b MCGSTATUS 0
>> Apr 12 14:33:19 F23-host mcelog: MCGCAP 1000c07 APICID 1 SOCKETID 0
>> Apr 12 14:33:19 F23-host mcelog: CPUID Vendor Intel Family 6 Model 61
>> Apr 12 14:33:19 F23-host mcelog: Hardware event. This is not a software
>> error.
>> Apr 12 14:33:19 F23-host mcelog: MCE 0
>> Apr 12 14:33:19 F23-host mcelog: CPU 1 THERMAL EVENT TSC 134e63eb3c2f
>> Apr 12 14:33:19 F23-host mcelog: TIME 1460493199 Tue Apr 12 14:33:19 2016
>> Apr 12 14:33:19 F23-host mcelog: Processor 1 below trip temperature.
>> Throttling disabled
>> Apr 12 14:33:19 F23-host mcelog: STATUS 8813080a MCGSTATUS 0
>> Apr 12 14:33:19 F23-host mcelog: MCGCAP 1000c07 APICID 1 SOCKETID 0
>> Apr 12 14:33:19 F23-host mcelog: CPUID Vendor Intel Family 6 Model 61
>> Apr 12 14:33:19 F23-host mcelog: Hardware event. This is not a software
>> error.
>> Apr 12 14:33:19 F23-host mcelog: MCE 1
>> Apr 12 14:33:19 F23-host mcelog: CPU 0 THERMAL EVENT TSC 134e63eb83f6
>> Apr 12 14:33:19 F23-host mcelog: TIME 1460493199 Tue Apr 12 14:33:19 2016
>> Apr 12 14:33:19 F23-host mcelog: Processor 0 below trip temperature.
>> Throttling disabled
>> Apr 12 14:33:19 F23-host mcelog: STATUS 8813080a MCGSTATUS 0
>> Apr 12 14:33:19 F23-host mcelog: MCGCAP 1000c07 APICID 0 SOCKETID 0
>> Apr 12 14:33:19 F23-host mcelog: CPUID Vendor Intel Family 6 Model 61
>
> I don't think any MCE event should be ignored. There's just no way to know
> if it's bogus or not. I get these on my machine periodically to no ill
> effect and there's an upstream kernel bug with no response for years.
> I thought it should be true that the hardware itself won't allow an
> overheat, either GPU or CPU, and yet these messages suggest otherwise.
>
> What I've been doing is using thermald which you can get from copr.
> https://copr.fedorainfracloud.org/coprs/hadrons123/thermald/
>
> However, what I just realized is that it says that the 1.5.3 version build
> has failed, and yet rpm -q shows 1.5.3 is on my system. And 'thermald
> --version' also shows it's 1.5.3 and it's running. But it doesn't seem
> to be working or producing the same messages it used to, where it'd
> throttle the CPU automatically. Hmmm, any wonder why it seems hotter
> than usual.

Crap! Well, it appears to be doing something still if I relaunch it in
debug mode. I guess the new build is just not as verbose by default as
the previous build.

What I still don't get is how the copr build state is failed, and yet I
have that same build installed.

-- 
Chris Murphy
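
For anyone who wants to see how often each core actually crosses the trip point, rather than counting abrt popups, here is a rough Python sketch that tallies the "trip temperature" lines from mcelog output like the sample quoted above. The function name `count_trip_events` and the regex are my own illustration, not part of mcelog or abrt. The sample lines are copied from this thread; the exact wording of the matching "above trip temperature" message is an assumption, so the pattern only looks for the words "above" or "below".

```python
import re
from collections import Counter

# Lines copied from the mcelog excerpt quoted in this thread.
SAMPLE = """\
Apr 12 14:33:19 F23-host mcelog: CPU 1 THERMAL EVENT TSC 134e63eb3c2f
Apr 12 14:33:19 F23-host mcelog: Processor 1 below trip temperature.
Apr 12 14:33:19 F23-host mcelog: CPU 0 THERMAL EVENT TSC 134e63eb83f6
Apr 12 14:33:19 F23-host mcelog: Processor 0 below trip temperature.
"""

# Assumption: the "above" message also contains "Processor N" followed
# later by "above trip temperature"; only the "below" form is confirmed
# by the log in this thread.
TRIP_RE = re.compile(
    r"mcelog: Processor (\d+)\b.*?\b(above|below) trip temperature"
)


def count_trip_events(lines):
    """Return a Counter keyed by (cpu_number, 'above' or 'below')."""
    counts = Counter()
    for line in lines:
        match = TRIP_RE.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    events = count_trip_events(SAMPLE.splitlines())
    for (cpu, direction), n in sorted(events.items()):
        print(f"CPU {cpu}: {n} x {direction} trip temperature")
```

Pass it the lines of whatever file or command output holds your mcelog messages. It only counts events; it says nothing about whether the throttling is harmful.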