On 2014-11-19 22:53, Rasmus Liland wrote:
> On 2014-11-19 21:41, Mark Lee wrote:
> >
> > To Rasmus,
> >
> > Can you run the parts where it says "run the above through mcelog
> > --ascii" and post the contents?
> >
> > Regards,
> > Mark
> >
>
> I'm attaching the output of mcelog to this message. However, I'm unsure of
> the usefulness of the output.
>

I checked dmesg now after having uptime of ...

> rasmus@angrist ~ % uptime
>  02:04:01 up 1 day, 7:35, 1 user, load average: 0.04, 0.15, 0.40
> rasmus@angrist ~ % uname -a
> Linux angrist 3.11.5-1-ARCH #1 SMP PREEMPT Mon Oct 14 08:31:43 CEST 2013
> x86_64 GNU/Linux

... about 26 hours. It seems that after about 19 hours, some (possibly)
temperature-related events caused MCE hardware errors over a ten-minute
interval:

> [70133.209654] mce: [Hardware Error]: Machine check events logged
> [70376.833053] CPU2: Core temperature above threshold, cpu clock throttled (total events = 30628)
> [70376.833056] CPU3: Core temperature above threshold, cpu clock throttled (total events = 30628)
> [70376.833061] CPU3: Package temperature above threshold, cpu clock throttled (total events = 174126)
> [70376.833070] CPU2: Package temperature above threshold, cpu clock throttled (total events = 174126)
> [70376.833074] CPU1: Package temperature above threshold, cpu clock throttled (total events = 174126)
> [70376.833077] CPU0: Package temperature above threshold, cpu clock throttled (total events = 174124)
> [70376.835060] CPU3: Core temperature/speed normal
> [70376.835064] CPU2: Core temperature/speed normal
> [70376.835070] CPU2: Package temperature/speed normal
> [70376.835074] CPU3: Package temperature/speed normal
> [70376.835087] CPU1: Package temperature/speed normal
> [70376.835090] CPU0: Package temperature/speed normal
> [70433.353800] mce: [Hardware Error]: Machine check events logged
> [70676.969501] CPU2: Core temperature/speed normal
> [70676.969505] CPU3: Core temperature/speed normal
> [70676.969511] CPU0: Package temperature above threshold, cpu clock throttled (total events = 198545)
> [70676.969516] CPU1: Package temperature above threshold, cpu clock throttled (total events = 198547)
> [70676.969522] CPU3: Package temperature above threshold, cpu clock throttled (total events = 198547)
> [70676.969545] CPU2: Package temperature above threshold, cpu clock throttled (total events = 198547)
> [70676.970519] CPU0: Package temperature/speed normal
> [70676.970522] CPU2: Package temperature/speed normal
> [70676.970524] CPU3: Package temperature/speed normal
> [70676.970526] CPU1: Package temperature/speed normal
> [70733.497978] mce: [Hardware Error]: Machine check events logged

Since the system did not reboot, it was apparently able to recover on its
own.

-- 
Rasmus Liland, jrl@xxxxxxxxxxxxx, jens.rasmus.liland@xxxxxxx
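
For anyone who wants to double-check the timing read off the dmesg output above (first machine check after about 19 hours, three events within about ten minutes), here is a minimal Python sketch. It only assumes the usual "[seconds.microseconds] message" dmesg line format; the names LOG, STAMP and mce_timestamps are made up for this illustration and are not part of mcelog or any other tool.

```python
import re

# The three "Machine check events logged" lines, copied from the dmesg
# output in this message.
LOG = """\
[70133.209654] mce: [Hardware Error]: Machine check events logged
[70433.353800] mce: [Hardware Error]: Machine check events logged
[70733.497978] mce: [Hardware Error]: Machine check events logged
"""

# Assumed line format: "[<seconds since boot>] <message>".
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def mce_timestamps(text):
    """Return seconds-since-boot for each 'Machine check events logged' line."""
    stamps = []
    for line in text.splitlines():
        m = STAMP.match(line.strip())
        if m and "Machine check events logged" in m.group(2):
            stamps.append(float(m.group(1)))
    return stamps


stamps = mce_timestamps(LOG)
first_hours = stamps[0] / 3600               # uptime at the first event
span_minutes = (stamps[-1] - stamps[0]) / 60  # first event to last event
gaps = [b - a for a, b in zip(stamps, stamps[1:])]

print(f"first MCE after {first_hours:.1f} h, span {span_minutes:.1f} min")
print("gaps between events (s):", [round(g) for g in gaps])
```

With the timestamps above this gives roughly 19.5 hours for the first event and a span of about 10 minutes, with the three events spaced about 300 seconds apart.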