Re: Interpretation of a hardware error

On Thursday 12 April 2012 13.36.03 m.roth@xxxxxxxxx wrote:
> Hey, folks,
> 
> I've just started seeing
> Apr 12 13:09:59 <server> kernel: [Hardware Error]:
> MC4_STATUS[Over|CE|MiscV|-|AddrV|-|Poison|CECC]: 0xdd0accf2001d011b
> Apr 12 13:09:59 <server> kernel: [Hardware Error]: Northbridge Error (node
> 1, core 1): ECC error in L3 cache tag.

The error message certainly points to the CPU. The fact that the error 
happened in the cache tag, not the cache data, further implicates the CPU.

The message is quite specific and I'd say rather trustworthy...
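
For what it's worth, the flag list in the log can be cross-checked against 
the raw status value. Below is a rough sketch in Python. The bit positions 
are my assumptions, taken from the architectural x86 MCA status layout plus 
the AMD-specific bits the kernel appears to print. The helper names (bit, 
decode_flags) are made up for illustration and are not part of any tool.

```python
# Sketch only: bit positions are assumptions (architectural MCA status bits
# plus AMD-specific bits 46:45, 44 and 43), not taken from the original post.
STATUS = 0xdd0accf2001d011b  # raw MC4_STATUS value from the log line


def bit(value, n):
    """Return bit n of value as 0 or 1."""
    return (value >> n) & 1


def decode_flags(status):
    """Rebuild the bracketed flag list in the order the log prints it."""
    flags = [
        "Over" if bit(status, 62) else "-",      # error overflow
        "UE" if bit(status, 61) else "CE",       # uncorrected vs corrected
        "MiscV" if bit(status, 59) else "-",     # MISC register valid
        "PCC" if bit(status, 57) else "-",       # processor context corrupt
        "AddrV" if bit(status, 58) else "-",     # ADDR register valid
        "Deferred" if bit(status, 44) else "-",  # assumed AMD-specific bit
        "Poison" if bit(status, 43) else "-",    # assumed AMD-specific bit
    ]
    ecc = (status >> 45) & 0x3                   # assumed ECC bits 46:45
    if ecc:
        flags.append("CECC" if ecc == 2 else "UECC")
    return flags


print("|".join(decode_flags(STATUS)))
```

Under those assumptions the decoded flags agree with what the kernel 
printed, so at least the log line is internally consistent. That says 
nothing about whether the hardware reported the right thing.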

But there's also the possibility that the message is wrong (either something 
else went wrong or nothing really went wrong). In my experience, hardware fault 
error messages are quite unreliable, and at the end of the day DIMMs are 
orders of magnitude more likely to fail than CPUs...

/Peter
