Re: Machine check events




On further, further, further toying, I now have mcelog running on my 32-bit
CentOS 6 systems! I admit to doing it the "dumb" way: I grabbed the source
from the git repository, compiled and installed it, and THEN discovered
that the init.d file supplied with the source was not CentOS compatible, so
I grabbed the x86-64 RPM, extracted the startup files, and copied them into
place. The RPM was small enough to make this easy.
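
One common way to pull files out of an RPM without installing it is
rpm2cpio piped into cpio. The sketch below is only an illustration of that
approach, not necessarily what was done here: extract_rpm is a made-up helper
name, and the package file name and destination directory in the usage
comment are placeholders.

```shell
#!/bin/sh
# extract_rpm is a hypothetical helper name, not part of rpm or mcelog.
# Usage: extract_rpm RPM_FILE DEST_DIR
extract_rpm() {
    rpm_file="$1"
    dest="$2"
    if [ ! -f "$rpm_file" ]; then
        echo "extract_rpm: no such file: $rpm_file" >&2
        return 1
    fi
    # Make the path absolute, because we change directory before unpacking.
    case "$rpm_file" in
        /*) ;;
        *) rpm_file="$PWD/$rpm_file" ;;
    esac
    mkdir -p "$dest" || return 1
    # Run in a subshell so the caller's working directory is untouched.
    # cpio flags: -i extract, -d create directories, -m preserve mtimes.
    ( cd "$dest" && rpm2cpio "$rpm_file" | cpio -idm --quiet )
}

# Example (placeholder names):
#   extract_rpm mcelog-<version>.x86_64.rpm ./mcelog-rpm
```

After unpacking, the startup files should be somewhere under the etc/
directory of the destination and can be copied into place by hand.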

What I SHOULD have done is grab the source RPM, swap in the latest upstream
source, rebuild it, and install the resulting binary RPMs, which I could then
keep for future use. Maybe I will try that at some point, but I don't really
have time today.
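
For anyone with the same CONFIG_X86_MCE question quoted below, here is a
minimal sketch of one way to check. It assumes the distribution installs the
kernel build configuration as /boot/config-<release>, which is the usual
convention on CentOS but is an assumption here; kernel_option_enabled is a
made-up helper name.

```shell
#!/bin/sh
# kernel_option_enabled is a hypothetical helper name, not part of mcelog.
# Usage: kernel_option_enabled OPTION [CONFIG_FILE]
# Succeeds if OPTION is built in (=y) or built as a module (=m).
kernel_option_enabled() {
    option="$1"
    # Assumption: the running kernel's config lives at /boot/config-<release>.
    config="${2:-/boot/config-$(uname -r)}"
    grep -Eq "^${option}=(y|m)$" "$config" 2>/dev/null
}

if kernel_option_enabled CONFIG_X86_MCE; then
    echo "CONFIG_X86_MCE is enabled"
else
    echo "CONFIG_X86_MCE is not set (or the config file was not found)"
fi
```

An option that is disabled normally appears in that file as a comment of the
form "# CONFIG_... is not set", which is why the pattern is anchored to the
start of the line.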

-G.

On Nov 26, 2013, at 11:11 AM, Glenn Eychaner <geychaner@xxxxxxx> wrote:

> On further, further investigation, it looks like, according to the mcelog
> install guide at http://www.mcelog.org/installation.html, I could "roll my
> own" for 32-bit CentOS 6:
> 
> "For bad page offlining you will need a 2.6.33+ kernel or a 2.6.32 kernel with
> the soft offlining capability backported (like RHEL6 or SLES11-SP1)"
> "The kernel has to have CONFIG_X86_MCE enabled. For 32bit kernels you
> need at least a 2.6.30 kernel."
> 
> The current kernel I am running is 2.6.32-358.23.2, but I can't tell whether it
> has CONFIG_X86_MCE enabled. How can I find this out?
> 
> JD writes:
> 
>> yum info mcelog
>> ...
>> Description : mcelog is a daemon that collects and decodes Machine Check
>>            : Exception data on x86-64 machines.
>> 
>> So not for 32-bit...
> 
> On Nov 26, 2013, at 9:25 AM, Glenn Eychaner <geychaner@xxxxxxx> wrote:
> 
>> Further investigation seems to indicate that these events should be handled
>> by "mcelog" or "mced". However, there is no /var/log/mcelog, nor do I have a
>> "mcelog" or "mced" binary, nor does yum seem to contain anything related
>> (based on "yum whatprovides '*/mcelog'" and similar queries).
>> 
>> Thus, I still don't know what to do with these errors.  Ignore them? I am
>> running 32-bit CentOS 6.4 (legacy software reasons).
>> 
>> On Nov 25, 2013, at 11:05 AM, Glenn Eychaner <geychaner@xxxxxxx> wrote:
>> 
>>> On my new Haswell-based machines, I am occasionally seeing entries like the
>>> following in /var/log/messages:
>>> 	kernel: [Hardware Error]: Machine check events logged
>>> (I would not have even noticed them, except that they get flagged by logwatch.)
>>> These messages always occur alone, and don't seem to have a corresponding
>>> entry in any other log file in /var/log. How can I get more info about these
>>> messages?

--
Glenn Eychaner (geychaner@xxxxxx)
Telescope Systems Programmer, Las Campanas Observatory







