Re: ipmi regression in 4.5?




Gavin Carr wrote:
On Thu, Jun 21, 2007 at 10:13:56AM +0100, James Pearson wrote:

Gavin Carr wrote:

I've been monitoring CPU temperature on a few Dell SC1435s running CentOS4
via OpenIPMI and 'ipmitool sdr'. It's been working very nicely, but the
upgrade to 4.5 not so long ago seems to have broken something:

# ipmitool sdr type Temperature
Temp             | 01h | ns  |  3.1 | Disabled
Planar Temp      | 04h | ok  |  7.1 | 30 degrees C
Temp Interface   | 53h | ns  |  7.1 | Disabled

The disabled sensors above used to work fine, and there have been no config
changes or bios upgrades or anything. All machines affected post 4.5.
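For anyone scripting a check around this, here is a minimal Python sketch that picks the non-reporting sensors out of output like the sample above. It assumes the five pipe-separated columns are name, sensor ID, status, entity ID and reading; the function names are made up for illustration and are not part of ipmitool or OpenIPMI.

```python
# Sample text copied from the 'ipmitool sdr type Temperature' output above.
SAMPLE = """\
Temp             | 01h | ns  |  3.1 | Disabled
Planar Temp      | 04h | ok  |  7.1 | 30 degrees C
Temp Interface   | 53h | ns  |  7.1 | Disabled
"""


def parse_sdr(text):
    """Split each pipe-separated row into a dict (column meanings are assumed)."""
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 5:
            continue  # skip blank or unexpected lines
        name, sensor_id, status, entity, reading = fields
        rows.append({
            "name": name,
            "id": sensor_id,
            "status": status,
            "entity": entity,
            "reading": reading,
        })
    return rows


def disabled_sensors(rows):
    """Return names of sensors whose status is 'ns' or whose reading is 'Disabled'."""
    return [
        row["name"]
        for row in rows
        if row["status"] == "ns" or row["reading"] == "Disabled"
    ]


rows = parse_sdr(SAMPLE)
print(disabled_sensors(rows))
```

Running the same check before and after an upgrade would show which sensors stopped reporting.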

I had a similar problem with Dell boxes when I went from ipmitool v1.8.8 to v1.8.9 - see the thread starting at:

<http://www.mail-archive.com/ipmitool-devel@xxxxxxxxxxxxxxxxxxxxx/msg00468.html>

It looks like the patch for ipmitool in the CentOS 4.5 OpenIPMI SRPM, i.e. ipmitool-1.8.8-disabled-sensor.patch, is the cause of this issue ... the comment in the changelog is:

- Added patch to fix sensors problems on Woodcrest (#228679)

I guess you could rebuild OpenIPMI without that patch.


Thanks for the input James.

That does seem a similar problem, but it's specific to those Intel chipsets,
by the looks. The SC1435s where I'm seeing the problem are AMDs.

Another interesting datapoint I've discovered is that the versions of OpenIPMI only changed at the release level:

  CentOS 4.4: 1.4.14-1.4E.13
  CentOS 4.5: 1.4.14-1.4E.17

so I'm starting to wonder if it's perhaps a kernel change.
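As a small illustration of that comparison, the sketch below splits the two package strings into their upstream version and package release parts. It assumes the usual "version-release" layout with a single hyphen; the helper name is hypothetical. An unchanged version with a changed release means the difference lies in the packaging (for example added patches) rather than in a new upstream release.

```python
def split_version_release(value):
    """Split 'version-release' at the first hyphen (assumed layout)."""
    version, _, release = value.partition("-")
    return version, release


old = split_version_release("1.4.14-1.4E.13")  # CentOS 4.4
new = split_version_release("1.4.14-1.4E.17")  # CentOS 4.5

same_upstream = old[0] == new[0]
release_changed = old[1] != new[1]
print(same_upstream, release_changed)
```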

In addition, I've now verified that the sensors are behaving similarly
on CentOS 5.


The ipmitool-1.8.8-disabled-sensor.patch may well be there to fix Woodcrest-specific issues - but it also removes part of the code that affects temperature readings on (some?) Dells ...

I'm not an expert on IPMI, but the code that patch removed looks a bit hacky (maybe that is why it was removed?). However, one side effect of removing it is to prevent some temperature readings on SC1435s and maybe other Dell hardware. I have no idea if the 'real' issue is with ipmitool or the Dell hardware.

However, if you rebuild OpenIPMI without that patch, then ipmitool will work as before when reading temperatures on SC1435s.

It is not a kernel issue - you get the same problem using ipmitool talking over the lanplus interface (which goes nowhere near the kernel).
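To repeat that comparison, the sketch below builds the two ipmitool command lines: one through the local kernel driver (-I open) and one over the network (-I lanplus). The -I, -H and -U options are standard ipmitool options; the host name and user name are placeholders, and the helper itself is hypothetical. With no password option given, ipmitool should prompt for one.

```python
def sdr_temperature_cmd(interface="open", host=None, user=None):
    """Build an argv list for 'ipmitool sdr type Temperature' (illustrative helper)."""
    cmd = ["ipmitool", "-I", interface]
    if interface == "lanplus":
        if not host:
            raise ValueError("lanplus needs the address of the BMC")
        cmd += ["-H", host]
        if user:
            cmd += ["-U", user]
    cmd += ["sdr", "type", "Temperature"]
    return cmd


# Local access through the kernel IPMI driver.
local_cmd = sdr_temperature_cmd()
# Remote access over the LAN; host and user are placeholders.
remote_cmd = sdr_temperature_cmd("lanplus", host="bmc.example.com", user="admin")

print(" ".join(local_cmd))
print(" ".join(remote_cmd))
```

If both commands show the same sensors as Disabled, the local kernel driver is unlikely to be the cause.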

The simple workaround is to use ipmitool from the CentOS 4.4 RPM (as the only changes to ipmitool between 4.4 and 4.5 were the Woodcrest fixes).

I have created an updated SRPM which reverses the part of the Woodcrest fixes that affects these Dells - if you are interested, the SRPM is at:

<ftp://ftp.moving-picture.com/private/OpenIPMI-1.4.14-1.4E.17a.src.rpm>

I haven't used CentOS 5 in anger (yet), but I guess the issue is the same.

James Pearson