Hi,

When I run 'modprobe coretemp' on a system installed with RHEL6 (which is based on linux-2.6.32), I see "Unable to access MSR 0xEE, for Tjmax, left at default" warnings, one for each CPU. Is this what I should expect from the system I'm running on? Some of the (hopefully) relevant details can be found in the attached file.

Thanks,
Dean
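The failing read can probably also be tried from user space, without patching the driver. The following is only a sketch, under the assumption that the msr module is loaded (modprobe msr) and that it is run as root; read_msr and its path_template parameter are names made up for this illustration. The msr driver exposes each CPU's registers as /dev/cpu/N/msr, where an 8-byte read at the offset equal to the register number returns the register contents, and a read the CPU rejects comes back as EIO.

```python
import errno
import os
import struct


def read_msr(cpu, reg, path_template="/dev/cpu/{cpu}/msr"):
    """Read one 64-bit MSR through the msr driver.

    Returns the value as an integer, or None if the read fails with EIO
    (which is how the driver reports a register the CPU refuses to read).
    path_template is a hypothetical knob so the function can be pointed
    at another file for testing.
    """
    fd = os.open(path_template.format(cpu=cpu), os.O_RDONLY)
    try:
        # The file offset selects the register; one register is 8 bytes.
        raw = os.pread(fd, 8, reg)
    except OSError as exc:
        if exc.errno == errno.EIO:
            return None
        raise
    finally:
        os.close(fd)
    # Little-endian 64-bit value: low half is eax, high half is edx.
    return struct.unpack("<Q", raw)[0]
```

With this, read_msr(0, 0xEE) returning None on CPU 0 would correspond to the err=-5 case, while read_msr(0, 0x19C) should return a value.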
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[root@dell-per710-01 ~]# modprobe coretemp
[root@dell-per710-01 ~]# sensors -s
[root@dell-per710-01 ~]# dmesg | tail -20
bnx2: eth1 NIC Copper Link is Up, 1000 Mbps full duplex
ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
eth2: no IPv6 routers present
eth1: no IPv6 routers present
coretemp coretemp.0: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.1: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.2: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.3: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.4: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.5: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.6: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.7: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.8: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.9: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.10: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.11: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.12: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.13: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.14: Unable to access MSR 0xEE, for Tjmax, left at default
coretemp coretemp.15: Unable to access MSR 0xEE, for Tjmax, left at default
[root@dell-per710-01 test]# sensors
coretemp-isa-0000
Adapter: ISA adapter
Core 0: +38.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Core 1: +37.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-0002
Adapter: ISA adapter
Core 2: +35.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-0003
Adapter: ISA adapter
Core 3: +39.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-0004
Adapter: ISA adapter
Core 4: +39.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-0005
Adapter: ISA adapter
Core 5: +39.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-0006
Adapter: ISA adapter
Core 6: +36.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-0007
Adapter: ISA adapter
Core 7: +35.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-0008
Adapter: ISA adapter
Core 8: +38.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-0009
Adapter: ISA adapter
Core 9: +37.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-000a
Adapter: ISA adapter
Core 10: +35.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-000b
Adapter: ISA adapter
Core 11: +39.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-000c
Adapter: ISA adapter
Core 12: +39.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-000d
Adapter: ISA adapter
Core 13: +39.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-000e
Adapter: ISA adapter
Core 14: +35.0°C (high = +90.0°C, crit = +100.0°C)

coretemp-isa-000f
Adapter: ISA adapter
Core 15: +35.0°C (high = +90.0°C, crit = +100.0°C)

[root@dell-per710-01 test]#
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
The system I'm running on has 16 CPUs, all much like the following one:

[root@dell-per710-01 ~]# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU E5530 @ 2.40GHz
stepping : 5
cpu MHz : 2393.589
cache size : 8192 KB
physical id : 1
siblings : 8
core id : 0
cpu cores : 4
apicid : 16
initial apicid : 16
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
bogomips : 4787.17
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
<snip>
[root@dell-per710-01 ~]#
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[root@dell-per710-01 ~]# sensors-detect
# sensors-detect revision 1.1
# System: Dell Inc. PowerEdge R710
# Board: Dell Inc. 0M233H

This program will help you determine which kernel modules you need
to load to use lm_sensors most effectively. It is generally safe
and recommended to accept the default answers to all questions,
unless you know what you're doing.

Some south bridges, CPUs or memory controllers contain embedded sensors.
Do you want to scan for them? This is totally safe. (YES/no):
Silicon Integrated Systems SIS5595... No
VIA VT82C686 Integrated Sensors... No
VIA VT8231 Integrated Sensors... No
AMD K8 thermal sensors... No
AMD Family 11h thermal sensors... No
Intel Core family thermal sensor... Success!
    (driver `coretemp')
Intel AMB FB-DIMM thermal sensor... No
VIA C7 thermal and voltage sensors... No

Some Super I/O chips contain embedded sensors. We have to write to
standard I/O ports to probe them. This is usually safe.
Do you want to scan for Super I/O sensors? (YES/no): no

Some systems (mainly servers) implement IPMI, a set of common interfaces
through which system health data may be retrieved, amongst other things.
We first try to get the information from SMBIOS. If we don't find it
there, we have to read from arbitrary I/O ports to probe for such
interfaces. This is normally safe. Do you want to scan for IPMI
interfaces? (YES/no): no

Some hardware monitoring chips are accessible through the ISA I/O ports.
We have to write to arbitrary I/O ports to probe them. This is usually
safe though. Yes, you do have ISA I/O ports even if you do not have any
ISA slots! Do you want to scan the ISA I/O ports? (YES/no): no

Lastly, we can probe the I2C/SMBus adapters for connected hardware
monitoring devices. This is the most risky part, and while it works
reasonably well on most systems, it has been reported to cause trouble
on some systems.
Do you want to probe the I2C/SMBus adapters now? (YES/no): no

Now follows a summary of the probes I have just done.
Just press ENTER to continue:

Driver `coretemp':
  * Chip `Intel Core family thermal sensor' (confidence: 9)

Do you want to overwrite /etc/sysconfig/lm_sensors? (YES/no): no
To load everything that is needed, add this to one of the system
initialization scripts (e.g. /etc/rc.d/rc.local):

#----cut here----
# Chip drivers
modprobe coretemp
/usr/bin/sensors -s
#----cut here----

If you have some drivers built into your kernel, the list above will
contain too many modules. Skip the appropriate ones! You really
should try these commands right now to make sure everything is
working properly. Monitoring programs won't work until the needed
modules are loaded.

[root@dell-per710-01 ~]#
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
I also reran this after adding a debug printk() following each call to rdmsr_safe_on_cpu() in coretemp.c, and after setting eax and edx to 0 before each call. The following is what I pulled from dmesg after running 'modprobe coretemp'. Notice that for MSR 0xee, rdmsr_safe_on_cpu() returns -5 (-EIO). Are the values eax=0x200200 and edx=0x100100 meaningful (aside from somewhat resembling LIST_POISON2 and LIST_POISON1)? Also notice that for CPU 11, eax=0x246 and edx=0xa655c68.

debug: rdmsr_safe_on_cpu(id=0x0, 0x19c,) err=0 eax=0x883e0008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x0, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x0, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.0: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0x0, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x1, 0x19c,) err=0 eax=0x883e0008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x1, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x1, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.1: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0x1, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x2, 0x19c,) err=0 eax=0x88400008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x2, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x2, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.2: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0x2, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x3, 0x19c,) err=0 eax=0x883d0008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x3, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x3, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.3: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0x3, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x4, 0x19c,) err=0 eax=0x883d0008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x4, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x4, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.4: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0x4, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x5, 0x19c,) err=0 eax=0x883d0008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x5, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x5, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.5: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0x5, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x6, 0x19c,) err=0 eax=0x88400008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x6, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x6, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.6: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0x6, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x7, 0x19c,) err=0 eax=0x88410008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x7, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x7, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.7: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0x7, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x8, 0x19c,) err=0 eax=0x883d0008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x8, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x8, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.8: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0x8, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x9, 0x19c,) err=0 eax=0x883e0008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x9, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0x9, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.9: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0x9, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xa, 0x19c,) err=0 eax=0x88400008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xa, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xa, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.10: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0xa, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xb, 0x19c,) err=0 eax=0x883d0008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xb, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xb, 0xee,) err=-5 eax=0x246 edx=0xa655c68
coretemp coretemp.11: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0xb, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xc, 0x19c,) err=0 eax=0x883d0008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xc, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xc, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.12: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0xc, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xd, 0x19c,) err=0 eax=0x883d0008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xd, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xd, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.13: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0xd, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xe, 0x19c,) err=0 eax=0x88400008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xe, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xe, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.14: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0xe, 0x1a2,) err=0 eax=0x610a00 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xf, 0x19c,) err=0 eax=0x88410008 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xf, 0x17,) err=0 eax=0x0 edx=0x0
debug: rdmsr_safe_on_cpu(id=0xf, 0xee,) err=-5 eax=0x200200 edx=0x100100
coretemp coretemp.15: Unable to access MSR 0xEE, for Tjmax, left at default
debug: rdmsr_safe_on_cpu(id=0xf, 0x1a2,) err=0 eax=0x610a00 edx=0x0
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
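For reading the debug lines above, a small decoder may help. This is a minimal sketch and not part of the driver: parse_line, therm_status_to_celsius and describe are names invented here, and the Tjmax of 100 degrees is an assumption taken from the "left at default" message and the crit = +100.0°C that sensors prints. The bit layout used for MSR 0x19c (bit 31 marks a valid reading, bits 22:16 give degrees below Tjmax) follows Intel's description of IA32_THERM_STATUS.

```python
import errno
import re

# Matches the debug lines shown above, for example:
# "debug: rdmsr_safe_on_cpu(id=0x0, 0x19c,) err=0 eax=0x883e0008 edx=0x0"
LINE_RE = re.compile(
    r"rdmsr_safe_on_cpu\(id=(0x[0-9a-f]+), (0x[0-9a-f]+),\) "
    r"err=(-?\d+) eax=(0x[0-9a-f]+) edx=(0x[0-9a-f]+)"
)

# Assumption: the driver default, consistent with crit = +100.0 C above.
ASSUMED_TJMAX_C = 100


def parse_line(line):
    """Return (cpu, msr, err, eax, edx) as integers, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    cpu, msr, err, eax, edx = m.groups()
    return int(cpu, 16), int(msr, 16), int(err), int(eax, 16), int(edx, 16)


def therm_status_to_celsius(eax, tjmax_c=ASSUMED_TJMAX_C):
    """Decode MSR 0x19c: bit 31 = reading valid, bits 22:16 = degrees below Tjmax."""
    if not eax & 0x80000000:
        return None
    return tjmax_c - ((eax >> 16) & 0x7F)


def describe(line):
    """Summarize one debug line; returns None for lines that do not match."""
    parsed = parse_line(line)
    if parsed is None:
        return None
    cpu, msr, err, eax, edx = parsed
    if err:
        # The kernel returns negative errno values, so -5 maps to EIO.
        name = errno.errorcode.get(-err, "unknown")
        return f"cpu {cpu}: MSR {msr:#x} read failed ({name})"
    if msr == 0x19C:
        return f"cpu {cpu}: {therm_status_to_celsius(eax)} C"
    return f"cpu {cpu}: MSR {msr:#x} = {(edx << 32) | eax:#x}"
```

Fed the first line of the log, describe() reports 38 C for cpu 0, which is in line with the +38.0°C that sensors printed for Core 0. The log and the sensors output were captured at different times, so not every core agrees to the degree.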
_______________________________________________
lm-sensors mailing list
lm-sensors@xxxxxxxxxxxxxx
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors