Re: Core2Quad and very high temperature

On Wed, Sep 26, 2012 at 06:35:21PM +0200, Jean Delvare wrote:
> Hello Markus,
> 
> On Wed, 26 Sep 2012 12:41:52 +0200 (CEST), rupprecht-admin2, Markus wrote:
> > I'm the admin of a school. Our server has an Intel DG33FB mainboard
> > (it's Intel G33) with an Intel Core2Quad processor.
> > 
> > We are running Debian Lenny with kernel 2.6.35.9.
> > 
> > While doing a backup with Clonezilla, a message appeared saying that the CPU
> > temperature was too high and that the CPU would run slower.
> > After restarting the server I plotted the four cores in the KDE system
> > monitor (in German it's called KDE Systemueberwachung).
> > It shows -0,77 and -0,77 and 100 and 95 °C. I guess the decimal numbers
> > mean a value above 100 °C.
> 
> It was more likely an error code which wasn't caught by the software.
> Given the output of "sensors" below, this seems more plausible. The
> coretemp driver can't report a temperature value above the critical
> limit, it is technically not possible.
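
To illustrate why a reading above the critical limit cannot occur, here is a
minimal Python sketch of how the temperature is derived from the thermal status
register. It assumes the IA32_THERM_STATUS layout (bit 31 = reading valid,
bits 22:16 = degrees below TjMax) and a TjMax of 100 C, as implied by the crit
value in the "sensors" output. decode_therm_status is a hypothetical helper
for illustration, not a function of lm-sensors or the kernel.

```python
READING_VALID = 1 << 31   # bit 31: the digital readout is valid
READOUT_SHIFT = 16        # bits 22:16: degrees C below TjMax
READOUT_MASK = 0x7F


def decode_therm_status(raw, tjmax_c=100):
    """Return the core temperature in degrees C, or None if the readout is invalid.

    tjmax_c=100 is an assumption taken from the crit value reported by sensors.
    """
    if not raw & READING_VALID:
        return None  # no valid reading available
    below_tjmax = (raw >> READOUT_SHIFT) & READOUT_MASK
    # The readout is an unsigned distance below TjMax, so the result
    # can never be higher than TjMax itself.
    return tjmax_c - below_tjmax


if __name__ == "__main__":
    print(decode_therm_status(READING_VALID | (4 << READOUT_SHIFT)))
    print(decode_therm_status(READING_VALID))
    print(decode_therm_status(0))
```

Under these assumptions a readout of 0 corresponds to exactly TjMax, and a
cleared valid bit yields no temperature at all, which an application might
then display as a bogus number.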
> 
> > Now I'm unsure, if this could be right.
> > 
> > So I did sensors-detect
> > 12:26/0 server ~ # sensors-detect
> > # sensors-detect revision 5249 (2008-05-11 22:56:25 +0200)
> 
> You do realize this is 4.5 years old, right? Not as old as your board,
> but still, using a more recent version may help:
>   http://dl.lm-sensors.org/lm-sensors/files/sensors-detect
> 
> (Seems to be down right now, you'll have to check later.)
> 
> > Now follows a summary of the probes I have just done.
> > Just press ENTER to continue:
> > 
> > Driver `coretemp' (should be inserted):
> >   Detects correctly:
> >   * Chip `Intel Core family thermal sensor' (confidence: 9)
> > (...)
> > I checked: coretemp is built in.
> > I googled "PC8374L" and found that I have to use lm85.
> 
> You would if the chip had monitoring enabled, but that's not the case.
> So you can forget about the lm85 driver. This is unfortunate because
> this would have given us a point of comparison. On several Intel boards
> of that era, monitoring was implemented in a way Linux doesn't support.
> 
> Does the BIOS display temperatures and/or other monitoring values?
> 
> > 12:28/1 server ~ # sensors
> > coretemp-isa-0000
> > Adapter: ISA adapter
> > ERROR: Can't get value of subfeature temp1_input: Can't read
> > Core 0:       +0.0 C  (high = +84.0 C, crit = +100.0 C)  ALARM
> > 
> > coretemp-isa-0001
> > Adapter: ISA adapter
> > Core 2:     +100.0 C  (high = +84.0 C, crit = +100.0 C)  ALARM
> > 
> > coretemp-isa-0002
> > Adapter: ISA adapter
> > ERROR: Can't get value of subfeature temp1_input: Can't read
> > Core 1:       +0.0 C  (high = +84.0 C, crit = +100.0 C)  ALARM
> > 
> > coretemp-isa-0003
> > Adapter: ISA adapter
> > Core 3:      +96.0 C  (high = +84.0 C, crit = +100.0 C)
> 
> The errors for cores 0 and 1 are worrisome. We've seen these a couple
> of times in the past, but could never explain or fix them.
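
For anyone who wants to watch for this state automatically, the following is a
rough Python sketch that classifies each core from "sensors" output in the
format quoted above. The function name summarize and the status labels are
made up for illustration, and the sketch assumes that an "ERROR: Can't get
value of subfeature temp1_input" line always refers to the core line that
follows it, as it does in this output.

```python
import re

NUM = r"([+-]?\d+(?:\.\d+)?)"
# Matches e.g. "Core 2:  +100.0 C  (high = +84.0 C, crit = +100.0 C)  ALARM"
CORE_LINE = re.compile(
    r"^(Core \d+):\s+" + NUM + r"\s*°?C\s+"
    r"\(high = " + NUM + r"\s*°?C, crit = " + NUM + r"\s*°?C\)"
)


def summarize(sensors_output):
    """Map each core label to 'unreadable', 'critical', 'high' or 'ok'."""
    results = {}
    read_error = False
    for line in sensors_output.splitlines():
        line = line.strip()
        if line.startswith("ERROR:") and "temp1_input" in line:
            read_error = True  # assumed to apply to the next core line
            continue
        match = CORE_LINE.match(line)
        if not match:
            continue
        label = match.group(1)
        temp, high, crit = (float(match.group(i)) for i in (2, 3, 4))
        if read_error:
            status = "unreadable"  # the printed +0.0 is a placeholder
        elif temp >= crit:
            status = "critical"
        elif temp >= high:
            status = "high"
        else:
            status = "ok"
        results[label] = status
        read_error = False
    return results


if __name__ == "__main__":
    import subprocess
    output = subprocess.run(["sensors"], capture_output=True, text=True).stdout
    for core, status in summarize(output).items():
        print(core, status)
```

Newer versions of "sensors" format their output differently, so the pattern
would likely need adjusting there.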
> 
> > Can this be true? There is a very large cooler with heatpipes and a large
> > fan on the CPU. When I touch it, it is not hot. It seems to be mounted OK.
> 
> You did not tell us what exact CPU model your machine has. Different
> models can have very different max TDP values.
> 
> The fact that the heatsink is not hot isn't necessarily a good thing.
> The heat is generated by the CPU and is then expected to dissipate to
> the heatsink, where the fan will extract it, and if the case is
> properly designed, the heat goes outside of the system.
> 
> A cold heatsink can mean that the fan is doing a very good job. But it
> can also mean that the dissipation from the CPU to the heatsink doesn't
> happen, either because of insufficient/bad thermal paste, or because the
> heatsink is improperly mounted.
> 
> The fact that you got error messages related to CPU throttling suggests
> the problem is "real", i.e. not a coretemp driver issue. That being
> said, the CPU throttling code is reading its values from the same
> model-specific registers as the coretemp driver, so if these registers
> are somehow busted in your CPU, both will misbehave.
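
One way to look at those registers directly, bypassing both drivers, is the
msr character device. The Python sketch below is only an illustration under
several assumptions: the msr kernel module is loaded, the script runs as root,
the register is IA32_THERM_STATUS at address 0x19C, and bits 0 and 1 are the
thermal status flag and its sticky log. read_msr and throttle_flags are
hypothetical helper names.

```python
import os
import struct

IA32_THERM_STATUS = 0x19C  # assumed address of the per-core thermal status MSR


def read_msr(path, register):
    """Read one 64-bit MSR; the file offset selects the register."""
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.pread(fd, 8, register)
    finally:
        os.close(fd)
    if len(data) != 8:
        raise OSError("short read from " + path)
    return struct.unpack("<Q", data)[0]


def throttle_flags(raw):
    """Decode the two thermal status bits from a raw register value."""
    return {
        "over_limit_now": bool(raw & 0x1),            # bit 0: thermal status
        "over_limit_since_cleared": bool(raw & 0x2),  # bit 1: sticky log
    }


if __name__ == "__main__":
    for cpu in range(os.cpu_count() or 1):
        try:
            raw = read_msr("/dev/cpu/%d/msr" % cpu, IA32_THERM_STATUS)
        except OSError as exc:
            print("cpu%d: cannot read MSR (%s)" % (cpu, exc))
            continue
        print("cpu%d: raw=0x%016x %s" % (cpu, raw, throttle_flags(raw)))
```

If the raw values look implausible here as well, that would point at the CPU
rather than at any one driver.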
> 
> You may want to try the latest coretemp driver:
>   http://khali.linux-fr.org/devel/misc/coretemp/
> I'm not holding my breath though. Another thing worth trying is a live
> DVD using a more recent kernel.
> 
> But I think that either you have a real overheating problem (check your
> thermal paste and heatsink mounting) or your CPU got somehow damaged.
> 
My guess is that it is an overheating problem. Of course, with the CPU running
that hot, it might well be damaged by now as well.

Guenter
