On Sat, Nov 15, 2008 at 08:13:24PM +0200, Rudi Ahlers wrote:
> On Sat, Nov 15, 2008 at 7:26 PM, Vandaman <vandaman2002-sk@xxxxxxxxxxx> wrote:
> > Rudi Ahlers wrote:
> >
> >> We have a server which locks up about once a week (for the
> >> past 3
......
> >> How do I debug the server, which runs CentOS 5.2 to see why
> >> it locks
> >> up?

Jumping in the middle of a long list of good ideas.
Other things to try --

Change the run level: if 5, switch to 3; if 3, switch to 5.

Reinstall the processor:
    remove the processor,
    clean the heat sink and processor of thermal compound,
    correctly apply the best thermal grease you can get
    (I like Arctic Silver),
    reinstall the heat sink,
    consider upgrading the processor heat sink if the chassis
    permits (more Cu is good).

Add thermal spreaders to your RAM. You want all the chips on a
RAM stick at the same temp.

Run "chkconfig cpuspeed off" if it is on (powersaved on some distros);
if it is off, toggle it on.

Turn off any special system monitoring software tools.
Things like I2C serial buses do not isolate simple read-only
activity from things that might modify (shut down) the system.
I have seen sites install bluesmoke tools when the kernel already
had EDAC installed. The two tools had overlapping, uncoordinated
interactions with the hardware and would randomly shut down the system.
Very new boards are almost never supported well, so consider
going blind, that is, running without the monitoring tools.
Read the EDAC info on the CentOS and RH sites.

Inspect, then tidy, all cables; they can mess up air flow and cause
thermal issues.

Reset the BIOS and check all the BIOS options.
Check for a BIOS update from the vendor.
When updating the BIOS, do an NVRAM reset. The data structures
of the old BIOS and the new one may differ. The keyboard sequence to
reset a BIOS to all defaults may require a call to tech support.
Call the vendor.. you have a warranty on a new board.

Since a hardware tty is not possible, log in (ssh) and run
a "while /bin/true" script that lets you see memory, processes
and the exact time things fail, or just run "top".
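
A rough sketch of what such a "while /bin/true" script could look like, offered only as one possibility. The function names (stamp, snapshot, watch_loop), the INTERVAL variable, its 10 second default and the choice of what to print are all my assumptions and not anything from the original poster's setup. It assumes a Linux /proc filesystem and the procps "ps".

```shell
#!/bin/sh
# Sketch only: stamp, snapshot, watch_loop and INTERVAL are made-up names.

# Print the current time; the last stamp left on the ssh terminal
# brackets the moment of the lockup.
stamp() {
    date '+%Y-%m-%d %H:%M:%S'
}

# Print one snapshot: time, load average, free memory and swap,
# and the processes using the most CPU.
snapshot() {
    echo "=== $(stamp) ==="
    cat /proc/loadavg
    grep -E '^(MemFree|SwapFree):' /proc/meminfo
    # procps syntax; header line plus the top five CPU consumers.
    ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -n 6
}

# Repeat forever, as in the "while /bin/true" idea.
# INTERVAL is the number of seconds between snapshots (assumed default: 10).
watch_loop() {
    while /bin/true; do
        snapshot
        sleep "${INTERVAL:-10}"
    done
}
```

To use it, source the file in the ssh session and call watch_loop (for example "INTERVAL=5 watch_loop"). When the server hangs, the last snapshot still visible on the remote terminal gives the approximate time of failure and the state just before it.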

It is possible to have syslog also log to the pty of an ssh session.

When you return to the cage, plug in a terminal. If there is no screen
saver or screen blanking, the GFX card may still display the last key
bits of info, so long as X is not running.

-- 
	T o m  M i t c h e l l
	Found me a new hat, now what?