I've had a couple of complete system hangs on two different systems
in the last couple of weeks. The first time it happened I shrugged
it off, but now that it has happened a second time on a different
system I need to investigate this a bit further. All the systems are
identical: IBM (Lenovo) ThinkCentre M52 with 3GB of memory, running
CentOS 4.4 with all the recent updates.

The systems are currently running the 2.6.9-42.0.3.EL kernel (IIRC,
the first hang happened with the 2.6.9-42.0.2.EL kernel). When it
occurs, the system is completely hung. Whatever was on the screen at
the time continues to be displayed, but the machine doesn't respond
to pings (so you can't ssh into it), the mouse is frozen, you can't
get to the alternate consoles (Ctrl-Alt-F1), etc. In other words,
the only recourse is a reboot. There is absolutely nothing in
/var/log/messages or any other log files that I could find after the
reboot. In both cases, the users report that they were editing a
file when it happened (they weren't even trying to write out changes
to disk).

Has anybody else seen anything similar to this? Any recommendations on
how to capture all relevant information should this occur again? How
would you debug a problem like this?

Alfred