Re: Root fs suddenly goes r/o

"Stephen John Smoogen" <smooge@xxxxxxxxx> · Wed, 11 Jul 2007 17:18:35 -0600

On 7/11/07, Eduardo Grosclaude <eduardo.grosclaude@xxxxxxxxx> wrote:
Out of the blue, dmesg on my HP Proliant w/ a SCSI disk gives loads of
messages like this one:

 EXT3-fs error (device dm-0) in start_transaction: Journal has aborted

 Then the root fs goes read-only, so little else can be done on the machine.
LVM locks up. After a restart, the fs needs an fsck and another reboot to
recover. The host then starts up OK, and I get a few more minutes before the
problem reappears. This is stock CentOS 4.4; I have never managed to update
it because of this very problem.

 System logs report SCSI I/O errors, but SMART reports no problems, and
neither does badblocks (run from a rescue CD). SCSI cabling, terminator,
etc. have been checked.

 What should I investigate next? Is the disk condemned?

SMART isn't foolproof. I have had disks that made clunking or scraping
noises, or kept spinning up, and still got the SMART seal of approval. My
normal checklist is the one above, but replacing each item rather than just
inspecting it (in case that isn't what you meant by "checked").

Replace:
  terminator
  SCSI cable
  controller
  disk drive

though I usually replace the disk drive before the controller.
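
While swapping parts, it can help to check whether the errors really stop
after each replacement. The following is only a rough sketch in Python, not a
tested tool: it tallies EXT3-fs error lines from saved dmesg output. The
regular expression is modeled on the single sample line quoted above, so the
exact message format on other kernels is an assumption, and the function name
tally_ext3_errors is made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the one sample line quoted in this thread.
# Other kernel versions may word the message differently (assumption).
EXT3_ERROR = re.compile(
    r"EXT3-fs error \(device (?P<device>[^)]+)\)"
    r"(?: in (?P<function>\w+))?: (?P<message>.*)"
)


def tally_ext3_errors(lines):
    """Count EXT3-fs error lines per (device, message) pair."""
    counts = Counter()
    for line in lines:
        match = EXT3_ERROR.search(line)
        if match:
            key = (match.group("device"), match.group("message").strip())
            counts[key] += 1
    return counts


# Demonstration using the line quoted in the original report.
sample = [
    " EXT3-fs error (device dm-0) in start_transaction: Journal has aborted",
    "some unrelated kernel message",
]
for (device, message), n in tally_ext3_errors(sample).most_common():
    print(f"{n:6d}  {device}  {message}")
```

To use it on real data, save the output of dmesg to a file after each
hardware swap and pass the file's lines to the function. If the count for a
device stays at zero for a while, the part just replaced becomes the likely
culprit.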

--
Stephen J Smoogen. -- CSIRT/Linux System Administrator
How far that little candle throws his beams! So shines a good deed
in a naughty world. = Shakespeare. "The Merchant of Venice"