On 3/29/07, Centos-admin <redhat@xxxxxxxxxxx> wrote:
Rasmus Back wrote:
> On 3/29/07, Alfred von Campe <alfred@xxxxxxx> wrote:
>> On Mar 29, 2007, at 6:36, Rasmus Back wrote:
>>
>> > I have a Dell SC440 running Centos 4.4. It has two 500GB disks in a
>> > RAID1 array using linux software raid (md1 is / and md0 is /boot).
>> > Recently the root file system was remounted read-only for some reason.
>> > The logs don't show anything unusual, presumably the file system was
>> > read-only before anything was logged. Running dmesg showed this error
>> > repeated many times:
>> >
>> > EXT3-fs error (device md1) in start_transaction: Journal has aborted
>>
>> I had the exact error 9 months or so ago (look for a similarly titled
>> thread in the archives). It was a disk going bad. Get all the data
>> off you need now and replace the disk ASAP. It may run for a few
>> days/weeks before it gets mounted again read only, but eventually you
>> will lose some data.
>
> Hi Alfred.
>
> Thanks for the pointer! The smart logs for my drives don't show any
> errors but I'll start a long selftest just to be sure. Although if it
> is a failing hard drive then the raid driver should kick it out of the
> array. Your system was a laptop with just one drive, right?
> _______________________________________________
> CentOS mailing list
> CentOS@xxxxxxxxxx
> http://lists.centos.org/mailman/listinfo/centos
>

There is a known bug in the mpt SCSI driver that causes exactly that
behaviour. We got bitten by it running VMware ESX virtual machines with
CentOS 4.4 and RHEL 4.4 in them; ESX uses the mpt driver by default. As
far as I understand it, you could still hit the error even if your box
does not use the RAID. It is explained in the links below.

Here are some useful links:

http://www.tuxyturvy.com/blog/index.php?/archives/31-VMware-ESX-and-ext3-journal-aborts.html
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=197158
http://www.vmware.com/community/thread.jspa?threadID=58121

I have 10 or so heavily used RHEL 4.4 boxes, and at least one of them
would do this at least once a week. I applied the patch and have not
seen the problem again.

Hi Brian,

Thanks a million for the links. My system does use the mpt driver (at
least according to lspci and lsmod), so this would at least explain the
failure.

Do you know if the problem is fixed in RHEL 5? The Red Hat Bugzilla
entry says something was changed in the mpt driver in 2.6.14, but it
wasn't clear whether those changes solved the problem. I might upgrade
to CentOS 5 when it's available anyway.

Rasmus
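
For anyone wanting to check a box for the two symptoms discussed in this thread (repeated "Journal has aborted" messages in dmesg, and an mpt module showing up in lsmod), here is a minimal Python sketch. It only parses text you hand it, for example saved output of `dmesg` and `lsmod`. The function names are made up for illustration, and treating every module whose name starts with "mpt" as part of the mpt driver family is an assumption, not an official list.

```python
import re

# Matches the message quoted in the thread, e.g.:
#   EXT3-fs error (device md1) in start_transaction: Journal has aborted
ABORT_RE = re.compile(r"EXT3-fs error \(device ([^)]+)\).*Journal has aborted")

# Assumption: modules of the mpt driver family have names starting with
# "mpt" (mptbase, mptscsih, mptspi, ...). This is a heuristic only.
MPT_PREFIX = "mpt"


def count_journal_aborts(dmesg_text):
    """Return {device: count} for 'Journal has aborted' lines in dmesg text."""
    counts = {}
    for line in dmesg_text.splitlines():
        match = ABORT_RE.search(line)
        if match:
            device = match.group(1)
            counts[device] = counts.get(device, 0) + 1
    return counts


def loaded_mpt_modules(lsmod_text):
    """Return module names beginning with 'mpt' from lsmod-style text."""
    modules = []
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header row.
        if not fields or fields[0] == "Module":
            continue
        if fields[0].startswith(MPT_PREFIX):
            modules.append(fields[0])
    return modules
```

A hit from both functions only says the machine matches the pattern described above; it does not rule out a failing disk, so the SMART self-test is still worth running.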