Re: EXT3 fs error on RAID1 device

On 3/29/07, Centos-admin <redhat@xxxxxxxxxxx> wrote:
Rasmus Back wrote:
> On 3/29/07, Alfred von Campe <alfred@xxxxxxx> wrote:
>> On Mar 29, 2007, at 6:36, Rasmus Back wrote:
>>
>> > I have a Dell SC440 running Centos 4.4. It has two 500GB disks in a
>> > RAID1 array using linux software raid (md1 is / and md0 is /boot).
>> > Recently the root file system was remounted read-only for some reason.
>> > The logs don't show anything unusual, presumably the file system was
>> > read-only before anything was logged. Running dmesg showed this error
>> > repeated many times:
>> >
>> > EXT3-fs error (device md1) in start_transaction: Journal has aborted
>>
>> I had the exact error 9 months or so ago (look for a similarly titled
>> thread in the archives).  It was a disk going bad.  Get all the data
>> off you need now and replace the disk ASAP.  It may run for a few
>> days/weeks before it gets mounted again read only, but eventually you
>> will lose some data.
>
> Hi Alfred.
>
> Thanks for the pointer! The smart logs for my drives don't show any
> errors but I'll start a long selftest just to be sure. Although if it
> is a failing hard drive then the raid driver should kick it out of the
> array. Your system was a laptop with just one drive, right?
> _______________________________________________
> CentOS mailing list
> CentOS@xxxxxxxxxx
> http://lists.centos.org/mailman/listinfo/centos
>
There is a known bug in the mpt SCSI driver that causes exactly that
behaviour. We got bitten by it running VMware ESX virtual machines with
CentOS 4.4 and RHEL 4.4 in them. ESX uses the mpt driver by default.
As far as I understand it, you could still get the error even if your
box does not use the RAID. It is explained in the links below.




Here are some useful links:

http://www.tuxyturvy.com/blog/index.php?/archives/31-VMware-ESX-and-ext3-journal-aborts.html

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=197158

http://www.vmware.com/community/thread.jspa?threadID=58121



I have 10 or so heavily used RHEL 4.4 boxes, and at least one of them
would do this at least once a week. I applied the patch and have not
seen the problem again.


Hi Brian,

Thanks a million for the links. My system does use the mpt driver (at
least according to lspci and lsmod), which would at least explain the
failure. Do you know if the problem is fixed in RHEL 5? The Red Hat
Bugzilla entry said that something was changed in the mpt driver in
2.6.14, but it wasn't clear whether those changes solved the problem.
I might upgrade to CentOS 5 when it's available anyway.

Rasmus
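
For anyone wanting to repeat the two checks discussed in this thread (how often the journal abort shows up in dmesg, and whether an mpt module is loaded according to lsmod), here is a minimal sketch. It assumes Python 3 is available and that the dmesg and lsmod output has already been captured as text. The function names are hypothetical, and matching module names on the "mpt" prefix is an assumption for illustration, not something the bug report specifies.

```python
import re

# Matches the message quoted in this thread; the device name varies per system.
ABORT_RE = re.compile(r"EXT3-fs error \(device ([^)]+)\).*Journal has aborted")


def count_journal_aborts(dmesg_text):
    """Return {device: count} for journal-abort lines found in dmesg output."""
    counts = {}
    for line in dmesg_text.splitlines():
        match = ABORT_RE.search(line)
        if match:
            device = match.group(1)
            counts[device] = counts.get(device, 0) + 1
    return counts


def loaded_mpt_modules(lsmod_text):
    """Return loaded module names that start with 'mpt' (assumed prefix)."""
    names = []
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header row.
        if not fields or fields[0] == "Module":
            continue
        if fields[0].startswith("mpt"):
            names.append(fields[0])
    return names
```

To use it, save the output of `dmesg` and `lsmod` to files (or capture it with `subprocess.run`) and pass the text to the two functions. A non-empty result from both only suggests that the system matches the symptoms described above; it does not confirm the driver bug or rule out a failing disk.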