Re: ext3 errors (md device related?)




Ross S. W. Walker wrote:

Back to this problem again. I did a new mkfs.ext3 and ran more than a week before hitting this again:

Mar 14 04:12:29 linbackup1 kernel: md3: rw=0, want=14439505280, limit=1465143808
Mar 14 04:12:29 linbackup1 kernel: EXT3-fs error (device md3): ext3_readdir: directory #34079247 contains a hole at offset 0
Mar 14 04:12:29 linbackup1 kernel: Aborting journal on device md3.
Mar 14 04:12:29 linbackup1 kernel: md3: rw=0, want=5260961472, limit=1465143808
Mar 14 04:12:29 linbackup1 kernel: EXT3-fs error (device md3): ext3_readdir: directory #34079247 contains a hole at offset 4096
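To put the numbers in those md3 lines in perspective, here is a minimal Python sketch that parses one of them and converts the values to sizes. It assumes the kernel reports want and limit in 512-byte sectors, which is my understanding of this message but is not stated in the log itself. The helper name parse_beyond_end is made up for illustration.

```python
import re

# Assumption: want/limit are counted in 512-byte sectors.
SECTOR_BYTES = 512

# Log line copied from the report above.
LINE = ("Mar 14 04:12:29 linbackup1 kernel: md3: rw=0, "
        "want=14439505280, limit=1465143808")


def parse_beyond_end(line):
    """Return (device, want, limit) from an 'rw=, want=, limit=' line, or None."""
    m = re.search(r"(\w+): rw=\d+, want=(\d+), limit=(\d+)", line)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), int(m.group(3))


dev, want, limit = parse_beyond_end(LINE)
device_gib = limit * SECTOR_BYTES / 2**30   # size of the md device
request_gib = want * SECTOR_BYTES / 2**30   # end of the requested read

print(f"{dev}: device is about {device_gib:.1f} GiB")
print(f"{dev}: read would end at about {request_gib:.1f} GiB")
print(f"{dev}: request is {want / limit:.1f}x past the device size")
```

Under that assumption the device is roughly 699 GiB while the read asks for a block several TiB in, so the block number handed to the md layer was already bogus by the time the request was issued.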

I don't see any hardware-related errors, and the rest of the filesystems all seem fine, although this is the one that is busy.

Is your memory ECC? If not then a memory problem can fly under the radar.

dmidecode says single-bit ECC


Can this be related to being on a 3-member RAID1 that normally runs with one device missing? I've run a different one that way for a couple of years on earlier kernels.

I haven't seen any other dm-raid problems, and dm-raid is quite mature
at this point. I won't say it isn't possible. Can you try running with
just 2 drives for a while after this fsck and see if it happens again?

I normally run with only 2. I add the 3rd once a week long enough to sync, then unmount the partition long enough to fail and remove the 3rd, then rotate it offsite. The times it has had problems, there have only been 2 active partitions.

Will it hurt anything to mount the underlying partition of one of the drives directly for a while instead of using the md device?

I don't know. Depends how dm-raid keeps its bitmap and meta-data. If
it's at the end then it should work; if it's at the beginning, then
you'd have to offset the mount (carefully).
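As a rough illustration of the "at the end" case: the thread does not say which md metadata format this array uses, so the sketch below assumes the old 0.90 format, where the superblock sits in the last 64 KiB-aligned 64 KiB block of the member partition and the data starts at sector 0. The function name is made up; the arithmetic follows my reading of the kernel's MD_NEW_SIZE_SECTORS macro.

```python
# Assumption: md 0.90 metadata, sizes in 512-byte sectors.
MD_RESERVED_SECTORS = 128  # 64 KiB reserved for the superblock


def md_090_superblock_sector(partition_sectors):
    """Sector where a 0.90 superblock would start on a member partition.

    Rounds the partition size down to a 64 KiB boundary, then steps
    back one more 64 KiB block. Everything before that sector is data.
    """
    aligned = partition_sectors & ~(MD_RESERVED_SECTORS - 1)
    return aligned - MD_RESERVED_SECTORS


# Hypothetical partition sizes, only to show the rounding.
for sectors in (1000, 1024, 1465143936):
    print(sectors, "->", md_090_superblock_sector(sectors))
```

If that assumption holds, the filesystem begins at the very start of the partition and no mount offset is needed. With a format that keeps its metadata at the front, the data would begin at a nonzero offset instead.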

You will need to be very careful when messing with the partition table
to change its type, and again if you recreate the RAID1 with existing
data on it (I don't have a procedure for that).

I can mount the underlying partition without changing its type and it appears to work. I do that regularly to test the offsite copy, but I have always wiped it later with a new sync from the live set, so I don't know whether using it as an md device after that does any harm.

--
  Les Mikesell
   lesmikesell@xxxxxxxxx
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos
