On Mon, Sep 10, 2012 at 8:48 AM, Terry <td3201@xxxxxxxxx> wrote:
> On Sun, Sep 9, 2012 at 10:18 PM, Terry <td3201@xxxxxxxxx> wrote:
>> On Sun, Sep 9, 2012 at 9:53 PM, Terry <td3201@xxxxxxxxx> wrote:
>>> On Sun, Sep 9, 2012 at 9:47 PM, Theodore Ts'o <tytso@xxxxxxx> wrote:
>>>> On Sun, Sep 09, 2012 at 09:34:10PM -0500, Terry wrote:
>>>>>
>>>>> As the subject says, we have a 15 TB fsck drive that won't mount with
>>>>> these errors:
>>>>>
>>>>> Sep 9 20:02:20 narf kernel: EXT4-fs (dm-9): ext4_check_descriptors:
>>>>> Inode bitmap for group 3200 not in group (block 4161027887)!
>>>>> Sep 9 20:02:20 narf kernel: EXT4-fs (dm-9): group descriptors corrupted!
>>>>
>>>> These indicate a very basic file system corruption where the block
>>>> group descriptors are corrupted. E2fsck will complain immediately
>>>> upon seeing this sort of fs inconsistency, and the first thing it will
>>>> try to do is fix it.
>>>>
>>>>> We did a proactive fsck on Tuesday of last week because it was
>>>>> starting to give filesystem errors. It ran through and mounted fine.
>>>>>
>>>>> The filesystem lives on an equallogic SAN spread across 36 drives.
>>>>> Could this be something with the physical layer or is it not abnormal
>>>>> to have to run multiple rounds of fsck to fully fix an issue?
>>>>
>>>> This is most probably a hardware problem; normally e2fsck will fix
>>>> file system corruptions (and certainly problems such as corrupt block
>>>> group descriptors) in a single pass. If e2fsck finished and the file
>>>> system mounted fine last week, and now you're getting this kind of
>>>> error, it basically screams some kind of physical layer problem, or
>>>> perhaps a bad hard drive, or perhaps the SAN disk is getting
>>>> incorrectly written to by some other system, etc.
>>>>
>>>> - Ted
>>>
>>> Thanks for the reply. It is part of a RHEL cluster but we did not
>>> have any situations where multiple systems mounted the filesystem. It
>>> is an old SAN so perhaps we have a physical issue. We'll see what
>>> happens with this pass.
>>
>> While I am waiting for fsck to finish, another thought. This
>> filesystem contains a lot of small files. 35,867,642 files to be
>> exact. Anything else I should check or know to ensure a smooth
>> operation for these types of filesystems? I formatted them with
>> standard RHEL 6 options.
>
> FSCK completed fixing a lot of things. The file system then mounted
> without any errors. We are still getting these types of errors in
> /var/log/messages:
>
> Sep 10 08:40:49 narf kernel: EXT4-fs error (device dm-6):
> ext4_dx_find_entry: bad entry in directory #743966900: directory entry
> across blocks - block=2975876794offset=0(946176), inode=1414751737,
> rec_len=45724, name_len=206
>
> Thoughts?

Hold that thought. This is another filesystem. Let me fix that one
then come back to this problem if it still exists.
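
For anyone reading this thread later: the first kernel message above comes
from a mount-time sanity check that each group's bitmap block numbers fall
inside a legal range. Below is a minimal Python sketch of that kind of range
check, not the kernel's actual code. The geometry is assumed, since the
thread never states it: 4 KiB blocks, 32768 blocks per group, first data
block 0, and a filesystem of exactly 15 TiB. The function names are
hypothetical.

```python
# Sketch of a "bitmap block must lie in the allowed range" check.
# All geometry values below are ASSUMPTIONS for illustration only.

BLOCK_SIZE = 4096                          # assumed block size in bytes
BLOCKS_PER_GROUP = 32768                   # assumed blocks per group
FIRST_DATA_BLOCK = 0                       # assumed first data block
BLOCKS_COUNT = 15 * 2**40 // BLOCK_SIZE    # assumed: exactly 15 TiB


def group_block_range(group, blocks_per_group=BLOCKS_PER_GROUP,
                      first_data_block=FIRST_DATA_BLOCK,
                      blocks_count=BLOCKS_COUNT, flex_bg=False):
    """Return (first, last) block where a group's bitmaps may live.

    Without flex_bg the bitmaps must sit inside the group itself.
    With flex_bg they may sit anywhere inside the filesystem.
    """
    if flex_bg:
        return first_data_block, blocks_count - 1
    # Ceiling division gives the number of groups.
    groups = -(-(blocks_count - first_data_block) // blocks_per_group)
    first = first_data_block + group * blocks_per_group
    if group == groups - 1:
        last = blocks_count - 1            # last group may be short
    else:
        last = first + blocks_per_group - 1
    return first, last


def bitmap_block_ok(block, group, **kwargs):
    """True if the bitmap block number is inside the allowed range."""
    first, last = group_block_range(group, **kwargs)
    return first <= block <= last


if __name__ == "__main__":
    reported_block = 4161027887            # from the log line in the thread
    for flex in (False, True):
        rng = group_block_range(3200, flex_bg=flex)
        ok = bitmap_block_ok(reported_block, 3200, flex_bg=flex)
        print(f"flex_bg={flex}: allowed {rng}, block ok: {ok}")
```

Under these assumed numbers the reported block lies past the end of the
filesystem, so it fails the check with or without flex_bg. That fits Ted's
reading that the descriptor itself holds garbage rather than a slightly
misplaced value. The real geometry can be read with dumpe2fs -h on the
device.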