On Jan 20, 2006  23:07 -0800, Dennis Williams wrote:
> After the fsck finished this evening there were no final statements
> referring to problems.  I remounted the filesystem without any errors.
> After noticing that there were a number of files missing, I started to
> attempt to recover from the lost+found directory.  I was repeatedly able
> to get the filesystem to error and remount read-only when find
> traversed a specific directory in lost+found.  This is the error message I
> received from /var/log/messages:
>
> Jan 21 16:00:26 terrorbytes kernel: EXT3-fs error (device md0):
> ext3_readdir: bad entry in directory #73117155: directory entry across
> blocks - offset=0, inode=0, rec_len=8196, name_len=84
> Jan 21 16:00:26 terrorbytes kernel: Aborting journal on device md0.
> Jan 21 16:00:26 terrorbytes kernel: ext3_abort called.
> Jan 21 16:00:26 terrorbytes kernel: EXT3-fs abort (device md0):
> ext3_journal_start: Detected aborted journal
> Jan 21 16:00:26 terrorbytes kernel: Remounting filesystem read-only
>
> 1) Can someone explain what this means, and/or why it might happen?
> 2) Why might this condition exist even after a successful fsck?

In case it wasn't clear before (I thought it was), you are having problems
because this fs is > 2TB.  Why, I'm not sure - it may relate to LVM/MD, it
may be the block layer, or it may be an ext3 bug.  The fact that it is at
2TB makes it seem like a block layer bug or lower.

I would start by making a backup if you haven't already.  I think debugging
it would be easiest if you had a backup and were willing to overwrite the
device with a test pattern.  If you can isolate the corruption to a single
file or dir, you may get some insight into the problem by running filefrag
on it (or "stat {path}" in debugfs).

> I am planning on running a fsck yet again.

Another fsck won't prevent the problems from recurring.
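
A rough way to sanity-check the 2TB theory against the numbers in this
thread is sketched below in Python.  It is only an illustration: the 4 KiB
block size and 512-byte sector size are assumptions (neither is stated in
the thread), the block counts are copied from the e2fsck summary quoted
further down, and the helper names are made up for this example.

```python
# Assumptions, NOT stated in the thread: 4 KiB ext3 blocks, 512-byte sectors.
BLOCK_SIZE = 4096
SECTOR_SIZE = 512
TIB = 2 ** 40

# Figures copied from the e2fsck summary quoted in this thread.
USED_BLOCKS = 673983041
TOTAL_BLOCKS = 805797888


def wrap_point_bytes(sector_size: int = SECTOR_SIZE) -> int:
    """Byte offset at which a 32-bit sector number would wrap around."""
    return (2 ** 32) * sector_size


def first_block_past_wrap(block_size: int = BLOCK_SIZE) -> int:
    """First filesystem block number that lies at or beyond the wrap point."""
    return wrap_point_bytes() // block_size


def entry_crosses_block(offset: int, rec_len: int,
                        block_size: int = BLOCK_SIZE) -> bool:
    """True if a directory entry starting at `offset` runs past its block."""
    return offset + rec_len > block_size


total_bytes = TOTAL_BLOCKS * BLOCK_SIZE
boundary_block = first_block_past_wrap()
blocks_past = TOTAL_BLOCKS - boundary_block

if __name__ == "__main__":
    print(f"filesystem size : {total_bytes / TIB:.2f} TiB")
    print(f"32-bit wrap at  : {wrap_point_bytes() / TIB:.0f} TiB "
          f"(block {boundary_block})")
    print(f"blocks past wrap: {blocks_past} of {TOTAL_BLOCKS}")
    # More blocks are in use than fit below the wrap point, so at least
    # some allocated blocks must live above it.
    print(f"data above wrap : {USED_BLOCKS > boundary_block}")
    # offset=0 and rec_len=8196 come from the kernel message quoted above.
    print(f"entry crosses   : {entry_crosses_block(0, 8196)}")
```

Under those assumptions the filesystem is a little over 3 TiB, so roughly a
third of its blocks sit beyond the point where a 32-bit sector count wraps.
A rec_len of 8196 also cannot fit inside a single 4 KiB block, which is what
the kernel reports as "directory entry across blocks".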
>
> Sincerely,
> Dennis Williams
>
> On Fri, 20 Jan 2006, Dennis Williams wrote:
>
> > > > The system has now been correcting errors for the past 12 hours.  I hope
> > > > when it finishes, it will mount without complaints.
> > >
> > > Never believe fsck here.  It may check heavily corrupted filesystems for
> > > several DAYS.  For me (corrupted 120 GB ext3 partition) "fsck.ext3 -y"
> > > worked for 3 days before I interrupted it.  In manual mode, avoiding
> > > 'duplicate inode clone' and answering yes to 'delete file', it took
> > > only 30 minutes.
> > >
> >
> > Just out of morbid curiosity, what does 'duplicate inode clone' mean?  And
> > how does the fs get in that state?
> >
> > The fsck finished this morning with the following final statements:
> >
> > /dev/md0: ***** FILE SYSTEM WAS MODIFIED *****
> >
> > /dev/md0: ********** WARNING: Filesystem still has errors **********
> >
> > /dev/md0: 1472505/403685856 files (10.3% non-contiguous),
> > 673983041/805797888 blocks
> >
> > 1) Why would the fs still have errors?  Is it correct to assume that
> > running fsck again is the answer? (I hope so)
> >
> > 2) What does the last line of this message mean?
> >
> > I did notice that the fs mounted correctly after this with the following
> > errors in /var/log/messages:
> >
> > Jan 21 02:09:48 terrorbytes kernel: kjournald starting.  Commit interval 5
> > seconds
> > Jan 21 02:09:48 terrorbytes kernel: EXT3-fs warning (device md0):
> > ext3_clear_journal_err: Filesystem error recorded from previous mount: IO
> > failure
> > Jan 21 02:09:48 terrorbytes kernel: EXT3-fs warning (device md0):
> > ext3_clear_journal_err: Marking fs in need of filesystem check.
> > Jan 21 02:09:48 terrorbytes kernel: EXT3-fs warning: mounting unchecked
> > fs, running e2fsck is recommended
> > Jan 21 02:09:48 terrorbytes kernel: EXT3 FS on md0, internal journal
> > Jan 21 02:09:48 terrorbytes kernel: EXT3-fs: mounted filesystem with
> > ordered data mode.
> >
> > After unmounting the filesystem, I ran a standard fsck again:
> > terrorbytes:~ # e2fsck /dev/md0
> > e2fsck 1.34 (25-Jul-2003)
> > /dev/md0 contains a file system with errors, check forced.
> > Pass 1: Checking inodes, blocks, and sizes
> >
> > Thank you to everyone who has responded to my posts with their
> > suggestions.
> >
> > Sincerely,
> > Dennison Williams

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

_______________________________________________
Ext3-users@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/ext3-users