Hi Ted and all,

I have a couple of questions near the end of this message, but first I have to describe my problem in some detail.

The power failure on Thursday did something evil to my ext3 file system (box running RH9+patches, ext3, /dev/md0, raid5 driver, 400GB f/s using 3x200GB IDE drives and one hot-spare). The f/s got badly corrupted, and the symptoms are very similar to what Eddy described here:

https://www.redhat.com/archives/ext3-users/2003-July/msg00015.html

That is, nearly everything I try results in an error such as

"Invalid argument while checking ext3 journal for /dev/md0"

Ted answered here:

https://www.redhat.com/archives/ext3-users/2003-July/msg00035.html

and suggested the last-ditch approach of using mke2fs -S to reinitialize the superblock and group descriptors. After trying all sorts of "safe" methods to recover the files, I tried the -S option as follows:

------------------------------------------------------------------------------
# mke2fs -j -b 4096 -S /dev/md0
mke2fs 1.32 (09-Nov-2002)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
49790976 inodes, 99570816 blocks
4978540 blocks (5.00%) reserved for the super user
First data block=0
3039 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968

Creating journal (8192 blocks): mke2fs: File exists while trying to create journal
------------------------------------------------------------------------------

Note that the above command finished suspiciously fast. It felt as if it didn't actually write any info to the f/s.
Indeed, I next ran this command:

------------------------------------------------------------------------------
# e2fsck -b 98304 -B 4096 /dev/md0
e2fsck 1.32 (09-Nov-2002)
e2fsck: Invalid argument while checking ext3 journal for /dev/md0
------------------------------------------------------------------------------

And once again got this error wrt the journal.

Note that before I even tried this -S procedure, I tried to simply turn off the has_journal bit using tune2fs: didn't help. (I'm willing to lose the info in the journal, as long as I can get the rest of my large f/s.) But tune2fs and friends gave me a chicken-and-egg error about the invalid arg wrt the journal, while trying to turn it off (duhh).

At this point I've begun to suspect that there's something awfully wrong with the journal inode, and maybe, just maybe, my superblocks and group descriptors might be intact.

Next, I tried to reinitialize the superblocks and group descriptors WITHOUT a journal (tell mke2fs to make a plain ext2 f/s):

------------------------------------------------------------------------------
# mke2fs -b 4096 -S /dev/md0
mke2fs 1.32 (09-Nov-2002)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
49790976 inodes, 99570816 blocks
4978540 blocks (5.00%) reserved for the super user
First data block=0
3039 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968

Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 34 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
------------------------------------------------------------------------------

Bingo.
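A quick sanity check on the geometry that both mke2fs runs printed: the sketch below is plain shell arithmetic and never touches the device. It assumes the sparse_super layout (backup superblocks in group 1 and in groups that are powers of 3, 5 and 7), which matches the list mke2fs printed; the variable and function names are made up for illustration.

```shell
#!/bin/sh
# Values copied from the mke2fs output quoted in this message.
BLOCKS=99570816
BLOCKS_PER_GROUP=32768
INODES_PER_GROUP=16384

# Number of block groups: blocks divided by blocks-per-group, rounded up.
NGROUPS=$(( (BLOCKS + BLOCKS_PER_GROUP - 1) / BLOCKS_PER_GROUP ))
# Total inode count follows from the group count.
INODES=$(( NGROUPS * INODES_PER_GROUP ))

# Assumption: sparse_super is in effect, so backups live in group 1 and in
# groups whose number is a power of 3, 5 or 7. With "First data block=0",
# group g starts at block g * BLOCKS_PER_GROUP.
backup_superblocks() {
    {
        echo "$BLOCKS_PER_GROUP"
        for base in 3 5 7; do
            g=$base
            while [ "$g" -lt "$NGROUPS" ]; do
                echo $(( g * BLOCKS_PER_GROUP ))
                g=$(( g * base ))
            done
        done
    } | sort -n
}

echo "block groups: $NGROUPS, inodes: $INODES"
backup_superblocks
```

If the numbers agree with the mke2fs output, then any block in that list should be a plausible argument for "e2fsck -b", at least as far as the geometry goes.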
This time I got no error, and the command took a couple of seconds longer, indicating to me that it actually did write something to the disk (or maybe it wrote more than when I tried "-S -j").

Now I was able to start "e2fsck -b 71663616 -B 4096 /dev/md0". It's been running for a couple of hours already. Of course, it's discovering all sorts of wonderful new events and spewing messages I've never even seen before. 1/2 a :-)

Anyway, my hypothesis now is that the f/s in question may have just had a really, really bad journal inode on it that was preventing anything else from happening, and that perhaps I wouldn't have needed to try "mke2fs -S" above had I been able to just nuke the pesky journal (that might have prevented further corruption that fsck is now "fixing").

The good news is that prior to experimentation, I made a dd backup of /dev/md0 (400GB) onto a file on another file server (1.5T), so I can dd it back onto my real /dev/md0 if need be. Alternatively, I can make a second copy of that backup file, use losetup on the second copy, and then experiment.

Questions:

1. Is there any reason why I couldn't experiment with e2fsprogs binaries on a f/s dd image attached to /dev/loopN? I.e., will it behave the same as a disk device as far as e2fsprogs are concerned?

2. If my assertion is correct that most of my f/s is intact but the journal is FUBAR, I need to find a way to force fsck to ignore the journal no matter what. Is there such a tool or option to some tool? Is there a way I could simply scan the disk and truncate the journal file, or turn off the has_journal bit w/o touching the rest of the f/s?

Any suggestions are welcome.

Thanks,
Erez.

_______________________________________________
Ext3-users@redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
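For reference, the loop-device experiment from question 1 would look roughly like the sketch below. The image paths and /dev/loop0 are placeholders, not the real setup, and the run/plan helpers are made up for illustration. By default it only prints the commands; setting DRY_RUN=0 would actually run them (as root).

```shell
#!/bin/sh
# Placeholders (hypothetical, not taken from the real setup):
IMG=${IMG:-/backup/md0.img}                  # the dd backup of /dev/md0
SCRATCH=${SCRATCH:-/backup/md0-scratch.img}  # second copy to experiment on
LOOP=${LOOP:-/dev/loop0}                     # a free loop device

# Print each command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

plan() {
    run cp "$IMG" "$SCRATCH"        # never experiment on the only backup
    run losetup "$LOOP" "$SCRATCH"  # attach the scratch copy to the loop device
    run e2fsck -n "$LOOP"           # read-only check: answers "no" to everything
    run losetup -d "$LOOP"          # detach again when done
}

plan
```

Starting with "e2fsck -n" keeps the first pass read-only; anything destructive (mke2fs -S, a real fsck) would then only ever hit the scratch copy.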