On Thu, Dec 05, 2002 at 08:02:49AM +0100, Stephan Wiehr wrote:
> On Wed, Dec 04, 2002 at 10:00:52AM -0500, Theodore Ts'o wrote:
> > > Even clearing the has_journal and needs_recovery flags produced the same
> > > output using fsck as above.
> >
> > The exact same messages?  Including an error about reading the journal
> > superblock?  Are you sure about this?  That doesn't make any sense at
> > all....
>
> I was confused as well since I thought this would bring me back to ext2 in
> some way.

Ah, OK, I see what's going on.  The problem is that because the journal_inum
field is non-zero, e2fsck tries to load the journal before it tries to
reconcile the fact that the feature flags say there is no journal, even
though the superblock's journal_inum field points to one.  This is normally
not a problem, since if the user tells e2fsck to blow away the journal, the
loaded journal is discarded, and if the user tells e2fsck to fix the feature
flags, things proceed normally.  However, if the journal load fails because
the journal inode is corrupted, e2fsck doesn't do the right thing.  OK,
that's an e2fsck bug, and I can fix it easily enough just by reordering a
few lines of code.

The workaround is relatively simple: use debugfs to clear the journal_inum
field, via the command "set_super_value journal_inum 0", and then e2fsck
will stop blowing out due to the bad journal inode.  (A rough sketch of the
debugfs session is below.)

However, before you do this, it might be prudent to see how much damage was
done to the inode table.  As Andreas Dilger pointed out, apparently every
other byte, at least in the part of the inode table containing the journal
inode, is 0xFF.  That does not bode well, and was almost certainly caused by
a hardware failure of some sort.  It might be worth examining some other
inode numbers to see how extensive the damage is.  Each inode is 128 bytes
long, and IDE disk sectors are 512 bytes, so if you're really lucky, only 4
consecutive inodes will be damaged.  However, it's much more likely that at
least a filesystem block's worth (4096 bytes, or 32 inodes) were lost, and
if you're really unlucky, it may be a lot more than that.

Also worth considering before you do anything is the cause of the
corruption.  It could have been caused by the controller or the IDE disk
going temporarily insane, in which case hopefully it won't be repeated, but
if it is repeatable, doing an image backup will probably be a good idea.
Another possibility is that if you had a power failure, one of the things
which might have happened is that the memory went insane as the +5 voltage
rail dropped down to zero, but the DMA engine and disk drive were able to
keep going long enough to write garbage to the disk drive.  What happened
prior to the filesystem crash?  Did you have a power failure, or did someone
hit the Big Red Switch by accident?

(Note: normally the fact that ext3, unlike jfs and reiserfs, uses physical
block journalling helps to protect against this situation, since disk blocks
which were being actively written at the time of a power failure are
extremely likely to be in the journal as well, so when the journal is
replayed, the damage is undone.  This doesn't help, though, when the part of
the inode table containing the location of the journal is smashed, so that
the system can no longer find the journal.... hence my comment about
possibly storing the location of the journal in a redundant location as an
additional safety measure.)
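(For concreteness, something like the following -- a rough, untested sketch,
assuming the usual inode numbers: the journal is normally inode 8 and the
root directory is inode 2 -- would let you look at the damage and then clear
the journal_inum field.  Run it against the copy, not the original
partition:

    debugfs -w /dev/hdb2
    debugfs:  stat <8>          (the journal inode -- expect garbage)
    debugfs:  stat <2>          (the root directory inode, for comparison)
    debugfs:  set_super_value journal_inum 0
    debugfs:  quit

Obviously substitute whichever device you end up doing the recovery work
on.)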
> Before I do anything like writing to the fs I'd just like to check I'm doing
> things right, so here is what I did so far:
> The partition that REALLY crashed is /dev/hdb1 which is 2 GB. Moving some
> data freed /dev/hdb2 (2,5 GB) for 'backup' so I did a
> 'dd if=/dev/hdb1 of=/dev/hdb2 bs=1024 conv=sync' (BTW: Does the bs of dd has
> something to do with the blocksize of the fs - which is 4096 - don't know
> about this)
> So /dev/hdb1 is still 'virgin' concerning the error state (I hope!) and all
> experimental stuff I did on /dev/hdb2 (like e2salvage or trying to mount it
> as ext2). Still having the originally crashed partition do I need the
> Imagefile of e2image or could I skip this since diskspace has now become rare
> on that machine.

Ah, good.  I see you've already done the backup.  OK, first of all, at this
point I won't need the e2image.  I'm pretty sure I understand why e2fsck
acted the way it did, and I know what I need to do to make e2fsck more
robust in the future.

In answer to your question about the dd blocksize: no, the blocksize used by
dd doesn't have to be the same as the blocksize used by the filesystem.
Dd's blocksize determines the size of the chunks it reads and writes when
doing its I/O.  Using a smaller blocksize will slow down the copy slightly,
but in the case where there is a disk block error, you may recover more
data, since dd will retry at a smaller granularity.  Of course, it will only
keep going past errors if the dd command line has the conv option
"conv=noerror,sync".  Without the "noerror" declaration, dd will abort if a
disk I/O error is reflected up into userspace.  So if the dd command
reported any errors, you didn't get a full copy of the filesystem image, and
you may want to retry the disk copy before trying to recover the filesystem.

Once you're sure you're working on a clean copy of the filesystem, use
debugfs -w to clear the journal flags and the journal inode number, and then
try e2fsck.  That will hopefully recover the filesystem into a consistent
state, but let me warn you not to set your expectations too high.  Between
not being able to replay the journal and part of the inode table getting
smashed (so, among other things, the root directory is gone), you will
almost certainly have a lot of directories ending up in the lost+found
directory.  So you'll probably be able to recover some of your data, but
don't be too surprised if some number of files end up being lost.

Good luck!!

						- Ted
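P.S.  Just to put the steps in one place, the whole sequence would look
roughly like this -- untested, and assuming you redo the copy onto /dev/hdb2
and do all the recovery work on that copy; adjust the device names to
whatever you actually use:

    dd if=/dev/hdb1 of=/dev/hdb2 bs=512 conv=noerror,sync
    debugfs -w /dev/hdb2
    debugfs:  set_super_value journal_inum 0
    debugfs:  quit
    e2fsck -f /dev/hdb2

The small bs=512 means a bad sector costs you one sector's worth of data
rather than a whole larger chunk, and "e2fsck -f" forces a full check even
if the superblock claims the filesystem is clean.  (And if the fresh copy
brings the has_journal/needs_recovery flags back, clear them again the same
way you did before.)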