On Tue, Nov 27, 2012 at 01:31:18PM +0000, Adam Huffman wrote:
>
> On two machines now I've had severe filesystem corruption. They are
> both Fedora 17 machines, and they both have, at some point, run the
> kernels that have been mentioned recently as possibly suffering from
> ext4 corruption problems.

I don't know if you followed the story that closely, but the hysteria
over the "ext4 corruption problems" was caused by users who were using
non-standard mount options or other ext4 features....

> In the worst case, fsck is unable to fix the problems:
>
> fsck from util-linux 2.20.1
> e2fsck 1.42.4 (12-June-2012)
> ext2fs_check_desc: Corrupt group descriptor: bad block for block bitmap
> fsck.ext4: Group descriptors look bad... trying backup blocks...
> /dev/mapper/heppc128-lv_home: recovering journal
> fsck.ext4: unable to set superblock flags on /dev/mapper/heppc128-lv_home

Furthermore, this doesn't look like any of the problems that people
have reported. The corruption pattern looks most like what you would
see if the blocks at the beginning of the file system (the low-numbered
blocks) had been overwritten with garbage.

So first of all, if there is critical data that you want to preserve,
the first thing I'd suggest doing is to make an image copy of the
partition; it's only 56 GB, so hopefully you have space to make a copy
before you do any further experimentation to try to recover things.

As far as the "unable to set superblock flags" error, I think I can see
how that can happen (and in fact I've created a short test case which
demonstrates the problem; see attached), but that appears to be a
one-shot failure. That is, the second time you run e2fsck, it should
be able to make progress. Is that the case for you?

(It's also possible that there are hardware bugs which are triggering
this problem, however, and if in fact you're seeing this happen
repeatably, I'd seriously suspect some kind of hardware failure.)

- Ted

P.S. In order to get this failure I had to basically use a block
editor, since there are software safeguards which prevent e2fsprogs or
ext4 from setting the needs_recovery bit on backup superblocks, and
this is what was necessary to trigger the bug. I'll fix this for the
next release of e2fsprogs. The reason we hadn't noticed is that (a) it
basically requires a very specific hardware-induced bit-flip to
trigger, and (b) even when it does, the second run of e2fsck makes the
problem go away. So typically it gets noticed when the system fails to
boot due to e2fsck blowing out, and then when the system administrator
runs fsck a second time on the file system, forward progress gets made.

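For anyone who wants to check whether a given superblock copy has the needs_recovery bit set, here is a minimal Python sketch. It is not part of e2fsprogs, and the function name `needs_recovery` and its parameters are made up for illustration. It assumes the standard ext2/3/4 on-disk layout: the primary superblock at byte offset 1024, `s_magic` at offset 0x38, `s_feature_incompat` at offset 0x60, and the recovery flag as bit 0x0004. Run it against an image copy rather than the live device.

```python
import struct

SUPERBLOCK_SIZE = 1024
EXT_MAGIC = 0xEF53
MAGIC_OFFSET = 0x38             # s_magic, 16-bit little-endian
FEATURE_INCOMPAT_OFFSET = 0x60  # s_feature_incompat, 32-bit little-endian
INCOMPAT_RECOVER = 0x0004       # the "needs_recovery" feature bit


def needs_recovery(image_path, sb_offset=1024):
    """Return True if the superblock at byte offset sb_offset has needs_recovery set.

    sb_offset defaults to the primary superblock. For a backup superblock,
    pass its byte offset instead (an assumption the caller must supply).
    """
    with open(image_path, "rb") as f:
        f.seek(sb_offset)
        sb = f.read(SUPERBLOCK_SIZE)
    if len(sb) < SUPERBLOCK_SIZE:
        raise ValueError("short read: no complete superblock at this offset")
    (magic,) = struct.unpack_from("<H", sb, MAGIC_OFFSET)
    if magic != EXT_MAGIC:
        raise ValueError("no ext2/3/4 superblock magic at this offset")
    (incompat,) = struct.unpack_from("<I", sb, FEATURE_INCOMPAT_OFFSET)
    return bool(incompat & INCOMPAT_RECOVER)
```

The locations of the backup superblocks depend on the block size and the number of blocks per group, so the sketch does not guess them. `dumpe2fs` lists them as block numbers, and the byte offset should be the block number multiplied by the block size.
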
Attachment:
testcase.img.gz
Description: Binary data
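
For the image copy suggested earlier in this message, something along the following lines would do. This is only a sketch: `copy_image` is a hypothetical helper, the destination path in the usage comment is invented, and it assumes the source can be read end to end without I/O errors. If the underlying disk is failing, a tool built for that case, such as ddrescue, is a better choice. The file system should be unmounted while the copy is made.

```python
import hashlib


def copy_image(src_path, dst_path, chunk_size=4 * 1024 * 1024):
    """Copy src_path to dst_path in chunks; return (bytes_copied, sha256_hex).

    The destination is opened in exclusive mode ("xb"), so an existing
    file is never overwritten by accident.
    """
    digest = hashlib.sha256()
    total = 0
    with open(src_path, "rb") as src, open(dst_path, "xb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
            total += len(chunk)
    return total, digest.hexdigest()


# Hypothetical usage (the destination path is an assumption):
# copy_image("/dev/mapper/heppc128-lv_home", "/mnt/backup/lv_home.img")
```

The returned checksum makes it possible to confirm later that the image has not changed. Any recovery experiments can then be run on a second copy of the image instead of the original device.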