[Bug 14354] Bad corruption with 2.6.32-rc1 and upwards

http://bugzilla.kernel.org/show_bug.cgi?id=14354





--- Comment #167 from Eric Sandeen <sandeen@xxxxxxxxxx>  2009-11-02 17:05:38 ---
My test overnight ran successfully through > 100 iterations of the test, on a
tree checked out just prior to d0646f7b636d067d715fab52a2ba9c6f0f46b0d7.

This morning I ran that same tree with the journal checksums enabled via mount
option, saw that journal corruption was found by the checksumming code, and
immediately after that we saw the corruption.  So it is having the checksum
feature enabled that is breaking this for us.

Linus, I would recommend reverting d0646f7b636d067d715fab52a2ba9c6f0f46b0d7 for
now, at this late stage in the game, and those present on the ext4 call this
morning agreed.

A few things seem to have gone wrong.  For one, we should have at least issued
a printk when we found a bad journal checksum, but we silently continued on
thanks to a RDONLY check (and the root fs is mounted read-only...)

My hand-wavy hunch about what is happening is that we're finding a bad checksum
on the last partially-written transaction, which is not surprising, but if we
have a wrapped log and we're doing the initial scan for head/tail, and we abort
scanning on that bad checksum, then we are essentially running an unrecovered
filesystem.

But that's hand-wavy and I need to go look at the code.
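
To make the hunch above concrete, here is a toy model in C.  It is only a
sketch under stated assumptions and is not the jbd2 recovery code: the names
toy_txn, toy_recover, toy_wrapped_log_demo and LOG_SLOTS are all hypothetical,
and the log is reduced to an array of slots with a sequence number and a
"checksum ok" flag.  It compares two policies for a wrapped log whose last
transaction is torn: failing the whole scan on the bad checksum, versus
treating the bad checksum as the end of the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LOG_SLOTS 8   /* hypothetical size of the toy circular log */

/* Hypothetical toy transaction record; NOT a jbd2 structure. */
struct toy_txn {
    unsigned seq;      /* commit sequence number */
    bool checksum_ok;  /* false for a torn, partially written commit */
};

/*
 * Walk the circular log from `tail` for `count` transactions and return
 * how many of them would be replayed.
 *
 * abort_on_bad == true:  a bad checksum fails the whole scan, so nothing
 *                        is replayed (the suspected behaviour).
 * abort_on_bad == false: a bad checksum is treated as the end of the log,
 *                        and everything before it is replayed.
 */
size_t toy_recover(const struct toy_txn *log, size_t tail, size_t count,
                   bool abort_on_bad)
{
    size_t good = 0;

    assert(count <= LOG_SLOTS);
    for (size_t i = 0; i < count; i++) {
        /* The modulo is what makes the log "wrapped". */
        const struct toy_txn *t = &log[(tail + i) % LOG_SLOTS];

        if (!t->checksum_ok) {
            if (abort_on_bad)
                return 0;   /* scan aborted: filesystem left unrecovered */
            break;          /* stop here, keep what was already scanned */
        }
        good++;
    }
    return good;
}

/*
 * Build a wrapped log whose live transactions sit in slots 6, 7, 0, 1,
 * with the last one (slot 1) torn, and run recovery with the given policy.
 */
size_t toy_wrapped_log_demo(bool abort_on_bad)
{
    struct toy_txn log[LOG_SLOTS];
    const size_t tail = 6, count = 4;

    for (size_t i = 0; i < LOG_SLOTS; i++) {
        log[i].seq = 0;
        log[i].checksum_ok = true;
    }
    for (size_t i = 0; i < count; i++)
        log[(tail + i) % LOG_SLOTS].seq = 100 + (unsigned)i;

    log[1].checksum_ok = false;   /* the partially written last commit */

    return toy_recover(log, tail, count, abort_on_bad);
}
```

In this model toy_wrapped_log_demo(true) replays nothing, while
toy_wrapped_log_demo(false) replays the three complete transactions ahead of
the torn one.  Whether the real recovery path behaves like the first case is
exactly what still needs checking in the code.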

We lived without journal checksums on by default until now, and at this point
they're doing more harm than good, so we should revert the default-changing
commit until we can fix it and do some good power-fail testing with the fixes
in place.

I'll revert that patch and do another overnight test on an up-to-date tree to
be sure nothing else snuck in, but this looks to me like the culprit, and I'm
comfortable recommending that the commit be reverted for now.

Thanks,
-Eric

