On Sun, Jul 27, 2014 at 07:37:22PM -0400, Theodore Ts'o wrote:
> On Fri, Jul 25, 2014 at 05:34:53PM -0700, Darrick J. Wong wrote:
> > There are a few mistakes in the checksum verification error reporting
> > logic.  First, when we're doing a re-check of an inode that failed
> > earlier, we must never ignore checksum errors.  Second, if we're
> > performing sanity checks after an initial checksum verification
> > failure, then we /should/ disable checksum error reporting for
> > block_iterate because that function will re-read the inode from disk.
> > This fixes the numerous "inode checksum failure" problems that cause
> > e2fsck to abort.
>
> I'm starting to wonder if we just set IGNORE_CSUM_ERRORS when we open
> the file system, and explicitly check the checksums in e2fsck.  It
> might make the logic clearer, especially when we start trying to be a
> bit more sophisticated in handling checksum errors.

We could make e2fsck verify the checksums itself, though we'd have to
provide a way for either (a) libext2fs to provide the raw buffer data to
e2fsck or (b) e2fsck to find the block number in question and (re)read
the raw buffer, since the checksums are computed against the on-disk
structures.  This could get a bit nasty for EA and extent handling,
since e2fsck doesn't touch the underlying blocks directly, and the
checksums are per-block, not per FS object.

> Otherwise, if we get a checksum error, we would have to set the flag
> and then retry the read.

Also doable.

--D

>
> - Ted
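
For illustration only, here is a rough sketch of the "set the flag and then retry the read" idea discussed above. It is a toy model, not e2fsprogs code: every name in it (toy_fs, toy_block, toy_csum, toy_read, toy_read_tolerant, TOY_FLAG_IGNORE_CSUM, TOY_ERR_CSUM) is hypothetical, and the checksum is a trivial placeholder rather than the crc32c that ext4 metadata uses. It assumes a library-style reader that refuses to return success on a checksum mismatch unless an ignore flag is set on the filesystem handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are libext2fs names. */
#define TOY_FLAG_IGNORE_CSUM 0x1u
#define TOY_ERR_CSUM (-1)
#define TOY_BLOCK_SIZE 16

struct toy_fs {
	unsigned flags;
};

struct toy_block {
	uint8_t data[TOY_BLOCK_SIZE];
	uint32_t stored_csum;
};

/* Placeholder checksum over the raw buffer (not crc32c). */
uint32_t toy_csum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < len; i++)
		sum = sum * 31u + buf[i];
	return sum;
}

/*
 * Library-style read: always copies the data out, but reports a
 * checksum mismatch unless the ignore flag is set on the handle.
 */
int toy_read(const struct toy_fs *fs, const struct toy_block *blk,
	     uint8_t *out)
{
	memcpy(out, blk->data, TOY_BLOCK_SIZE);
	if (!(fs->flags & TOY_FLAG_IGNORE_CSUM) &&
	    toy_csum(blk->data, TOY_BLOCK_SIZE) != blk->stored_csum)
		return TOY_ERR_CSUM;
	return 0;
}

/*
 * Checker-style wrapper: try a normal read first.  On a checksum
 * error, remember that it happened, set the ignore flag, retry the
 * read, and restore the caller's flags afterwards.
 */
int toy_read_tolerant(struct toy_fs *fs, const struct toy_block *blk,
		      uint8_t *out, int *csum_failed)
{
	unsigned saved_flags;
	int err;

	*csum_failed = 0;
	err = toy_read(fs, blk, out);
	if (err != TOY_ERR_CSUM)
		return err;

	*csum_failed = 1;
	saved_flags = fs->flags;
	fs->flags |= TOY_FLAG_IGNORE_CSUM;
	err = toy_read(fs, blk, out);
	fs->flags = saved_flags;
	return err;
}
```

The wrapper hands back both the data and a separate "checksum was bad" indication, so the caller can still run its sanity checks on the contents and decide later how to report or repair the failure. Setting the flag once at open time and calling the checksum routine explicitly in the checker would be the other arrangement mentioned in the thread; in this toy model that corresponds to setting TOY_FLAG_IGNORE_CSUM up front and comparing toy_csum() against stored_csum in the caller.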