Hi all,

Here's a first real pass at XFS log recovery torn write detection. This
series has been tested via xfstests and via repetitive fsstress/shutdown
sequences followed by simulated CRC errors on log recovery. The latter
testing has proven useful in shaking out a few bugs, but I have still
reproduced fs inconsistency after a couple hundred iterations or so.

That said, I suspect the problems at this point are either actual logging
problems (e.g., all of the EFI/EFD logging patches and whatnot originated
from this kind of testing) or due to the nature of the error simulation.
In short, the simulation models log corruption more so than torn writes
because it injects errors at recovery time. The log buffers are written
successfully at shutdown time, and therefore I believe it's still possible
for the filesystem to have modifications that depend on committed
transactions (which are ultimately skipped if a CRC error is simulated).

I've marked the error injection patch (patch 8) RFC for the time being
because I'd like to try to come up with something a bit more
deterministic, if possible (so long as it can be done reasonably simply).
For example, perhaps we can replace it with a similar debug mode that
intentionally corrupts a CRC at write time and shuts down the fs on write
completion, such that the AIL is not updated and there is less risk of
inconsistency due to writing back metadata items in the "corrupted" log
buffer(s). Anyway, the current patch is included so the current test
procedure is documented, reviewable and repeatable.

Patch 1 is a bug fix for a problem exposed by this mechanism. Patches 2-6
are primarily refactoring and introduce the CRC-check-only log recovery
pass. Patch 7 enables log head/tail torn write detection. Patch 8
implements the DEBUG mode error injection mechanism described above.

Thoughts, reviews, flames appreciated.

Brian

v1:
- Added bug fix for mkfs log record header inconsistency.
- Refactored log recovery code to support a CRC-check-only recovery pass.
- CRC verify the last 8 records behind the head to account for concurrent
  log writes.
- Verify the tail of the log as well when the head is torn.
- Added (rfc) crc error injection patch for testing purposes.

rfc: http://oss.sgi.com/pipermail/xfs/2015-July/042415.html

Brian Foster (8):
  xfs: detect and handle invalid iclog size set by mkfs
  xfs: refactor log record unpack and data processing
  xfs: refactor and open code log record crc check
  xfs: return start block of first bad log record during recovery
  xfs: support a crc verification only log record pass
  xfs: refactor log record start detection into a new helper
  xfs: detect and trim torn writes during log recovery
  xfs: debug mode log recovery crc error injection

 fs/xfs/libxfs/xfs_log_recover.h |   1 +
 fs/xfs/xfs_globals.c            |   1 +
 fs/xfs/xfs_log_recover.c        | 646 +++++++++++++++++++++++++++++++++-------
 fs/xfs/xfs_sysctl.h             |   1 +
 fs/xfs/xfs_sysfs.c              |  31 ++
 5 files changed, 574 insertions(+), 106 deletions(-)

-- 
2.1.0
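
For readers who want a concrete picture of the head verification mentioned in the changelog (CRC verify the last 8 records behind the head, then trim at the first bad one), here is a rough userspace sketch. It is not code from this series: struct toy_rec, toy_crc32c(), find_first_bad() and trim_torn_head() are hypothetical names, the log is modelled as a flat array rather than a circular on-disk log, and the bitwise CRC32c only stands in for whatever checksum routine the kernel actually calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* From the changelog: verify the last 8 records behind the head. */
#define TORN_CHECK_RECORDS 8

/* Hypothetical, simplified record; NOT the on-disk XFS log record header. */
struct toy_rec {
	uint32_t start_blk;	/* first block of the record */
	uint32_t crc;		/* checksum stored when the record was written */
	const uint8_t *data;	/* record payload */
	size_t len;		/* payload length in bytes */
};

/* Plain bitwise CRC32c (Castagnoli), used here only as a stand-in. */
static uint32_t toy_crc32c(const uint8_t *p, size_t len)
{
	uint32_t crc = 0xffffffffu;

	for (size_t i = 0; i < len; i++) {
		crc ^= p[i];
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ ((crc & 1u) ? 0x82F63B78u : 0u);
	}
	return ~crc;
}

/*
 * CRC-check-only pass: walk up to TORN_CHECK_RECORDS records ending at
 * index 'head' (inclusive), oldest first, without replaying anything.
 * Returns the index of the first record whose CRC does not match, or -1
 * if every record in the window verifies.
 */
static int find_first_bad(const struct toy_rec *recs, int nrecs, int head)
{
	int first = head - (TORN_CHECK_RECORDS - 1);

	assert(recs != NULL && head >= 0 && head < nrecs);
	if (first < 0)
		first = 0;

	for (int i = first; i <= head; i++) {
		if (toy_crc32c(recs[i].data, recs[i].len) != recs[i].crc)
			return i;
	}
	return -1;
}

/*
 * Trim the head back to the last record before the first bad one.
 * Returns the new head index; -1 means no record in the window survived.
 */
static int trim_torn_head(const struct toy_rec *recs, int nrecs, int head)
{
	int bad = find_first_bad(recs, nrecs, head);

	return bad < 0 ? head : bad - 1;
}
```

In this toy model, a caller would run trim_torn_head() before replay and treat recs[bad].start_blk as the block where the usable log ends. The tail verification that the series performs when the head turns out to be torn is not modelled here.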