On Tue, Jan 06, 2015 at 09:53:47AM +0100, Jan Kara wrote:
> On Tue 06-01-15 08:47:55, Dave Chinner wrote:
> > > As things stand now the other devs are loathe to touch any remotely exotic
> > > fs call, but that hardly seems ideal. Hopefully a common framework for
> > > powerfail testing can improve on this. Perhaps there are other ways we
> > > make it easier to tell what is (well) tested, and conversely ensure that
> > > those tests are well-aligned with what real users are doing...
> >
> > We don't actually need power failure (or even device failure)
> > infrastructure to test data integrity on failure. Filesystems just
> > need a shutdown method that stops any IO from being issued once the
> > shutdown flag is set. XFS has this and it's used by xfstests via the
> > "godown" utility to shut the filesystem down in various
> > circumstances. We've been using this for data integrity and log
> > recovery testing in xfstests for many years.
> >
> > Hence we know that if the device behaves correctly w.r.t. cache flushes
> > and FUA then the filesystem will behave correctly on power loss. We
> > don't need a device power fail simulator to tell us that violating
> > fundamental architectural assumptions will corrupt filesystems....
>
> I think that fs ioctl cannot easily simulate the situation where
> on-device volatile caches aren't properly flushed in all the necessary
> cases (we had bugs like this in ext3/4 in the past which were hit by real
> users).

Sure, I'm not arguing that it does. I'm suggesting that it's the wrong
place to be focussing effort on initially, because it assumes the
filesystem already behaves correctly on simple device failures. i.e. if
filesystems fail to do the right thing on a block device that isn't
lossy, then we've got big problems to solve before we even consider
random "volatile cache blocks went missing" corruption and recovery
issues.

i.e. what we need to focus on first is "failure paths are exercised and
work reliably". When we have decent coverage of that for most
filesystems (and we sure as hell don't for btrfs and ext4), then we can
focus on "in this corner case of broken/lying hardware...".

> I also think that simulating the device failure in a different layer is
> simpler than checking for superblock flag in all the places where the
> filesystem submits IO (e.g. ext4 doesn't have dedicated buffer layer like
> xfs has and we rely on flusher thread to flush committed metadata to final

Flusher threads call back into the filesystems to write both data and
metadata, so I don't think that's an issue. And there are relatively few
places you'd need to add flag support to (i.e. wrappers around
submit_bh and submit_bio in the relevant layers), and that would trap
all IO.

Don't get fooled by the fact that XFS has lots of shutdown traps; there
really are only three shutdown traps that prevent IO - one in
xfs_buf_submit() for metadata IO, one in xfs_map_blocks() during
->writepage for data IO, and one in xlog_bdstrat() for log IO. All the
other shutdown traps are for aborting operations that may not reach the
IO layer (as many operations will hit cached objects) or will fail
later when the inevitable IO is done (e.g. on transaction commit).

Hence shutdown traps get us fast, reliable responses to userspace when
fatal corruption errors occur, and in doing so they also provide hooks
for testing error paths in ways that are otherwise very difficult to
exercise.
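To make the wrapper idea concrete, here's a minimal sketch of what such
a submission choke point could look like. The myfs_* names, the
MYFS_SHUTDOWN bit and the sb_info layout are purely illustrative - this
is not real ext4 or XFS code, just the shape of the pattern:

/*
 * Illustrative sketch only: struct myfs_sb_info and MYFS_SHUTDOWN are
 * made-up names, but the pattern is the one described above - a single
 * submission wrapper that refuses to issue IO once the shutdown flag
 * has been set.
 */
#include <linux/fs.h>
#include <linux/buffer_head.h>

#define MYFS_SHUTDOWN	0		/* bit number in s_state */

struct myfs_sb_info {
	unsigned long	s_state;	/* shutdown flag lives here */
};

static int myfs_submit_bh(int rw, struct buffer_head *bh,
			  struct myfs_sb_info *sbi)
{
	/* Shutdown trap: once the flag is set, no IO reaches the device. */
	if (test_bit(MYFS_SHUTDOWN, &sbi->s_state))
		return -EIO;

	return submit_bh(rw, bh);
}

The shutdown ioctl (XFS_IOC_GOINGDOWN is what godown drives on XFS)
then only has to set that bit - optionally flushing or aborting the log
first - and every subsequent submission bounces with EIO instead of
hitting the device.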
This is my point - shutdown traps are far more useful for *verifying
correct filesystem behaviour in error situations* than something that
just returns errors or corrupts blocks at the IO layer. If we really
want to test behaviour with corrupt random disk blocks, fsfuzzer
already exists ;)

> location on disk so that writeback path completely avoids ext4 code - it's
> a generic writeback of the block device mapping). So I like the solution
> with the dm target more than a fs ioctl although I agree that it's more
> clumsy from the xfstests perspective.

Wrong perspective. I'm looking at this from a filesystem layer
validation perspective, not an xfstests perspective. The fs ioctl is
far more useful for exercising and validating filesystem behaviour in
error conditions than a dm device that targets a rare device failure
issue.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx