On Wed, Feb 14, 2007 at 11:54:54AM -0800, Valerie Henson wrote:
> Just some quick notes on possible ways to fix the ext2 fsync bug that
> eXplode found.  Whether or not anyone will bother to implement it is
> another matter.
>
> Background: The eXplode file system checker found a bug in ext2 fsync
> behavior.  Do the following: truncate file A, create file B which
> reallocates one of A's old indirect blocks, fsync file B.  If you then
> crash before file A's metadata is all written out, fsck will complete
> the truncate for file A... thereby deleting file B's data.  So fsync
> file B doesn't guarantee data is on disk after a crash.  Details:
>
> http://www.stanford.edu/~engler/explode-osdi06.pdf
>
> Two possible solutions I can think of:
>
> * Rearrange order of duplicate block checking and fixing file size in
>   fsck.  Not sure how hard this is. (Ted?)
>
> * Keep a set of "still allocated on disk" block bitmaps that gets
>   flushed whenever a sync happens.  Don't allocate these blocks.
>   Journaling file systems already have to do this.

You don't need anything on disk, or any change to fsck, to fix this
problem - just avoid it completely by keeping a list of recently
truncated blocks in memory and not reusing them until the old owner
inode is sync'd to disk.

XFS solves this problem in exactly this manner - it keeps a list of
recently freed blocks whose freeing transactions have not yet been
committed to disk, to prevent them from being reused before it is safe
to do so.  See xfs_alloc_search_busy() and callers - if we try to
reallocate a "busy" extent, we force the log to get the free
transaction on disk before allowing the block to be reused...

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
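
For illustration only, the in-memory "busy block" idea described above could be sketched in userspace C roughly as follows. This is not the XFS code and not a proposed ext2 patch: every name here (busy_list, busy_insert, busy_search, busy_commit, busy_force_sync, busy_prepare_reuse, MAX_BUSY) is hypothetical, the fixed-size array and the per-free sequence numbers are simplifying assumptions, and busy_force_sync() only pretends that the pending frees reached disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BUSY 64   /* hypothetical capacity; a real list would grow */

/* One freed block whose freeing change has not yet reached disk. */
struct busy_block {
    uint64_t blkno;   /* block number that was freed */
    uint64_t seq;     /* sequence number of the change that freed it */
};

struct busy_list {
    struct busy_block ent[MAX_BUSY];
    size_t   count;
    uint64_t next_seq;       /* sequence number for the next free */
    uint64_t committed_seq;  /* changes with seq <= this are on disk */
    unsigned forced_syncs;   /* times a reuse attempt forced a flush */
};

static void busy_init(struct busy_list *bl)
{
    bl->count = 0;
    bl->next_seq = 1;
    bl->committed_seq = 0;
    bl->forced_syncs = 0;
}

/* Record that changes up to and including seq are on disk and drop
 * the entries they cover (swap-remove, order does not matter). */
static void busy_commit(struct busy_list *bl, uint64_t seq)
{
    size_t i = 0;

    if (seq > bl->committed_seq)
        bl->committed_seq = seq;
    while (i < bl->count) {
        if (bl->ent[i].seq <= bl->committed_seq)
            bl->ent[i] = bl->ent[--bl->count];
        else
            i++;
    }
}

/* Stand-in for writing out the old owner's metadata (or forcing the
 * log on a journaling file system).  Here it only updates state. */
static void busy_force_sync(struct busy_list *bl)
{
    bl->forced_syncs++;
    busy_commit(bl, bl->next_seq - 1);
}

/* Called when a block is freed, e.g. by truncate.  Returns the
 * sequence number of the freeing change. */
static uint64_t busy_insert(struct busy_list *bl, uint64_t blkno)
{
    if (bl->count == MAX_BUSY)
        busy_force_sync(bl);     /* make room by flushing everything */
    assert(bl->count < MAX_BUSY);
    bl->ent[bl->count].blkno = blkno;
    bl->ent[bl->count].seq = bl->next_seq;
    bl->count++;
    return bl->next_seq++;
}

/* Is this block still waiting for its free to reach disk? */
static bool busy_search(const struct busy_list *bl, uint64_t blkno)
{
    for (size_t i = 0; i < bl->count; i++)
        if (bl->ent[i].blkno == blkno)
            return true;
    return false;
}

/* Called by the allocator before handing blkno to a new owner. */
static void busy_prepare_reuse(struct busy_list *bl, uint64_t blkno)
{
    if (busy_search(bl, blkno))
        busy_force_sync(bl);
}
```

In this sketch the truncate path would call busy_insert() for each block it frees, the allocator would call busy_prepare_reuse() before giving a block to another file, and whatever code writes the old inode out would call busy_commit() with the last sequence number it covered. An allocator could equally skip a busy block and pick another one instead of forcing a sync; which is cheaper is a policy choice this sketch does not try to settle.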