https://bugzilla.kernel.org/show_bug.cgi?id=70121

--- Comment #5 from Theodore Tso <tytso@xxxxxxx> ---

On Thu, Feb 06, 2014 at 10:38:04AM +0000, bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
>
> Here comes the idea: From a logical view, achieving this safety does
> not require writing the file 2 times.  A simple commit flag should
> achieve the same level of safety.  Here is an example: The filesystem
> could store a value for each file reflecting its state.  It is
> initialized as an empty value, indicating the file has not
> successfully been written.  As soon as the file has been written, it
> is set to 1.  This would avoid writing the file 2 times and still
> guarantee that the file will never be visible to the user in a
> damaged state after a crash, as the filesystem check would see that
> the file state is unequal to 1 and correct the problem.

How does the file system know that the file has "successfully been
written"?  Secondly, even if we did know, in order to guarantee the
transaction semantics, we *always* update the journal first.  Only
after the journal is updated do we write back to the final location on
disk (the first sketch at the end of this mail illustrates that
ordering).  So what you are suggesting simply wouldn't work.

> ~2 years ago I disabled full data journaling for a short time, but at
> that point an application crashed while it was writing a lot of
> files.  The result was that many files got damaged, which encouraged
> me to never disable full data journaling again.  Now I'm seeing only
> 2 possible states of a file: Either it is only registered in the
> filesystem with a size of 0 bytes, or it is completely written.  I
> was never able to reproduce a half-written file with full data
> journaling enabled.

You have a buggy application which isn't using fsync() where it should
(the second sketch at the end of this mail shows the standard pattern).
If you can't fix the application, one thing you can do is to use the
nodelalloc mount option (for example, by adding "nodelalloc" to the
file system's options field in /etc/fstab).  Although disabling delayed
allocation will involve a performance hit, it's much less of a
performance hit compared to data journalling, and it will avoid the
double write problem.

One of the reasons why I'm not particularly fond of this solution is
that it still doesn't guarantee data integrity after a crash; it just
makes data loss less likely, but if you crash at the wrong moment, you
can still lose data.  (This is true with data journalling too, BTW; if
you haven't seen it, you've just gotten lucky.)  And beyond the generic
performance penalty, nodelalloc imposes a specific performance penalty
on applications which actually do the correct thing and use fsync().

One of the unfortunate features of ext3, which also didn't have delayed
allocation (ext4 with nodelalloc basically reverts this aspect of file
system behaviour to ext3 levels), is that it encouraged applications
not to use fsync().  That is a "works most of the time, until it
doesn't" approach, and it is probably _why_ you have the buggy
application or applications in the first place.  But in the long run,
it's better to fix the buggy applications than to rely on nodelalloc.

Cheers,

						- Ted
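P.S.  To make the "journal first" ordering concrete, here is a toy
write-ahead-logging commit in C.  This is emphatically not the actual
ext4/jbd2 code, just a minimal sketch of the invariant it maintains:
the journal must be durable before the final location is touched.
Error handling is omitted for brevity.

    /* Toy write-ahead logging, NOT the ext4/jbd2 implementation. */
    #include <unistd.h>

    void wal_commit(int journal_fd, int data_fd, off_t off,
                    const void *buf, size_t len)
    {
        write(journal_fd, buf, len);     /* 1. append blocks to journal */
        fsync(journal_fd);               /* 2. commit record is durable */
        pwrite(data_fd, buf, len, off);  /* 3. only now write in place  */
        fsync(data_fd);                  /* 4. checkpoint done; journal */
                                         /*    space may be reclaimed   */
    }

After a crash, replay is safe either way: if the commit record made it
into the journal, step 3 can simply be redone; if it didn't, the final
location was never touched.  A per-file "state flag" written outside
this ordering would give you no such guarantee.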
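P.P.S.  And here is the pattern an application should use to update a
file crash-safely: write a temporary file, fsync() it, rename() it
over the old file, then fsync() the containing directory.  The
function and file names below are made up for illustration, and error
handling is abbreviated:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int save_file(const char *dir, const char *name,
                  const void *buf, size_t len)
    {
        char tmp[4096], final[4096];
        int fd, dfd;

        snprintf(tmp, sizeof(tmp), "%s/.%s.tmp", dir, name);
        snprintf(final, sizeof(final), "%s/%s", dir, name);

        fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return -1;
        if (write(fd, buf, len) != (ssize_t) len || fsync(fd) != 0) {
            close(fd);                  /* data must be durable before */
            unlink(tmp);                /* we dare publish the rename  */
            return -1;
        }
        close(fd);

        if (rename(tmp, final) != 0) {  /* atomic: old file or new file */
            unlink(tmp);
            return -1;
        }

        dfd = open(dir, O_RDONLY | O_DIRECTORY);
        if (dfd >= 0) {                 /* make the rename itself durable */
            fsync(dfd);
            close(dfd);
        }
        return 0;
    }

Since rename() is atomic, a reader (or a crash) sees either the old
contents or the new contents, never a half-written file; that is the
guarantee you were getting from data journalling, provided by the
application instead.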