On Wed, Aug 25, 2021 at 05:49:45PM -0700, Neeraj Singh wrote:
> Unfortunately my perusal of the man pages and documentation I could find doesn't
> give me this level of confidence on typical Linux filesystems. For
> instance, the notion of having to fsync the parent directory in order to
> render an inode's link findable eliminates a lot of the advantage of this
> change, though we could batch those and would have to do at most 256.
>
> This thread is somewhat instructive, but inconclusive:
> https://lwn.net/ml/linux-fsdevel/1552418820-18102-1-git-send-email-jaya@xxxxxxxxxxxxx/.

fsync/fdatasync only guarantee consistency for the file handle they are
called on. The first linked document mentions an implementation artifact:
file systems with metadata logging tend to force their log out up to the
last modified transaction, and thus also force out metadata changes made
earlier. This won't help with actual data writes at all, as for them the
act of writing back data will often generate new metadata changes, and in
general it is not a property to rely on if you care about data integrity.
It is a nice way to optimize the order of the fsync calls for metadata-only
workloads, as the later fsync calls on earlier-modified file handles will
then often be no-ops.

> One conclusion from reviewing that thread is that as of then,
> sync_file_ranges isn't actually enough to make a hard guarantee about
> writeout occurring. See
> https://lore.kernel.org/linux-fsdevel/20190319204330.GY26298@dastard/.
> My hope is that the Linux FS developers have rectified that shortcoming by now.

I'm not sure what shortcoming you mean. sync_file_range is a system call
that only causes data writeback. It never performs metadata writeback and
thus is not an integrity operation at all. That is also very clearly
documented in the man page.
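
To illustrate the point that fsync only covers the file handle it is called
on, here is a minimal sketch, assuming a POSIX system and Python's standard
os module. The helper name durable_write is hypothetical and is not taken
from git or from the patch under discussion. It fsyncs the file for its data
and metadata, then separately fsyncs the parent directory so the new
directory entry is also written out. It deliberately does not use
sync_file_range, since that call only starts data writeback.

```python
import os


def durable_write(path, data):
    """Write data to path, then fsync the file and its parent directory.

    Hypothetical helper for illustration only (POSIX systems).
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        # Integrity operation for this file handle only.
        os.fsync(fd)
    finally:
        os.close(fd)

    # The new name lives in the parent directory, which is a separate
    # object and needs its own fsync.
    dirfd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```

When many files are created in the same directory, the directory fsync could
be issued once after all the per-file fsync calls instead of once per file,
which is the kind of batching mentioned in the quoted text.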
> I think my updated version of the documentation for "= false" is
> accurate and more helpful from a user perspective ("up to OS policy when
> your data becomes durable in the event of an unclean shutdown").
> "= true" also has a reasonable description, though I might add some
> verbiage indicating that this setting could be costly.

Your version is much better. In fact it is almost still too nice, as in
general the data will not be durable and you do end up with a corrupted
repository in that case. Note that even on bad old ext3 that was usually
the case.