On Wed, Sep 23, 2020 at 07:49:34AM +1000, Dave Chinner wrote:
> I did point out in the previous thread that this actually means that
> inode_dio_wait() now has inconsistent wait semantics for O_DSYNC
> writes. If it's a pure overwrite and we hit the FUA path, the
> O_DSYNC write will be complete and guaranteed to be on stable storage
> before the IO completes. If the inode is metadata dirty, then the IO
> will now be signalled complete *before* the data and metadata are
> flushed to stable storage.
>
> Hence, from the perspective of writes to *stable* storage, this
> makes the ordering of O_DSYNC DIO against anything waiting for it to
> complete to be potentially inconsistent at the stable storage level.
>
> That's an extremely subtle change of behaviour, and something that
> would be largely impossible to test or reproduce. And, really, I
> don't like having this sort of "oh, it should be fine" handwavy
> justification when we are talking about data integrity operations...

... and I replied with a detailed analysis of why it is fine, and how
this just restores the behavior we historically had before switching to
the iomap direct I/O code. If we want to go into the fine details, we
did not have the REQ_FUA path back then, but that does not change the
analysis.