On Wed, Sep 23, 2020 at 07:16:58AM +0200, Christoph Hellwig wrote:
> On Wed, Sep 23, 2020 at 07:49:34AM +1000, Dave Chinner wrote:
> > I did point out in the previous thread that this actually means that
> > inode_dio_wait() now has inconsistent wait semantics for O_DSYNC
> > writes. If it's a pure overwrite and we hit the FUA path, the
> > O_DSYNC write will be complete and guaranteed to be on stable storage
> > before the IO completes. If the inode is metadata dirty, then the IO
> > will now be signalled complete *before* the data and metadata are
> > flushed to stable storage.
> >
> > Hence, from the perspective of writes to *stable* storage, this
> > makes the ordering of O_DSYNC DIO against anything waiting for it to
> > complete to be potentially inconsistent at the stable storage level.
> >
> > That's an extremely subtle change of behaviour, and something that
> > would be largely impossible to test or reproduce. And, really, I
> > don't like having this sort of "oh, it should be fine" handwavy
> > justification when we are talking about data integrity operations...
>
> ... and I replied with a detailed analysis of what it is fine, and
> how this just restores the behavior we historically had before
> switching to the iomap direct I/O code. Although if we want to go
> into the fine details we did not have the REQ_FUA path back then,
> but that does not change the analysis.

You did? Got a link? Not sure if vger/oraclemail are still delaying
messages for me.... :/

--D