On Tue, Mar 19, 2019 at 09:35:19AM +0200, Amir Goldstein wrote: > On Tue, Mar 19, 2019 at 5:13 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > That is, sync_file_range() is only safe to use for this specific > > sort of ordered data integrity algorithm when flushing the entire > > file.(*) > > > > create > > setxattr > > write metadata volatile > > delayed allocation data volatile > > .... > > sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE | > > SYNC_FILE_RANGE_WRITE | SYNC_FILE_RANGE_WAIT_AFTER); > > Extent Allocation metadata volatile > > ----> device -+ > > data volatile > > <-- complete -+ > > .... > > rename metadata volatile > > > > And so at this point, we only need a device cache flush to > > make the data persistent and a journal flush to make the rename > > persistent. And so it ends up the same case as non-AIO O_DIRECT. > > > > Funny, I once told that story and one Dave Chinner told me > "Nice story, but wrong.": > https://patchwork.kernel.org/patch/10576303/#22190719 > > You pointed to the minor detail that sync_file_range() uses > WB_SYNC_NONE. Ah, I forgot about that. That's what I get for not looking at the code. Did I mention that SFR is a complete crock of shit when it comes to data integrity operations? :/ > So yes, I agree, it is a nice story and we need to make it right, > by having an API (perhaps SYNC_FILE_RANGE_ALL). > When you pointed out my mistake, I switched the application to > use the FIEMAP_FLAG_SYNC API as a hack. Yeah, that 's a nasty hack :/ > Besides tests and documentation what could be useful is a portable > user space library that just does the right thing for every filesystem. *nod* but before that, we need the model to be defined and documented. And once we have a library, the fun part of convincing the world that it should be the glibc default behaviour can begin.... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx