On Fri, Jan 17, 2025 at 07:01:50PM +0100, Amir Goldstein wrote:
> Hi all,
>
> I would like to present the idea of vfs write barriers that was proposed by Jan
> and prototyped for the use of fanotify HSM change tracking events [1].
>
> The historical records state that I had mentioned the idea briefly at the end of
> my talk in LSFMM 2023 [2], but we did not really have a lot of time to discuss
> its wider implications at the time.
>
> The vfs write barriers are implemented by taking a per-sb srcu read side
> lock for the scope of {mnt,file}_{want,drop}_write().
>
> This could be used by users - in the case of the prototype - an HSM service -
> to wait for all in-flight write syscalls, without blocking new write syscalls
> as the stricter fsfreeze() does.
>
> This ability to wait for in-flight write syscalls is used by the prototype to
> implement a crash consistent change tracking method [3] without the
> need to use the heavy fsfreeze() hammer.

How does this provide any guarantee at all? It doesn't order or
wait for physical IOs in any way, so writeback can be active on a
file and writing data from both sides of a syscall write "barrier".
i.e. there is no coherency between what is on disk, the cmtime of
the inode and the write barrier itself.

Freeze is an actual physical write barrier. A very heavy-handed
physical write barrier, yes, but it has very well defined and
bounded physical data persistence semantics.

This proposed write barrier does not seem capable of providing any
sort of physical data or metadata/data write ordering guarantees, so
I'm a bit lost in how it can be used to provide reliable "crash
consistent change tracking" when there is no relationship between
the data/metadata in memory and data/metadata on disk...

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
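
For readers following the thread, here is a rough userspace sketch of the
semantics being debated. It is not kernel code and not the prototype's
implementation. Every name in it (toy_sb, toy_want_write(), and so on) is
hypothetical, and the two-slot counter is only a single-threaded stand-in for
SRCU, which in reality uses per-CPU counters and memory barriers. It assumes a
writer registers on entry (as in *_want_write()) and unregisters on exit (as in
*_drop_write()), and that the barrier waits only for writers that registered
before it started. The dirty_pages counter stands in for data that has been
written into memory but not yet written back to disk.

```c
#include <assert.h>

/* Toy, single-threaded model of an SRCU-style syscall write barrier. */
struct toy_sb {
    int epoch;        /* slot (0 or 1) in which new writers register */
    int inflight[2];  /* writers currently between want and drop, per slot */
    int dirty_pages;  /* data dirtied in memory, never written back here */
};

/* Models taking the read-side lock on entry to a write syscall. */
static int toy_want_write(struct toy_sb *sb)
{
    int idx = sb->epoch;

    sb->inflight[idx]++;
    return idx;
}

/* Models dropping the read-side lock when the write syscall returns. */
static void toy_drop_write(struct toy_sb *sb, int idx)
{
    assert(sb->inflight[idx] > 0);
    sb->inflight[idx]--;
}

/* Start a barrier: later writers register in the other slot. */
static int toy_barrier_start(struct toy_sb *sb)
{
    int old = sb->epoch;

    sb->epoch ^= 1;
    return old;
}

/* The barrier is complete once all pre-barrier writers have dropped. */
static int toy_barrier_done(const struct toy_sb *sb, int old)
{
    return sb->inflight[old] == 0;
}

/*
 * Walk through one interleaving. Returns the number of dirty pages still
 * unwritten when the barrier completes, or -1 if the model misbehaves.
 */
static int toy_scenario(void)
{
    struct toy_sb sb = {0};
    int a = toy_want_write(&sb);       /* writer A enters before the barrier */
    int old = toy_barrier_start(&sb);  /* the waiter starts the barrier */
    int b = toy_want_write(&sb);       /* writer B enters later, not blocked */

    sb.dirty_pages++;                  /* A dirties a page */
    sb.dirty_pages++;                  /* B dirties a page */

    if (toy_barrier_done(&sb, old))    /* A is still in flight */
        return -1;

    toy_drop_write(&sb, a);            /* A's syscall returns */

    if (!toy_barrier_done(&sb, old) || sb.inflight[b] != 1)
        return -1;

    return sb.dirty_pages;             /* nothing here waited for writeback */
}
```

In this model the barrier completes as soon as writer A returns, while writer B
is still in flight and dirty pages from both sides of the barrier remain only in
memory. That is the gap the reply above is asking about: the wait is ordered
against syscalls, not against physical IO.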