Re: [LSF/MM/BPF TOPIC] vfs write barriers

Dave Chinner <david@xxxxxxxxxxxxx> · Mon, 20 Jan 2025 08:15:41 +1100

On Fri, Jan 17, 2025 at 07:01:50PM +0100, Amir Goldstein wrote:
> Hi all,
> 
> I would like to present the idea of vfs write barriers that was proposed by Jan
> and prototyped for the use of fanotify HSM change tracking events [1].
> 
> The historical records state that I had mentioned the idea briefly at the end of
> my talk in LSFMM 2023 [2], but we did not really have a lot of time to discuss
> its wider implications at the time.
> 
> The vfs write barriers are implemented by taking a per-sb srcu read side
> lock for the scope of {mnt,file}_{want,drop}_write().
> 
> This could be used by users - in the case of the prototype - an HSM service -
> to wait for all in-flight write syscalls, without blocking new write syscalls
> as the stricter fsfreeze() does.
> 
> This ability to wait for in-flight write syscalls is used by the prototype to
> implement a crash consistent change tracking method [3] without the
> need to use the heavy fsfreeze() hammer.

How does this provide any guarantee at all? It doesn't order or
wait for physical IOs in any way, so writeback can be active on a
file and writing data from both sides of a syscall write "barrier".
i.e. there is no coherency between what is on disk, the c/mtime of
the inode and the write barrier itself.

Freeze is an actual physical write barrier. A very heavy-handed
physical write barrier, yes, but it has very well defined and
bounded physical data persistence semantics.

This proposed write barrier does not seem capable of providing any
sort of physical data or metadata/data write ordering guarantees, so
I'm a bit lost as to how it can be used to provide reliable "crash
consistent change tracking" when there is no relationship between
the data/metadata in memory and data/metadata on disk...
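
To make the concern concrete, here is a toy userspace model, not kernel
code. Everything in it (ToyFS, write_barrier(), writeback(), crash(),
freeze(), demo()) is a hypothetical name invented for illustration; the
in_flight counter merely stands in for the per-sb SRCU read-side section
taken around {mnt,file}_{want,drop}_write(). It assumes writeback is free
to pick dirty pages in any order, and it is single-threaded, so the
barrier never actually has anything to wait for.

```python
class ToyFS:
    """Toy model: page cache vs. disk, plus a syscall-level write barrier."""

    def __init__(self):
        self.page_cache = {}  # block -> data visible in memory
        self.disk = {}        # block -> data that would survive a crash
        self.dirty = set()    # blocks written but not yet persisted
        self.in_flight = 0    # stand-in for SRCU read-side sections

    def write(self, block, data):
        """Model a write syscall: it only dirties the page cache."""
        self.in_flight += 1   # assumed analogue of *_want_write()
        try:
            self.page_cache[block] = data
            self.dirty.add(block)
        finally:
            self.in_flight -= 1  # assumed analogue of *_drop_write()

    def write_barrier(self):
        """Wait for in-flight write syscalls; does not touch dirty data."""
        # Single-threaded model: no syscall is ever mid-flight here.
        assert self.in_flight == 0

    def writeback(self, block):
        """Persist one dirty block, in whatever order the caller picks."""
        if block in self.dirty:
            self.disk[block] = self.page_cache[block]
            self.dirty.discard(block)

    def freeze(self):
        """Model freeze: everything dirty is persisted before returning."""
        for block in sorted(self.dirty):
            self.writeback(block)

    def crash(self):
        """Lose all memory state; return what is left on disk."""
        self.page_cache = dict(self.disk)
        self.dirty.clear()
        return dict(self.disk)


def demo():
    fs = ToyFS()
    fs.write(0, "before-barrier")
    fs.write_barrier()            # returns with block 0 still only in memory
    fs.write(1, "after-barrier")
    fs.writeback(1)               # writeback happens to pick the newer block
    return fs.crash()
```

In this model demo() leaves only the post-barrier block on disk, while
replacing write_barrier() with freeze() would have persisted block 0 before
the second write started. That difference is the ordering guarantee I
can't see the proposed barrier providing.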

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx