On Thu, Aug 24, 2017 at 05:31:26AM -0700, Christoph Hellwig wrote:
> On Thu, Aug 17, 2017 at 06:08:15PM +0200, Jan Kara wrote:
> > We return IOMAP_F_NEEDDSYNC flag from ext4_iomap_begin() for a
> > synchronous write fault when inode has some uncommitted metadata
> > changes. In the fault handler ext4_dax_fault() we then detect this case,
> > call vfs_fsync_range() to make sure all metadata is committed, and call
> > dax_pfn_mkwrite() to mark PTE as writeable. Note that this will also
> > dirty corresponding radix tree entry which is what we want - fsync(2)
> > will still provide data integrity guarantees for applications not using
> > userspace flushing. And applications using userspace flushing can avoid
> > calling fsync(2) and thus avoid the performance overhead.
>
> Why is this only wired up for the huge_fault handler and not the
> regular?

Ah, turns out ext4 implements ->fault in terms of ->huge_fault.

We'll really need to sort out this mess of fault handlers before
doing too much surgery here.
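
For anyone following along, the ordering Jan describes in the quoted changelog can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: the struct, the model_* functions and the flag value are hypothetical stand-ins, not the kernel's ext4_iomap_begin(), vfs_fsync_range() or dax_pfn_mkwrite(), whose real signatures differ. Only the synchronous write fault path is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical value; the real IOMAP_F_NEEDDSYNC is defined by the patch. */
#define MODEL_F_NEEDDSYNC 0x1u

/* Hypothetical stand-in for the per-inode state the changelog talks about. */
struct model_inode {
	bool metadata_uncommitted;	/* inode has uncommitted metadata changes */
	bool pte_writeable;		/* PTE has been marked writeable */
	bool radix_entry_dirty;		/* radix tree entry dirtied, so fsync(2) still works */
	int fsync_calls;		/* how often the metadata commit was forced */
};

/* Models ext4_iomap_begin(): flag a sync write fault on an inode with
 * uncommitted metadata. */
static unsigned int model_iomap_begin(const struct model_inode *inode,
				      bool sync_write_fault)
{
	if (sync_write_fault && inode->metadata_uncommitted)
		return MODEL_F_NEEDDSYNC;
	return 0;
}

/* Models vfs_fsync_range(): commit the outstanding metadata. */
static void model_fsync_range(struct model_inode *inode)
{
	inode->metadata_uncommitted = false;
	inode->fsync_calls++;
}

/* Models dax_pfn_mkwrite(): make the PTE writeable and dirty the radix
 * tree entry. Must only run once the metadata is committed. */
static void model_pfn_mkwrite(struct model_inode *inode)
{
	assert(!inode->metadata_uncommitted);
	inode->pte_writeable = true;
	inode->radix_entry_dirty = true;
}

/* Models the fault handler: detect the flag, commit, then mark writeable. */
static void model_dax_fault(struct model_inode *inode, bool sync_write_fault)
{
	unsigned int flags = model_iomap_begin(inode, sync_write_fault);

	if (flags & MODEL_F_NEEDDSYNC) {
		model_fsync_range(inode);
		model_pfn_mkwrite(inode);
	}
}
```

Calling model_dax_fault() with sync_write_fault set on an inode whose metadata_uncommitted is true forces exactly one commit before the PTE becomes writeable; on a clean inode no commit is forced. The assert in model_pfn_mkwrite() encodes the ordering the changelog relies on.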