  Hi,

On Thu 08-01-15 09:25:37, Dave Chinner wrote:
> This patch set is an attempt to address issues with XFS
> truncate and hole-punch code from racing with page faults that enter
> the IO path. This is traditionally deadlock prone due to the
> inversion of filesystem IO path locks and the mmap_sem.
>
> To avoid this issue, I have introduced a new "i_mmaplock" rwsem into
> the XFS code similar to the IO lock, but this lock is only taken in
> the mmap fault paths on entry into the filesystem (i.e. ->fault and
> ->page_mkwrite).
>
> The concept is that if we invalidate the page cache over a range
> after taking both the existing i_iolock and the new i_mmaplock, we
> will have prevented any vector for repopulation of the page cache
> over the invalidated range until one of the io and mmap locks has
> been dropped. i.e. we can guarantee that both the syscall IO path
> and page faults won't race with whatever operation the filesystem is
> performing...
>
> The introduction of a new lock is necessary to avoid deadlocks due
> to mmap_sem entanglement. It has a defined lock order during page
> faults of:
>
> mmap_sem
> -> i_mmaplock (read)
>    -> page lock
>       -> i_ilock (get blocks)
>
> This lock is then taken by any extent manipulation code in XFS in
> addition to the IO lock which has the lock ordering of
>
> i_iolock (write)
> -> i_mmaplock (write)
>    -> page lock (data writeback, page invalidation)
>       -> i_lock (data writeback)
>    -> i_lock (modification transaction)
>
> Hence we have consistent lock ordering (which has been validated so
> far by testing with lockdep enabled) for page fault IO vs
> truncate, hole punch, extent shifts, etc.
>
> This patchset passes xfstests and various benchmarks and stress
> workloads, so the real question is now:
>
>	What have I missed?
>
> Comments, thoughts, flames?

I had a look at the patches and as far as I can tell this should work fine
(at least from the VFS / MM POV).

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR