FYI, I still will fully NAK any series that adds additional locks and thus
atomic instructions to basically every fs call, and grows the inode by a
rw_semaphore plus an atomic64_t.  I also think the whole idea of switching
operation vectors at runtime is fatally flawed and we should never add
such code, never mind just for a fringe use case of a fringe feature.

On Wed, Feb 26, 2020 at 09:24:30PM -0800, ira.weiny@xxxxxxxxx wrote:
> From: Ira Weiny <ira.weiny@xxxxxxxxx>
>
> Changes from V4:
>     * Open code the aops lock rather than add it to the xfs_ilock()
>       subsystem (Darrick's comments were obsoleted by this change)
>     * Fix lkp build suggestions and bugs
>
> Changes from V3:
>     * Remove global locking...  :-D
>     * put back per inode locking and remove premature optimizations
>     * Fix issues with Directories having IS_DAX() set
>     * Fix kernel crash issues reported by Jeff
>     * Add some clean up patches
>     * Consolidate diflags to iflags functions
>     * Update/add documentation
>     * Reorder/rename patches quite a bit
>
> Changes from V2:
>
>     * Move i_dax_sem to be a global percpu_rw_sem rather than per inode
>         Internal discussions with Dan determined this would be easier,
>         just as performant, and slightly less overhead than having it
>         in the SB as suggested by Jan
>     * Fix locking order in comments and throughout code
>     * Change "mode" to "state" throughout commits
>     * Add CONFIG_FS_DAX wrapper to disable inode_[un]lock_state() when not
>       configured
>     * Add static branch which is activated by a device which supports
>       DAX in XFS
>     * Change "lock/unlock" to up/down read/write as appropriate
>         Previous names were over simplified
>     * Update comments/documentation
>
>     * Remove the xfs specific lock to the vfs (global) layer.
>     * Fix i_dax_sem locking order and comments
>
>     * Move 'i_mapped' count from struct inode to struct address_space and
>       rename it to mmap_count
>     * Add inode_has_mappings() call
>
>     * Fix build issues
>     * Clean up syntax spacing and minor issues
>     * Update man page text for STATX_ATTR_DAX
>     * Add reviewed-by's
>     * Rebase to 5.6
>
>     Rename patch:
>         from: fs/xfs: Add lock/unlock state to xfs
>         to: fs/xfs: Add write DAX lock to xfs layer
>     Add patch:
>         fs/xfs: Clarify lockdep dependency for xfs_isilocked()
>     Drop patch:
>         fs/xfs: Fix truncate up
>
>
> At LSF/MM'19 [1] [2] we discussed applications that overestimate memory
> consumption due to their inability to detect whether the kernel will
> instantiate page cache for a file, and cases where a global dax enable via a
> mount option is too coarse.
>
> The following patch series enables selecting the use of DAX on individual files
> and/or directories on xfs, and lays some groundwork to do so in ext4.  In this
> scheme the dax mount option can be omitted to allow the per-file property to
> take effect.
>
> The insight at LSF/MM was to separate the per-mount or per-file "physical"
> capability switch from an "effective" attribute for the file.
>
> At LSF/MM we discussed the difficulties of switching the DAX state of a file
> with active mappings / page cache.  It was thought the races could be avoided
> by limiting DAX state flips to 0-length files.
>
> However, this turns out to not be true.[3]  This is because address space
> operations (a_ops) may be in use at any time the inode is referenced and users
> have expressed a desire to be able to change the DAX state on a file with data
> in it.  For those reasons this patch set allows changing the DAX state flag on
> a file as long as it is not currently mapped.
>
> Details of when and how DAX state can be changed on a file are included in a
> documentation patch.
>
> It should be noted that the physical DAX flag inheritance is not shown in this
> patch set as it was maintained from previous work on XFS.  The physical DAX
> flag and its inheritance will need to be added to other file systems for user
> control.
>
> As submitted, this series works in testing on real hardware.
>
>
> [1] https://lwn.net/Articles/787973/
> [2] https://lwn.net/Articles/787233/
> [3] https://lkml.org/lkml/2019/10/20/96
> [4] https://patchwork.kernel.org/patch/11310511/
>
>
> To: linux-kernel@xxxxxxxxxxxxxxx
> Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
> Cc: "Darrick J. Wong" <darrick.wong@xxxxxxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: Dave Chinner <david@xxxxxxxxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxx>
> Cc: "Theodore Y. Ts'o" <tytso@xxxxxxx>
> Cc: Jan Kara <jack@xxxxxxx>
> Cc: linux-ext4@xxxxxxxxxxxxxxx
> Cc: linux-xfs@xxxxxxxxxxxxxxx
> Cc: linux-fsdevel@xxxxxxxxxxxxxxx
>
>
> Ira Weiny (12):
>   fs/xfs: Remove unnecessary initialization of i_rwsem
>   fs: Remove unneeded IS_DAX() check
>   fs/stat: Define DAX statx attribute
>   fs/xfs: Isolate the physical DAX flag from enabled
>   fs/xfs: Create function xfs_inode_enable_dax()
>   fs: Add locking for a dynamic address space operations state
>   fs: Prevent DAX state change if file is mmap'ed
>   fs/xfs: Hold off aops users while changing DAX state
>   fs/xfs: Clean up locking in dax invalidate
>   fs/xfs: Allow toggle of effective DAX flag
>   fs/xfs: Remove xfs_diflags_to_linux()
>   Documentation/dax: Update Usage section
>
>  Documentation/filesystems/dax.txt | 84 +++++++++++++++++++++++++-
>  Documentation/filesystems/vfs.rst | 16 +++++
>  fs/attr.c                         |  1 +
>  fs/inode.c                        | 16 ++++-
>  fs/iomap/buffered-io.c            |  1 +
>  fs/open.c                         |  4 ++
>  fs/stat.c                         |  5 ++
>  fs/xfs/xfs_icache.c               |  5 +-
>  fs/xfs/xfs_inode.h                |  2 +
>  fs/xfs/xfs_ioctl.c                | 98 +++++++++++++++----------------
>  fs/xfs/xfs_iops.c                 | 69 +++++++++++++++-------
>  include/linux/fs.h                | 73 ++++++++++++++++++++++-
>  include/uapi/linux/stat.h         |  1 +
>  mm/fadvise.c                      |  7 ++-
>  mm/filemap.c                      |  4 ++
>  mm/huge_memory.c                  |  1 +
>  mm/khugepaged.c                   |  2 +
>  mm/mmap.c                         | 19 +++++-
>  mm/util.c                         |  9 ++-
>  19 files changed, 328 insertions(+), 89 deletions(-)
>
> --
> 2.21.0
---end quoted text---
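
For readers following along: the quoted series defines a DAX statx attribute (STATX_ATTR_DAX) so that user space can tell whether a file is currently in the DAX state. Below is a minimal user-space sketch of such a query. It is not code from the series. The helper name file_dax_state() is hypothetical, the fallback value for STATX_ATTR_DAX is the one in current mainline uapi headers and is assumed here (it may not match what this posting used), and the sketch assumes a glibc new enough to provide the statx() wrapper.

```c
#define _GNU_SOURCE
#include <fcntl.h>      /* AT_FDCWD */
#include <sys/stat.h>   /* statx(), struct statx, STATX_BASIC_STATS */

/*
 * Assumption: value taken from mainline include/uapi/linux/stat.h.
 * Defined here only so the sketch builds against older headers.
 */
#ifndef STATX_ATTR_DAX
#define STATX_ATTR_DAX 0x00200000
#endif

/*
 * Hypothetical helper, not part of the quoted series.
 * Returns  1 if the file is reported as being in the DAX state,
 *          0 if it is not, or the filesystem does not report the attribute,
 *         -1 if statx() fails (missing file, unsupported syscall, ...).
 */
int file_dax_state(const char *path)
{
	struct statx stx;

	if (statx(AT_FDCWD, path, 0, STATX_BASIC_STATS, &stx) != 0)
		return -1;

	/* stx_attributes_mask says which attribute bits the fs supports. */
	if (!(stx.stx_attributes_mask & STATX_ATTR_DAX))
		return 0;

	return (stx.stx_attributes & STATX_ATTR_DAX) ? 1 : 0;
}
```

A caller would pass a path on the filesystem of interest, for example file_dax_state("/mnt/pmem/data"), where the path is only a placeholder. Checking stx_attributes_mask first avoids treating "attribute not reported" as a definite "not DAX" on kernels or filesystems that predate the attribute, although this sketch folds both cases into a return value of 0 for brevity.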