On Wed, Mar 25, 2020 at 01:28:23PM +0100, Christoph Hellwig wrote:
> In the case that an inode has dirty timestamp for longer than the
> lazytime expiration timeout (or if all such inodes are being flushed
> out due to a sync or syncfs system call), we need to inform the file
> system that the inode is dirty so that the inode's timestamps can be
> copied out to the on-disk data structures.  That's because if the file
> system supports lazytime, it will have ignored the dirty_inode(inode,
> I_DIRTY_TIME) notification when the timestamp was modified in memory.
> Previously, this was accomplished by calling mark_inode_dirty_sync(),
> but that has the unfortunate side effect of also putting the inode on
> the writeback list, and that's not necessary in this case, since we
> will immediately call write_inode() afterwards.  Replace the call to
> mark_inode_dirty_sync() with a new lazytime_expired method to clearly
> separate out this case.

Hmmm. Doesn't this cause issues with both iput() and
vfs_fsync_range(), because they call mark_inode_dirty_sync() on
I_DIRTY_TIME inodes to move them onto the writeback list so they are
appropriately expired when the inode is written back?
i.e.:

> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 2094386af8ac..e5aafd40dd0f 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -612,19 +612,13 @@ xfs_fs_destroy_inode(
>  }
>
>  static void
> -xfs_fs_dirty_inode(
> -	struct inode *inode,
> -	int flag)
> +xfs_fs_lazytime_expired(
> +	struct inode *inode)
>  {
>  	struct xfs_inode *ip = XFS_I(inode);
>  	struct xfs_mount *mp = ip->i_mount;
>  	struct xfs_trans *tp;
>
> -	if (!(inode->i_sb->s_flags & SB_LAZYTIME))
> -		return;
> -	if (flag != I_DIRTY_SYNC || !(inode->i_state & I_DIRTY_TIME))
> -		return;
> -
>  	if (xfs_trans_alloc(mp, &M_RES(mp)->tr_fsyncts, 0, 0, 0, &tp))
>  		return;
>  	xfs_ilock(ip, XFS_ILOCK_EXCL);
> @@ -1053,7 +1047,7 @@ xfs_fs_free_cached_objects(
>  static const struct super_operations xfs_super_operations = {
>  	.alloc_inode = xfs_fs_alloc_inode,
>  	.destroy_inode = xfs_fs_destroy_inode,
> -	.dirty_inode = xfs_fs_dirty_inode,
> +	.lazytime_expired = xfs_fs_lazytime_expired,
>  	.drop_inode = xfs_fs_drop_inode,
>  	.put_super = xfs_fs_put_super,
>  	.sync_fs = xfs_fs_sync_fs,

This means XFS no longer updates/logs the current timestamp (because
->dirty_inode(I_DIRTY_SYNC) is no longer called for XFS) before
->fsync flushes the inode data and metadata changes to the journal.
Hence the current in-memory timestamps are not present in the log
before the fsync is run, and so we violate the fsync guarantees
lazytime gives for timestamp updates....

I haven't quite got it straight in my head if the iput() case has
similar problems, but the fsync case definitely looks broken.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
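
To make the fsync concern concrete, here is a minimal userspace
sketch of the ordering problem. It is not kernel code: every toy_*
name is hypothetical, and it only assumes what is described above,
namely that vfs_fsync_range() calls mark_inode_dirty_sync() on
I_DIRTY_TIME inodes and that XFS logged the timestamps from its
->dirty_inode hook before the patch.

```c
/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_I_DIRTY_SYNC (1u << 0)
#define TOY_I_DIRTY_TIME (1u << 1)

struct toy_inode;

struct toy_super_operations {
	void (*dirty_inode)(struct toy_inode *inode, unsigned int flags);
	void (*lazytime_expired)(struct toy_inode *inode);
};

struct toy_inode {
	unsigned int i_state;
	long mem_mtime;		/* timestamp as updated in memory */
	long logged_mtime;	/* timestamp last copied into the "journal" */
	const struct toy_super_operations *s_op;
};

/* Stand-in for the transaction that logs the inode timestamps. */
static void toy_log_timestamps(struct toy_inode *inode)
{
	inode->logged_mtime = inode->mem_mtime;
}

/* Models the old hook: only acts on I_DIRTY_SYNC for I_DIRTY_TIME inodes. */
static void toy_dirty_inode(struct toy_inode *inode, unsigned int flags)
{
	if (flags != TOY_I_DIRTY_SYNC || !(inode->i_state & TOY_I_DIRTY_TIME))
		return;
	toy_log_timestamps(inode);
}

/* Models the new hook: same body, but reached through a different method. */
static void toy_lazytime_expired(struct toy_inode *inode)
{
	toy_log_timestamps(inode);
}

/* Models mark_inode_dirty_sync(): notifies the fs via ->dirty_inode only. */
static void toy_mark_inode_dirty_sync(struct toy_inode *inode)
{
	assert(inode != NULL && inode->s_op != NULL);
	if (inode->s_op->dirty_inode)
		inode->s_op->dirty_inode(inode, TOY_I_DIRTY_SYNC);
	inode->i_state |= TOY_I_DIRTY_SYNC;
}

/*
 * Models a non-datasync fsync: dirty the inode if it carries lazy
 * timestamps, then report whether the journal holds the current timestamp.
 */
static bool toy_fsync_persists_timestamp(struct toy_inode *inode)
{
	if (inode->i_state & TOY_I_DIRTY_TIME)
		toy_mark_inode_dirty_sync(inode);
	return inode->logged_mtime == inode->mem_mtime;
}

/* Before the patch: the fs implements ->dirty_inode. */
static const struct toy_super_operations toy_ops_before = {
	.dirty_inode = toy_dirty_inode,
};

/* After the patch: the fs implements only ->lazytime_expired. */
static const struct toy_super_operations toy_ops_after = {
	.lazytime_expired = toy_lazytime_expired,
};

/* Run one fsync on an inode whose in-memory timestamp is ahead of the log. */
static bool toy_run(const struct toy_super_operations *ops)
{
	struct toy_inode inode = {
		.i_state = TOY_I_DIRTY_TIME,
		.mem_mtime = 200,
		.logged_mtime = 100,
		.s_op = ops,
	};

	return toy_fsync_persists_timestamp(&inode);
}
```

In this model toy_run(&toy_ops_before) reports that the timestamp
reached the log, while toy_run(&toy_ops_after) does not, because
nothing on the fsync path ever calls ->lazytime_expired.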