On Tue, Aug 16, 2022 at 01:14:55PM -0400, David Wysochanski wrote: > On Tue, Aug 16, 2022 at 9:19 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > > > The i_version in xfs_trans_log_inode is bumped for any inode update, > > including atime-only updates due to reads. We don't want to record those > > in the i_version, as they don't represent "real" changes. Remove that > > callsite. > > > > In xfs_vn_update_time, if S_VERSION is flagged, then attempt to bump the > > i_version and turn on XFS_ILOG_CORE if it happens. In > > xfs_trans_ichgtime, update the i_version if the mtime or ctime are being > > updated. > > > > Cc: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > Cc: Dave Chinner <david@xxxxxxxxxxxxx> > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx> > > --- > > fs/xfs/libxfs/xfs_trans_inode.c | 17 +++-------------- > > fs/xfs/xfs_iops.c | 4 ++++ > > 2 files changed, 7 insertions(+), 14 deletions(-) > > > > diff --git a/fs/xfs/libxfs/xfs_trans_inode.c b/fs/xfs/libxfs/xfs_trans_inode.c > > index 8b5547073379..78bf7f491462 100644 > > --- a/fs/xfs/libxfs/xfs_trans_inode.c > > +++ b/fs/xfs/libxfs/xfs_trans_inode.c > > @@ -71,6 +71,8 @@ xfs_trans_ichgtime( > > inode->i_ctime = tv; > > if (flags & XFS_ICHGTIME_CREATE) > > ip->i_crtime = tv; > > + if (flags & (XFS_ICHGTIME_MOD|XFS_ICHGTIME_CHG)) > > + inode_inc_iversion(inode); > > } > > > > /* > > @@ -116,20 +118,7 @@ xfs_trans_log_inode( > > spin_unlock(&inode->i_lock); > > } > > > > - /* > > - * First time we log the inode in a transaction, bump the inode change > > - * counter if it is configured for this to occur. While we have the > > - * inode locked exclusively for metadata modification, we can usually > > - * avoid setting XFS_ILOG_CORE if no one has queried the value since > > - * the last time it was incremented. If we have XFS_ILOG_CORE already > > - * set however, then go ahead and bump the i_version counter > > - * unconditionally. > > - */ > > - if (!test_and_set_bit(XFS_LI_DIRTY, &iip->ili_item.li_flags)) { > > - if (IS_I_VERSION(inode) && > > - inode_maybe_inc_iversion(inode, flags & XFS_ILOG_CORE)) > > - iversion_flags = XFS_ILOG_CORE; > > - } > > + set_bit(XFS_LI_DIRTY, &iip->ili_item.li_flags); > > > > /* > > * If we're updating the inode core or the timestamps and it's possible > > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c > > index 45518b8c613c..162e044c7f56 100644 > > --- a/fs/xfs/xfs_iops.c > > +++ b/fs/xfs/xfs_iops.c > > @@ -718,6 +718,7 @@ xfs_setattr_nonsize( > > } > > > > setattr_copy(mnt_userns, inode, iattr); > > + inode_inc_iversion(inode); > > xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); > > > > XFS_STATS_INC(mp, xs_ig_attrchg); > > @@ -943,6 +944,7 @@ xfs_setattr_size( > > > > ASSERT(!(iattr->ia_valid & (ATTR_UID | ATTR_GID))); > > setattr_copy(mnt_userns, inode, iattr); > > + inode_inc_iversion(inode); > > xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); > > > > XFS_STATS_INC(mp, xs_ig_attrchg); > > @@ -1047,6 +1049,8 @@ xfs_vn_update_time( > > inode->i_mtime = *now; > > if (flags & S_ATIME) > > inode->i_atime = *now; > > + if ((flags & S_VERSION) && inode_maybe_inc_iversion(inode, false)) > > + log_flags |= XFS_ILOG_CORE; > > > > xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL); > > xfs_trans_log_inode(tp, ip, log_flags); > > -- > > 2.37.2 > > > > I have a test (details below) that shows an open issue with NFSv4.x + > fscache where an xfs exported filesystem would trigger unnecessary > over the wire READs after a umount/mount cycle of the NFS mount. I > previously tracked this down to atime updates, but never followed > through on any patch. Now that Jeff worked it out and this patch is > under review, I built 5.19 vanilla, retested, then built 5.19 + this > patch and verified the problem is fixed. And so the question that needs to be answered is "why isn't relatime working for this workload to avoid unnecessary atime updates"? Which then makes me ask "what's changing atime on the server between client side reads"? Which then makes me wonder "what's actually changing iversion on the server?" because I don't think atime is the issue here. I suspect that Jeff's patch is affecting this test case by removing iversion updates when the data is written back on the server. i.e. delayed allocation and unwritten extent conversion will no longer bump iversion when they log the inode metadata changes associated with extent allocation to store the data being written. There may be other places where Jeff's patch removes implicit iversion updates, too, so it may not be writeback that is the issue here. How that impacts on the observed behaviour is dependent on things I don't know, like what cachefiles is doing in the background, especially across NFS client unmount/mount cycles. However, this all makes me think the "atime is updated" behaviour is an observed symptom of something else changing iversion and/or cmtime between reads from the server... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx