On Tue, Aug 16, 2022 at 9:19 AM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
> The i_version in xfs_trans_log_inode is bumped for any inode update,
> including atime-only updates due to reads. We don't want to record those
> in the i_version, as they don't represent "real" changes. Remove that
> callsite.
>
> In xfs_vn_update_time, if S_VERSION is flagged, then attempt to bump the
> i_version and turn on XFS_ILOG_CORE if it happens. In
> xfs_trans_ichgtime, update the i_version if the mtime or ctime are being
> updated.
>
> Cc: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> Cc: Dave Chinner <david@xxxxxxxxxxxxx>
> Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> ---
>  fs/xfs/libxfs/xfs_trans_inode.c | 17 +++--------------
>  fs/xfs/xfs_iops.c               |  4 ++++
>  2 files changed, 7 insertions(+), 14 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_trans_inode.c b/fs/xfs/libxfs/xfs_trans_inode.c
> index 8b5547073379..78bf7f491462 100644
> --- a/fs/xfs/libxfs/xfs_trans_inode.c
> +++ b/fs/xfs/libxfs/xfs_trans_inode.c
> @@ -71,6 +71,8 @@ xfs_trans_ichgtime(
>  		inode->i_ctime = tv;
>  	if (flags & XFS_ICHGTIME_CREATE)
>  		ip->i_crtime = tv;
> +	if (flags & (XFS_ICHGTIME_MOD|XFS_ICHGTIME_CHG))
> +		inode_inc_iversion(inode);
>  }
>
>  /*
> @@ -116,20 +118,7 @@ xfs_trans_log_inode(
>  		spin_unlock(&inode->i_lock);
>  	}
>
> -	/*
> -	 * First time we log the inode in a transaction, bump the inode change
> -	 * counter if it is configured for this to occur. While we have the
> -	 * inode locked exclusively for metadata modification, we can usually
> -	 * avoid setting XFS_ILOG_CORE if no one has queried the value since
> -	 * the last time it was incremented. If we have XFS_ILOG_CORE already
> -	 * set however, then go ahead and bump the i_version counter
> -	 * unconditionally.
> -	 */
> -	if (!test_and_set_bit(XFS_LI_DIRTY, &iip->ili_item.li_flags)) {
> -		if (IS_I_VERSION(inode) &&
> -		    inode_maybe_inc_iversion(inode, flags & XFS_ILOG_CORE))
> -			iversion_flags = XFS_ILOG_CORE;
> -	}
> +	set_bit(XFS_LI_DIRTY, &iip->ili_item.li_flags);
>
>  	/*
>  	 * If we're updating the inode core or the timestamps and it's possible
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 45518b8c613c..162e044c7f56 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -718,6 +718,7 @@ xfs_setattr_nonsize(
>  	}
>
>  	setattr_copy(mnt_userns, inode, iattr);
> +	inode_inc_iversion(inode);
>  	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>
>  	XFS_STATS_INC(mp, xs_ig_attrchg);
> @@ -943,6 +944,7 @@ xfs_setattr_size(
>
>  	ASSERT(!(iattr->ia_valid & (ATTR_UID | ATTR_GID)));
>  	setattr_copy(mnt_userns, inode, iattr);
> +	inode_inc_iversion(inode);
>  	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>
>  	XFS_STATS_INC(mp, xs_ig_attrchg);
> @@ -1047,6 +1049,8 @@ xfs_vn_update_time(
>  		inode->i_mtime = *now;
>  	if (flags & S_ATIME)
>  		inode->i_atime = *now;
> +	if ((flags & S_VERSION) && inode_maybe_inc_iversion(inode, false))
> +		log_flags |= XFS_ILOG_CORE;
>
>  	xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
>  	xfs_trans_log_inode(tp, ip, log_flags);
> --
> 2.37.2
>

I have a test (details below) that shows an open issue with NFSv4.x +
fscache, where an xfs-exported filesystem would trigger unnecessary
over-the-wire READs after a umount/mount cycle of the NFS mount. I
previously tracked this down to atime updates, but never followed
through with a patch. Now that Jeff has worked it out and this patch is
under review, I built 5.19 vanilla and retested, then built 5.19 + this
patch and verified the problem is fixed.
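In case it's useful to anyone else, here's roughly how the counter can
be watched directly on the server side. This is just a sketch, not part
of the test script below: the loop device and file path come from the
test setup, and the xfs_db field name (v3.change_count, the on-disk
di_changecount on a v5 filesystem) is from memory, so double-check it
on your tree:

    # Find the inode number of the test file on the export
    ino=$(stat -c %i /export/dir1/file1.bin)

    # Dump the change counter from a quiesced image (unmount first,
    # or freeze the fs, so xfs_db sees the latest on-disk state)
    umount /export/dir1
    xfs_db -r -c "inode $ino" -c "print v3.change_count" /dev/loop0
    mount /dev/loop0 /export/dir1

    # Trigger an atime-only update
    cat /export/dir1/file1.bin > /dev/null

    umount /export/dir1
    xfs_db -r -c "inode $ino" -c "print v3.change_count" /dev/loop0

On an unpatched kernel the counter bumps on the atime-only read; with
this patch it should stay the same.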
You can add:
Tested-by: Dave Wysochanski <dwysocha@xxxxxxxxxx>

# ./t0_bz1913591.sh 4.1 xfs relatime
Setting NFS vers=4.1 filesystem to xfs and mount options relatime,rw
0. On NFS server, setup export with xfs filesystem on loop device
/dev/loop0 /export/dir1 xfs rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
1. On NFS client, install and enable cachefilesd
2. On NFS client, mount -o vers=4.1,fsc 127.0.0.1:/export/dir1 /mnt
3. On NFS client, dd if=/dev/zero of=/mnt/file1.bin bs=4096 count=1
4. On NFS client, echo 3 > /proc/sys/vm/drop_caches
5. On NFS client, dd if=/mnt/file1.bin of=/dev/null (read into fscache)
6. On NFS client, umount /mnt
7. On NFS client, mount -o vers=4.1,fsc 127.0.0.1:/export/dir1 /mnt
8. On NFS client, repeat steps 4-5 (read from fscache)
9. On NFS client, check for READ ops (1st number) > 0 in /proc/self/mountstats
Found 4200 NFS READ and READ_PLUS ops in /proc/self/mountstats > 0
READ: 1 1 0 220 4200 0 0 1 0
READ_PLUS: 0 0 0 0 0 0 0 0 0
FAILED TEST ./t0_bz1913591.sh on kernel 5.19.0 with NFS vers=4.1 exported filesystem xfs options relatime,rw

# ./t0_bz1913591.sh 4.1 xfs relatime
Setting NFS vers=4.1 filesystem to xfs and mount options relatime,rw
0. On NFS server, setup export with xfs filesystem on loop device
/dev/loop0 /export/dir1 xfs rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
1. On NFS client, install and enable cachefilesd
2. On NFS client, mount -o vers=4.1,fsc 127.0.0.1:/export/dir1 /mnt
3. On NFS client, dd if=/dev/zero of=/mnt/file1.bin bs=4096 count=1
4. On NFS client, echo 3 > /proc/sys/vm/drop_caches
5. On NFS client, dd if=/mnt/file1.bin of=/dev/null (read into fscache)
6. On NFS client, umount /mnt
7. On NFS client, mount -o vers=4.1,fsc 127.0.0.1:/export/dir1 /mnt
8. On NFS client, repeat steps 4-5 (read from fscache)
9. On NFS client, check for READ ops (1st number) > 0 in /proc/self/mountstats
10. On NFS client, check /proc/fs/fscache/stats fscache reads incrementing
PASSED TEST ./t0_bz1913591.sh on kernel 5.19.0i_version+ with NFS vers=4.1 exported filesystem xfs options relatime,rw
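For reference, the client side of the test boils down to something like
the following. This is a minimal sketch of steps 2-9 above, assuming
cachefilesd is already running and the server export exists; the real
t0_bz1913591.sh also handles setup and pass/fail reporting:

    #!/bin/sh
    # Populate fscache, remount, and check whether the second read
    # goes over the wire (it should not with a stable i_version).

    mount -o vers=4.1,fsc 127.0.0.1:/export/dir1 /mnt
    dd if=/dev/zero of=/mnt/file1.bin bs=4096 count=1

    echo 3 > /proc/sys/vm/drop_caches
    dd if=/mnt/file1.bin of=/dev/null   # read once into fscache

    umount /mnt
    mount -o vers=4.1,fsc 127.0.0.1:/export/dir1 /mnt

    echo 3 > /proc/sys/vm/drop_caches
    dd if=/mnt/file1.bin of=/dev/null   # should be served from fscache

    # Nonzero READ/READ_PLUS op counts here mean the cached data was
    # invalidated and re-fetched over the wire.
    grep -E 'READ:|READ_PLUS:' /proc/self/mountstats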