Re: replacement i_version counter for xfs

On Tue, Jan 24, 2023 at 07:56:09AM -0500, Jeff Layton wrote:
> A few months ago, I posted a patch to make xfs not bump its i_version
> counter on atime updates. Dave Chinner NAK'ed that patch, mentioning
> that xfs would need to replace it with an entirely new field as the
> existing counter is used for other purposes and its semantics are set in
> stone.
> 
> Has anything been done toward that end?

No, because an official specification of the behaviour the nfsd
subsystem requires has not been merged into the kernel yet.

> Should I file a bug report or something?

There's nothing we can really do until the new specification is set
in stone. Filing a bug report won't change anything material.

As it is, I'm guessing that you desire the behaviour to be as you
described in the iversion patchset you just posted. That is
effectively:

  * The change attribute (i_version) is mandated by NFSv4 and is mostly for
  * knfsd, but is also used for other purposes (e.g. IMA). The i_version must
- * appear different to observers if there was a change to the inode's data or
- * metadata since it was last queried.
+ * appear larger to observers if there was an explicit change to the inode's
+ * data or metadata since it was last queried.

i.e. the definition is changing from *any* metadata or data change
to *explicit* metadata/data changes, right? i.e. it should only
change when ctime changes?

IIUC the rest of the justification for i_version is that ctime might
lack the timestamp granularity to disambiguate sub-timestamp
granularity changes, so i_version is needed to bridge that gap.

Given that XFS has nanosecond timestamp resolution in the on-disk
format, both i_version and ctime changes are journalled, and
ctime/i_version will always change at exactly the same time in the
same transactions, there are no inherent sub-timestamp granularity
problems with ctime within XFS. Any deficiency in ctime resolution
comes solely from the granularity of the VFS inode timestamp
functions.

And so if current_time() were to provide fine-grained nanosecond
timestamp resolution for exported XFS filesystems (i.e. use
ktime_get_real_ts64() conditionally), then it seems to me that the
nfsd i_version function becomes completely redundant.
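
To make the "conditionally" part concrete, here is a rough userspace
analogue rather than kernel code. It assumes Linux, where
CLOCK_REALTIME_COARSE stands in for the coarse tick-based timestamp
and CLOCK_REALTIME stands in for ktime_get_real_ts64(). The helper
name sketch_current_time() and the "exported" flag are hypothetical,
made up purely for illustration:

```c
#define _GNU_SOURCE
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper, userspace only: choose the timestamp source
 * based on whether the filesystem is exported.
 *
 * exported == true  -> fine-grained clock, so back-to-back changes
 *                      can be told apart by ctime alone
 * exported == false -> cheap coarse clock, since nobody needs
 *                      sub-tick disambiguation
 */
static struct timespec sketch_current_time(bool exported)
{
	struct timespec ts;

	clock_gettime(exported ? CLOCK_REALTIME : CLOCK_REALTIME_COARSE, &ts);
	return ts;
}
```

The only point of the sketch is that the choice of clock is a
per-call decision, so the fine-grained cost is paid only where it is
needed.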

i.e. we are pretty much guaranteed that ctime on exported
filesystems will always be different for explicit modifications to
the same inode, and hence we can just use ctime as the version
change identifier without needing any on-disk format changes at all.
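
As a sketch of what "use ctime as the version change identifier"
could look like, the following folds a timestamp into a single 64-bit
value. This is an illustrative userspace example, not a statement of
how nfsd does or should do it; the function name
ctime_to_change_attr() and the 30-bit packing are my assumptions:

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: fold a ctime into a 64-bit change identifier.
 *
 * tv_nsec is always below 10^9, which is below 2^30, so it fits in
 * the low 30 bits. The result therefore compares the same way the
 * timestamps do: a later ctime gives a larger identifier.
 */
static uint64_t ctime_to_change_attr(struct timespec ctime)
{
	assert(ctime.tv_nsec >= 0 && ctime.tv_nsec < 1000000000L);
	return ((uint64_t)ctime.tv_sec << 30) + (uint64_t)ctime.tv_nsec;
}
```

With a packing like this, two modifications get distinct, ordered
identifiers exactly when their ctimes differ, which is why the
timestamp granularity is the part that matters.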

And we can optimise away that overhead when the filesystem is not
exported by just using the coarse timestamps because there is no
need for sub-timer-tick disambiguation of single file
modifications....

Hence it appears to me that, with the new i_version specification,
there's an avenue out of this problem entirely: "nfsd needs to use
ctime, not i_version". This solution seems generic enough that
filesystems with existing on-disk nanosecond timestamp granularity
would no longer need explicit on-disk support for the nfsd i_version
functionality, yes?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx


