Re: [PATCH] iversion: update comments with info about atime updates

On Wed, 2022-08-24 at 08:24 +1000, NeilBrown wrote:
> On Tue, 23 Aug 2022, Jeff Layton wrote:
> > On Tue, 2022-08-23 at 21:38 +1000, NeilBrown wrote:
> > > On Tue, 23 Aug 2022, Jeff Layton wrote:
> > > > So, we can refer to that and simply say:
> > > > 
> > > > "If the function updates the mtime or ctime on the inode, then the
> > > > i_version should be incremented. If only the atime is being updated,
> > > > then the i_version should not be incremented. The exception to this rule
> > > > is explicit atime updates via utimes() or similar mechanism, which
> > > > should result in the i_version being incremented."
> > > 
> > > Is that exception needed? utimes() updates ctime.
> > > 
> > > https://man7.org/linux/man-pages/man2/utimes.2.html
> > > 
> > > doesn't say that, but
> > > 
> > > https://pubs.opengroup.org/onlinepubs/007904875/functions/utimes.html
> > > 
> > > does, as does the code.
> > > 
> > 
> > Oh, good point! I think we can leave that out. Even better!
> 
> Further, implicit mtime updates (file_update_time()) also update ctime.
> So all you need is
>  If the function updates the ctime, then i_version should be
>  incremented.
> 
> and I have to ask - why not just use the ctime? Why have another number
> that is parallel?
> 
> Timestamps are updated at HZ (ktime_get_coarse) which is at most every
> millisecond.
> xfs stores nanosecond resolution, so about 20 bits are currently wasted.
> We could put a counter like i_version in there that only increments
> after it is viewed, then we can get all the precision we need but with
> exactly ctime semantics.
> 
> The 64-bit change-id could comprise
>  35 bits of seconds (nearly a millennium)
>  16 bits of sub-seconds (just in case a higher precision time was wanted
>  one day)
>  13 bits of counter. - 8192 changes per tick

We'd need a "seen" flag too, so maybe only 4096 changes per tick...

> 
> The value exposed in i_ctime would hide the counter and just show the
> timestamp portion of what the filesystem stores. This would ensure we
> never get changes on different files that happen in one order leaving
> timestamps with the reversed order (the timestamps could be the same,
> but that is expected).
> 
> This scheme could be made to handle a sustained update rate of 1
> increment every 8 nanoseconds (if the counter were allowed to overflow
> into unused bits of the sub-second field). This is one every 24 CPU
> cycles. Incrementing a counter and making it visible to all CPUs can
> probably be done in 24 cycles. Accessing it and setting the "seen" flag
> as well might just fit with faster memory. Getting any other useful
> work done while maintaining that rate on a single file seems unlikely.

This is an interesting idea.

So, for NFSv4 you'd just mask off the counter bits (and "seen" flag) to
get the ctime, and for the change attribute we'd just mask off the
"seen" flag and put it all in there.

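To make the masking concrete, here is a rough userspace C sketch of how
that could look. Everything in it is an assumption for illustration: the
field order (seen flag in bit 0, then 12 counter bits, 16 sub-second bits,
35 bits of seconds), the CHG_* macro names and the chg_*() helpers are
hypothetical and do not exist in the kernel. It is also non-atomic and
ignores counter overflow, both of which a real implementation would have to
deal with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout, low bit first:
 * 1 "seen" flag + 12 counter bits + 16 sub-second bits + 35 bits of seconds. */
#define CHG_SEEN       UINT64_C(1)
#define CHG_CNT_SHIFT  1
#define CHG_CNT_BITS   12
#define CHG_SUB_SHIFT  (CHG_CNT_SHIFT + CHG_CNT_BITS)  /* 13 */
#define CHG_SUB_BITS   16
#define CHG_SEC_SHIFT  (CHG_SUB_SHIFT + CHG_SUB_BITS)  /* 29 */
#define CHG_SEC_BITS   35
#define CHG_CNT_MASK   (((UINT64_C(1) << CHG_CNT_BITS) - 1) << CHG_CNT_SHIFT)

/* Build a stored value with the seen flag clear. */
static uint64_t chg_pack(uint64_t sec, uint64_t subsec, uint64_t cnt)
{
    assert(sec < (UINT64_C(1) << CHG_SEC_BITS));
    assert(subsec < (UINT64_C(1) << CHG_SUB_BITS));
    assert(cnt < (UINT64_C(1) << CHG_CNT_BITS));
    return (sec << CHG_SEC_SHIFT) | (subsec << CHG_SUB_SHIFT) |
           (cnt << CHG_CNT_SHIFT);
}

/* ctime view: mask off the counter bits and the seen flag. */
static uint64_t chg_ctime_bits(uint64_t v)
{
    return v & ~(CHG_CNT_MASK | CHG_SEEN);
}

/* Change attribute view: mask off only the seen flag. */
static uint64_t chg_change_attr(uint64_t v)
{
    return v & ~CHG_SEEN;
}

/* A reader samples the change attribute and marks the value as seen. */
static uint64_t chg_query(uint64_t *v)
{
    *v |= CHG_SEEN;
    return chg_change_attr(*v);
}

/* Record a change stamped with the coarse time (sec, subsec). */
static void chg_update(uint64_t *v, uint64_t sec, uint64_t subsec)
{
    uint64_t now = chg_pack(sec, subsec, 0);

    if (now > chg_ctime_bits(*v)) {
        *v = now;               /* new tick: counter and seen flag reset */
        return;
    }
    /* Same tick: bump the counter only if someone has looked at it.
     * Overflow is not handled; a carry would spill into the sub-second field. */
    if (*v & CHG_SEEN)
        *v = (*v & ~CHG_SEEN) + (UINT64_C(1) << CHG_CNT_SHIFT);
}
```

With this, two changes in the same tick leave chg_ctime_bits() unchanged,
and chg_change_attr() only differs if chg_query() was called in between.
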
Implementing that for all filesystems would be a huge project though.
If we were implementing the i_version counter from scratch, I'd
probably do something along these lines. Given that we already have
an existing i_version counter, would there be any real benefit to
pursuing this avenue instead?
-- 
Jeff Layton <jlayton@xxxxxxxxxx>



