On Tue, 2023-10-31 at 13:22 +0100, Jan Kara wrote:
> On Tue 31-10-23 07:04:53, Jeff Layton wrote:
> > On Tue, 2023-10-31 at 09:37 +1100, Dave Chinner wrote:
> > > I have suggested mechanisms for using masked off bits of timestamps
> > > to encode sub-timestamp granularity change counts and keep them
> > > invisible to userspace and then not using i_version at all for XFS.
> > > This avoids all the problems that the multi-grain timestamp
> > > infrastructure exposed due to variable granularity of user visible
> > > timestamps and ordering across inodes with different granularity.
> > > This is potentially a general solution, too.
> >
> > I don't really understand this at all, but trying to do anything with
> > fine-grained timestamps will just run into a lot of the same problems we
> > hit with the multigrain work. If you still see this as a path forward,
> > maybe you can describe it more detail?
>
> Dave explained a bit more details here [1] like:
>
> Another options is for XFS to play it's own internal tricks with
> [cm]time granularity and turn off i_version. e.g. limit external
> timestamp visibility to 1us and use the remaining dozen bits of the
> ns field to hold a change counter for updates within a single coarse
> timer tick. This guarantees the timestamp changes within a coarse
> tick for the purposes of change detection, but we don't expose those
> bits to applications so applications that compare timestamps across
> inodes won't get things back to front like was happening with the
> multi-grain timestamps....
> -
>
> So as far as I understand Dave wants to effectively persist counter in low
> bits of ctime and expose ctime+counter as its change cookie. I guess that
> could work and what makes the complexity manageable compared to full
> multigrain timestamps is the fact that we have one filesystem, one on-disk
> format etc. The only slight trouble could be that if we previously handed
> out something in low bits of ctime for XFS, we need to keep handing the
> same thing out until the inode changes (i.e., no rounding until the moment
> inode changes) as the old timestamp could be stored somewhere externally
> and compared.
>
> 								Honza
>
> [1] https://lore.kernel.org/all/ZTjMRRqmlJ+fTys2@xxxxxxxxxxxxxxxxxxx/
>

Got it. That makes sense and could probably be made to work. Doing that
all in XFS won't be simple though. You'll need to reimplement stuff like
file_modified() and file_update_time(). Those get called from deep
within the VFS and from page fault handlers.

FWIW, that's the main reason the multigrain work was so invasive, even
though it was a filesystem-specific feature.
-- 
Jeff Layton <jlayton@xxxxxxxxxx>
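
For reference, the scheme quoted above (user-visible ctime limited to 1us,
with the sub-microsecond part of the ns field reused as a hidden change
counter) might look roughly like the sketch below. This is plain userspace
C, not XFS or VFS code. struct ts, visible_ts(), hidden_count(),
stamp_change() and change_cookie() are all hypothetical names made up for
illustration, and the saturating counter is an assumption, not something
proposed in the thread. Note that the sub-microsecond remainder of a
decimal ns field only has 1000 distinct values.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000u
#define NSEC_PER_SEC  1000000000u

/* Hypothetical stand-in for a stored inode ctime. */
struct ts {
	int64_t  sec;
	uint32_t nsec;	/* visible microseconds * 1000 + hidden counter */
};

/* What userspace would be shown: the timestamp truncated to 1us. */
static struct ts visible_ts(struct ts stored)
{
	stored.nsec -= stored.nsec % NSEC_PER_USEC;
	return stored;
}

/* The hidden per-tick change counter kept in the sub-us remainder. */
static uint32_t hidden_count(struct ts stored)
{
	return stored.nsec % NSEC_PER_USEC;
}

/*
 * Stamp a change. 'now' is a coarse clock reading. If its visible part
 * equals that of the old stamp, bump the hidden counter so the stored
 * value still changes. Otherwise start again from a counter of zero.
 */
static struct ts stamp_change(struct ts old, struct ts now)
{
	assert(old.nsec < NSEC_PER_SEC && now.nsec < NSEC_PER_SEC);

	struct ts o = visible_ts(old);
	struct ts n = visible_ts(now);

	if (n.sec == o.sec && n.nsec == o.nsec) {
		uint32_t c = hidden_count(old);

		/*
		 * Assumption: saturate at 999. Once saturated, further
		 * changes within the same microsecond are not detected,
		 * so a real design would need an overflow policy.
		 */
		if (c + 1 < NSEC_PER_USEC)
			c++;
		n.nsec += c;
	}
	return n;
}

/* Change cookie: the full stored value, hidden counter included. */
static uint64_t change_cookie(struct ts stored)
{
	return (uint64_t)stored.sec * NSEC_PER_SEC + stored.nsec;
}
```

In this sketch the stored value is only rewritten by stamp_change(), so
the cookie handed out stays the same until the next change, which is the
property Honza points out has to hold.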