On Thu, 2023-04-27 at 11:51 +0200, Christian Brauner wrote: > On Wed, Apr 26, 2023 at 05:48:38AM -0400, Jeff Layton wrote: > > On Wed, 2023-04-26 at 09:07 +0200, Christian Brauner wrote: > > > On Mon, Apr 24, 2023 at 11:11:02AM -0400, Jeff Layton wrote: > > > > The VFS always uses coarse-grained timestamp updates for filling out the > > > > ctime and mtime after a change. This has the benefit of allowing > > > > filesystems to optimize away a lot metaupdates, to around once per > > > > jiffy, even when a file is under heavy writes. > > > > > > > > Unfortunately, this has always been an issue when we're exporting via > > > > NFSv3, which relies on timestamps to validate caches. Even with NFSv4, a > > > > lot of exported filesystems don't properly support a change attribute > > > > and are subject to the same problems with timestamp granularity. Other > > > > applications have similar issues (e.g backup applications). > > > > > > > > Switching to always using fine-grained timestamps would improve the > > > > situation for NFS, but that becomes rather expensive, as the underlying > > > > filesystem will have to log a lot more metadata updates. > > > > > > > > What we need is a way to only use fine-grained timestamps when they are > > > > being actively queried: > > > > > > > > Whenever the mtime changes, the ctime must also change since we're > > > > changing the metadata. When a superblock has a s_time_gran >1, we can > > > > use the lowest-order bit of the inode->i_ctime as a flag to indicate > > > > that the value has been queried. Then on the next write, we'll fetch a > > > > fine-grained timestamp instead of the usual coarse-grained one. > > > > > > > > We could enable this for any filesystem that has a s_time_gran >1, but > > > > for now, this patch adds a new SB_MULTIGRAIN_TS flag to allow filesystems > > > > to opt-in to this behavior. > > > > > > Hm, the patch raises the flag in s_flags. Please at least move this to > > > s_iflags as SB_I_MULTIGRAIN and treat this as an internal flag. There's > > > no need to give the impression that this will become a mount option. > > > > > > Also, this looks like it's a filesystem property not a superblock > > > property as the granularity isn't changeable. So shouldn't this be an > > > FS_* flag instead? > > > > It could be a per-sb thing if there was some filesystem that wanted to > > do that, but I'm hoping that most will not want to do that. > > Yeah, I'd really hope this isn't an sb thing. > > > > > My initial patches for this actually did use a FS_* flag, but I figured > > Oh, I might've just missed that. > Sorry, I didn't actually post that set. But I did go with a FS_* flag before I made it a SB_* flag. > > that was one more pointer to chase when you wanted to check the flag. > > Hm, unless you have reasons to think that it would be noticable in terms > of perf I'd rather do the correct thing and have it be an FS_* flag. Sure. I'll make the switch before the next posting. Thanks for the review! -- Jeff Layton <jlayton@xxxxxxxxxx>