On Tue, 2022-09-27 at 08:43 +1000, NeilBrown wrote:
> On Fri, 23 Sep 2022, Jeff Layton wrote:
> >
> > Absolutely. That is the downside of this approach, but the priority here
> > has always been to improve nfsd. If we don't get the ability to present
> > this info via statx, then so be it. Later on, I suppose we can move that
> > handling into the kernel in some fashion if we decide it's worthwhile.
> >
> > That said, not having this in statx makes it more difficult to test
> > i_version behavior. Maybe we can add a generic ioctl for that in the
> > interim?
>
> I wonder if we are over-thinking this, trying too hard, making "perfect"
> the enemy of "good". I tend to think we are.
> While we agree that the current implementation of i_version is
> imperfect, it isn't causing major data corruption all around the world.
> I don't think there are even any known bug reports are there?
> So while we do want to fix it as best we can, we don't need to make that
> the first priority.
>

I'm not aware of any bug reports aside from the issue of atime updates
affecting the change attribute, but the effects of misbehavior here can
be very subtle.

> I think the first priority should be to document how we want it to work,
> which is what this thread is really all about. The documentation can
> note that some (all) filesystems do not provide perfect semantics across
> unclean restarts, and can list any other anomalies that we are aware of.
> And on that basis we can export the current i_version to user-space via
> statx and start trying to write some test code.
>
> We can then look at moving the i_version/ctime update from *before* the
> write to *after* the write, and any other improvements that can be
> achieved easily in common code. We can then update the man page to say
> "since Linux 6.42, this list of anomalies is no longer present".
>

I have a patch for this for ext4, and started looking at the same for
btrfs and xfs.

> Then we can explore some options for handling unclean restart - in a
> context where we can write tests and maybe even demonstrate a concrete
> problem before we start trying to fix it.
>

I think too that we need to recognize that there are multiple distinct
issues around i_version handling:

1/ atime updates affecting i_version in ext4 and xfs, which harms
performance

2/ ext4 should enable the change attribute by default

3/ we currently mix the ctime into the change attr for directories,
which is unnecessary.

4/ we'd like to be able to report NFS4_CHANGE_TYPE_IS_MONOTONIC_INCR
from nfsd, but the change attr on regular files can appear to go
backward after a crash+clock jump.

5/ testing i_version behavior is very difficult since there is no way to
query it from userland.

We can work on the first three without having to solve the last two
right away.

--
Jeff Layton <jlayton@xxxxxxxxxx>