On Mon, Sep 12, 2022 at 07:42:16AM -0400, Jeff Layton wrote:
> A scheme like that could work. It might be hard to do it without a
> spinlock or something, but maybe that's ok. Thinking more about how we'd
> implement this in the underlying filesystems:
>
> To do this we'd need 2 64-bit fields in the on-disk and in-memory
> superblocks for ext4, xfs and btrfs. On the first mount after a crash,
> the filesystem would need to bump s_version_max by the significant
> increment (2^40 bits or whatever). On a "clean" mount, it wouldn't need
> to do that.
>
> Would there be a way to ensure that the new s_version_max value has made
> it to disk? Bumping it by a large value and hoping for the best might be
> ok for most cases, but there are always outliers, so it might be
> worthwhile to make an i_version increment wait on that if necessary.

I was imagining that when you recognize you're getting close, you kick
off something which writes s_version_max+2^40 to disk, and then updates
s_version_max to that new value on success of the write.

The code that increments i_version checks to make sure it wouldn't
exceed s_version_max.  If it would, something has gone wrong--a write
has failed or taken a long time--so it waits or errors out or something,
depending on desired filesystem behavior in that case.

No locking required in the normal case?

--b.
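
To make the idea above concrete, here is a rough userspace sketch using
C11 atomics. It is not kernel code and not a proposed patch. The names
struct sb_sketch, write_version_max(), maybe_bump_ceiling(),
inc_iversion() and the VERSION_MARGIN threshold are all made up for
illustration; only s_version_max, i_version and the 2^40 increment come
from the thread. The ceiling write is done synchronously here to keep
the example short, whereas a real filesystem would presumably kick it
off asynchronously, and the sketch errors out rather than waiting when
the ceiling is reached.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define VERSION_BUMP   (UINT64_C(1) << 40) /* increment suggested in the thread */
#define VERSION_MARGIN (UINT64_C(1) << 20) /* hypothetical "getting close" threshold */

/* Hypothetical stand-in for a superblock; not a real kernel structure. */
struct sb_sketch {
	_Atomic uint64_t s_version_max; /* ceiling known to have reached disk */
	atomic_bool bump_in_flight;     /* allow only one ceiling write at a time */
	bool disk_ok;                   /* simulates whether the write succeeds */
	uint64_t on_disk_max;           /* what the simulated disk holds */
};

/* Hypothetical helper: persist a new ceiling. Returns 0 on success. */
static int write_version_max(struct sb_sketch *sb, uint64_t new_max)
{
	if (!sb->disk_ok)
		return -EIO;
	sb->on_disk_max = new_max;
	return 0;
}

/* Write s_version_max + 2^40 and publish it only if the write succeeded. */
static void maybe_bump_ceiling(struct sb_sketch *sb)
{
	bool expected = false;

	if (!atomic_compare_exchange_strong(&sb->bump_in_flight, &expected, true))
		return; /* another caller is already raising the ceiling */

	uint64_t old_max = atomic_load(&sb->s_version_max);
	uint64_t new_max = old_max + VERSION_BUMP;

	assert(new_max > old_max); /* the sketch ignores 64-bit wraparound */
	if (write_version_max(sb, new_max) == 0)
		atomic_store(&sb->s_version_max, new_max);
	atomic_store(&sb->bump_in_flight, false);
}

/*
 * Increment i_version without letting it exceed s_version_max.
 * Fast path: one atomic load plus one compare-and-swap, no lock.
 * Returns 0 on success, -EAGAIN if the ceiling could not be raised.
 */
static int inc_iversion(struct sb_sketch *sb, _Atomic uint64_t *i_version)
{
	uint64_t cur = atomic_load(i_version);

	for (;;) {
		uint64_t max = atomic_load(&sb->s_version_max);

		/* Getting close to (or already at) the ceiling: try to raise it. */
		if (cur >= max || max - cur <= VERSION_MARGIN) {
			maybe_bump_ceiling(sb);
			max = atomic_load(&sb->s_version_max);
		}
		if (cur >= max)
			return -EAGAIN; /* a write failed or is slow: wait or error out */

		/* On failure, cur is reloaded with the current value and we retry. */
		if (atomic_compare_exchange_weak(i_version, &cur, cur + 1))
			return 0;
	}
}
```

In this sketch the only time anything beyond the load and the
compare-and-swap happens is when an inode's counter comes within
VERSION_MARGIN of the ceiling, which is the "no locking required in the
normal case" property asked about above. Whether -EAGAIN, a sleep, or
something else is the right response at the ceiling is left open, as in
the original message.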