On Mon, 2022-09-12 at 08:54 -0400, J. Bruce Fields wrote:
> On Mon, Sep 12, 2022 at 07:42:16AM -0400, Jeff Layton wrote:
> > A scheme like that could work. It might be hard to do it without a
> > spinlock or something, but maybe that's ok. Thinking more about how we'd
> > implement this in the underlying filesystems:
> >
> > To do this we'd need 2 64-bit fields in the on-disk and in-memory
> > superblocks for ext4, xfs and btrfs. On the first mount after a crash,
> > the filesystem would need to bump s_version_max by the significant
> > increment (2^40 bits or whatever). On a "clean" mount, it wouldn't need
> > to do that.
> >
> > Would there be a way to ensure that the new s_version_max value has made
> > it to disk? Bumping it by a large value and hoping for the best might be
> > ok for most cases, but there are always outliers, so it might be
> > worthwhile to make an i_version increment wait on that if necessary.
>
> I was imagining that when you recognize you're getting close, you kick
> off something which writes s_version_max+2^40 to disk, and then updates
> s_version_max to that new value on success of the write.
>

Ok, that makes sense.

> The code that increments i_version checks to make sure it wouldn't
> exceed s_version_max. If it would, something has gone wrong--a write
> has failed or taken a long time--so it waits or errors out or something,
> depending on desired filesystem behavior in that case.
>

Maybe could just throw a big scary pr_warn too? I'd have to think about
how we'd want to handle this case.

> No locking required in the normal case?

Yeah, maybe not.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
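
For concreteness, here is a rough userspace sketch of the scheme discussed above. It is not kernel code and not a proposed patch. Only s_version_max and i_version come from the thread; every other name (sb_sketch, write_version_max, extend_version_max, mount_sketch, inc_iversion, VERSION_MARGIN, the fail_writes knob) is made up for illustration. It models only the ceiling field, does the "kick off a write" step synchronously, and ignores concurrency entirely, so it says nothing about whether the normal case can really go lock-free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* All names are hypothetical except s_version_max and i_version. */
#define VERSION_BUMP   (UINT64_C(1) << 40) /* the "2^40 or whatever" step */
#define VERSION_MARGIN (UINT64_C(1) << 20) /* assumed "getting close" threshold */

struct sb_sketch {
	uint64_t s_version_max;    /* in-memory ceiling */
	uint64_t disk_version_max; /* stand-in for the on-disk copy */
	bool     clean;            /* was the last unmount clean? */
	bool     fail_writes;      /* test knob: simulate a failed superblock write */
};

/* Pretend to persist a new ceiling; returns 0 on success, -1 on failure. */
static int write_version_max(struct sb_sketch *sb, uint64_t new_max)
{
	if (sb->fail_writes)
		return -1;
	sb->disk_version_max = new_max;
	return 0;
}

/* Raise the ceiling: write first, update memory only after the write succeeds. */
static int extend_version_max(struct sb_sketch *sb)
{
	uint64_t new_max = sb->s_version_max + VERSION_BUMP;

	if (write_version_max(sb, new_max) != 0)
		return -1;
	sb->s_version_max = new_max;
	return 0;
}

/* Mount: load the ceiling from disk; after a crash, bump it before use. */
static int mount_sketch(struct sb_sketch *sb)
{
	sb->s_version_max = sb->disk_version_max;
	if (sb->clean)
		return 0;
	return extend_version_max(sb);
}

/* Increment i_version without ever exceeding the ceiling. */
static int inc_iversion(struct sb_sketch *sb, uint64_t *i_version)
{
	if (*i_version >= sb->s_version_max) {
		/* A write failed or is very late: warn and error out. */
		fprintf(stderr, "i_version would exceed s_version_max\n");
		return -1;
	}
	(*i_version)++;
	assert(*i_version <= sb->s_version_max);

	/* Getting close: extend now (a real filesystem would do this async). */
	if (sb->s_version_max - *i_version < VERSION_MARGIN)
		(void)extend_version_max(sb);
	return 0;
}
```

In this sketch a failed extension is simply ignored until the ceiling is actually hit, at which point inc_iversion() errors out; whether to wait, fail, or just warn there is exactly the open policy question in the thread.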