On Fri, 09 Sep 2022, Jeff Layton wrote:
> On Fri, 2022-09-09 at 09:01 +1000, NeilBrown wrote:
> > On Fri, 09 Sep 2022, Jeff Layton wrote:
> > > On Thu, 2022-09-08 at 14:22 -0400, J. Bruce Fields wrote:
> > > > On Thu, Sep 08, 2022 at 01:40:11PM -0400, Jeff Layton wrote:
> > > > > Yeah, ok. That does make some sense. So we would mix this into the
> > > > > i_version instead of the ctime when it was available. Preferably, we'd
> > > > > mix that in when we store the i_version rather than adding it afterward.
> > > > >
> > > > > Ted, how would we access this? Maybe we could just add a new (generic)
> > > > > super_block field for this that ext4 (and other filesystems) could
> > > > > populate at mount time?
> > > >
> > > > Couldn't the filesystem just return an ino_version that already includes
> > > > it?
> > > >
> > >
> > > Yes. That's simple if we want to just fold it in during getattr. If we
> > > want to fold that into the values stored on disk, then I'm a little less
> > > clear on how that will work.
> > >
> > > Maybe I need a concrete example of how that will work:
> > >
> > > Suppose we have an i_version value X with the previous crash counter
> > > already factored in that makes it to disk. We hand out a newer version
> > > X+1 to a client, but that value never makes it to disk.
> >
> > As I understand it, the crash counter would NEVER appear in the on-disk
> > i_version.
> > The crash counter is stable while a filesystem is mounted so is the same
> > when loading an inode from disk and when writing it back.
> >
> > When loading, add crash counter to on-disk i_version to provide
> > in-memory i_version.
> > when storing, subtract crash counter from in-memory i_version to provide
> > on-disk i_version.
> >
> > "add" and "subtract" could be any reversible hash, and its inverse. I
> > would probably shift the crash counter up 16 and add/subtract.
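
For illustration only, here is a minimal userspace sketch of the load/store
transformation quoted above. The function names and CRASH_SHIFT are
hypothetical, not existing kernel interfaces, and plain addition stands in
for whatever reversible mix a filesystem would actually choose:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: how far the crash counter is shifted before mixing. */
#define CRASH_SHIFT 16

/* On load: on-disk value plus shifted crash counter gives the in-memory value. */
static uint64_t iversion_from_disk(uint64_t disk, uint64_t crash_counter)
{
	return disk + (crash_counter << CRASH_SHIFT);
}

/* On store: the inverse, so the crash counter never reaches the disk. */
static uint64_t iversion_to_disk(uint64_t mem, uint64_t crash_counter)
{
	return mem - (crash_counter << CRASH_SHIFT);
}

/* Walk through the "X+1 never makes it to disk" scenario with made-up numbers. */
static void crash_scenario(void)
{
	uint64_t disk = 1000;	/* value that reached the disk */
	uint64_t c = 3;		/* crash counter during this mount */
	uint64_t seen = iversion_from_disk(disk, c);	/* what clients see */
	uint64_t lost = seen + 1;	/* handed out, never written back */

	/* Crash and remount: the counter is bumped, the disk value is unchanged. */
	uint64_t after = iversion_from_disk(disk, c + 1);

	assert(iversion_to_disk(seen, c) == disk);	/* round trip is exact */
	assert(after + 1 > lost);	/* next change cannot reuse the lost value */
}
```

With this shift the lost value is only safely skipped if fewer than 2^16
changes were handed out but not written back before the crash, and every
inode appears changed after the remount whether or not it was modified.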
> >
> >
>
> If you store the value with the crash counter already factored-in, then
> not every inode would end up being invalidated after a crash. If we try
> to mix it in later, the client will end up invalidating the cache even
> for inodes that had no changes.

How do we know which inodes need the crash counter merged in?
I thought the whole point of the crash counter was that it affected
every file (easy, safe, expensive, but hopefully rare enough that the
expense could be justified).

NeilBrown

>
> > >
> > > The machine crashes and comes back up, and we get a query for i_version
> > > and it comes back as X. Fine, it's an old version. Now there is a write.
> > > What do we do to ensure that the new value doesn't collide with X+1?
> > > --
> > > Jeff Layton <jlayton@xxxxxxxxxx>
> > >
>
> --
> Jeff Layton <jlayton@xxxxxxxxxx>
>