On Fri, 09 Sep 2022, Theodore Ts'o wrote:
> On Thu, Sep 08, 2022 at 10:33:26AM +0200, Jan Kara wrote:
> > It boils down to the fact that we don't want to call mark_inode_dirty()
> > from IOCB_NOWAIT path because for lots of filesystems that means journal
> > operation and there are high chances that may block.
> >
> > Presumably we could treat inode dirtying after i_version change similarly
> > to how we handle timestamp updates with lazytime mount option (i.e., not
> > dirty the inode immediately but only with a delay) but then the time window
> > for i_version inconsistencies due to a crash would be much larger.
>
> Perhaps this is a radical suggestion, but there seems to be a lot of
> the problems which are due to the concern "what if the file system
> crashes" (and so we need to worry about making sure that any
> increments to i_version MUST be persisted after it is incremented).
>
> Well, if we assume that unclean shutdowns are rare, then perhaps we
> shouldn't be optimizing for that case. So.... what if a file system
> had a counter which got incremented each time its journal is replayed
> representing an unclean shutdown. That shouldn't happen often, but if
> it does, there might be any number of i_version updates that may have
> gotten lost. So in that case, the NFS client should invalidate all of
> its caches.

I was also thinking that the filesystem could help close that gap, but
I didn't like the "whole filesystem is dirty" approach.
I instead imagined a "dirty" bit in the on-disk inode which was set soon
after any open-for-write and cleared when the inode was finally written
after there are no active opens and no unflushed data.
The "soon after" would set a maximum window on possible lost version
updates (which people seem to be comfortable with) without imposing a
sync IO operation on open (for first write).
When loading an inode from disk, if the dirty flag was set then the
difference between current time and on-disk ctime (in nanoseconds) could
be added to the version number.

But maybe that is too complex for the gain.

NeilBrown