On Thu, Dec 10, 2020 at 09:23:58AM -0500, Brian Foster wrote: > On Thu, Dec 10, 2020 at 07:51:32AM +1100, Dave Chinner wrote: > > For changing incompat log flags, covering also provides exactly what > > we need - an empty log with only a dirty superblock in the journal > > to recover. All that we need then is to recheck feature flags after > > recovery (not just clear log incompat flags) because we might need > > to use this to also set incompat feature flags dynamically. > > > > I'd love it if we use a better term for a log isolated to the superblock > buffer. So far we've discussed an empty log with a dummy superblock, an > empty log with a dirty superblock, and I suppose we could also have an > actual empty log. :P "Covered log," perhaps? On terminology: "covered log" has been around for 25 years in XFS, and it's very definition is "an empty, up-to-date log except for a dummy object we logged without any changes to update the log tail". And, by definition, that dummy object is dirty in the log even if it contains no actual modifications when it was logged. So in this case, "dummy" is directly interchangable with "dirty" when looking at the log contents w.r.t. recovery behaviour. It's worth looking at a historical change, too. Log covering originally used the root inode and not the superblock. That caused problems on linux dirtying VFS inode state, so we changed it to the superblock in commit 1a387d3be2b3 ("xfs: dummy transactions should not dirty VFS state") about a decade ago. Note the name of the function at the time (xfs_fs_log_dummy()) and that it's description explicitly mentions that a dummy transaction used for updating the log tail when covering it. The only difference in this discussion is fact that we may be replacing the "dummy" with a modified superblock. Both are still dirty superblock objects in the log, and serve the same purpose for covering the log, but instead of using a dummy object we update the log incompat flags so taht the only thing that gets replayed if we crash is the modification to the superblock flags... Finally, no, we can't have a truly empty log while the filesystem is mounted because log transaction records must not be empty. Further, we can only bring the log down to an "empty but not quite clean" state while mounted, because if the log is actually marked clean and we crash then log recovery will not run and so we will not clean up files that were open-but-unlinked when the system crashed. > > > > but didn't necessarily clean the log. I wonder if we can fit that state > > > > into the existing log covering mechanism by logging the update earlier > > > > (or maybe via an intermediate state..)? I'll need to go stare at that > > > > code a bit.. > > > > That's kinda what I'd like to see done - it gives us a generic way > > of altering superblock incompat feature fields and filesystem > > structures dynamically with no risk of log recovery needing to parse > > structures it may not know about. > > > > We might have to think about if we want to clear such feature bits in > all log covering scenarios (idle, freeze) vs. just unmount. I think clearing bits can probably be lazy. Set a state flag to tell log covering that it needs to do an update, and so when the covering code finally runs the feature bit modification just happens. > My previous suggestion in the other sub-thread was to set bits on > mount and clear on unmount because that (hopefully) simplifies the > broader implementation. Otherwise we need to manage dynamic > setting of bits when the associated log items are active, and that > still isn't truly representative because the bits would remain set > long after active items fall out of the log, until the log is > covered. Either way is possible, but I'm curious what others > think. I think that unless we are setting a log incompat flag, log covering should unconditionally clear the log incompat flags. Because the log is empty, and recovery of a superblock buffer should -always- be possible, then by definition we have no log incompat state present.... As for a mechanism for dynamically adding log incompat flags? Perhaps we just do that in xfs_trans_alloc() - add an log incompat flags field into the transaction reservation structure, and if xfs_trans_alloc() sees an incompat field set and the superblock doesn't have it set, the first thing it does is run a "set log incompat flag" transaction before then doing it's normal work... This should be rare enough it doesn't have any measurable performance overhead, and it's flexible enough to support any log incompat feature we might need to implement... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx