On Sun, Apr 10, 2022 at 11:21:06AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
>
> I've been observing periodic corruption reports from xfs_scrub involving
> the free rt extent counter (frextents) while running xfs/141.  That test
> uses an error injection knob to induce a torn write to the log, and after
> an arbitrary number of recovery mounts, frextents will count fewer free rt
> extents than can be found in the rtbitmap.
>
> The root cause of the problem is a combination of the misuse of
> sb_frextents in the incore mount to reflect both incore reservations
> made by running transactions as well as the actual count of free rt
> extents on disk.  The following sequence can reproduce the undercount:
>
> Thread 1                            Thread 2
> xfs_trans_alloc(rtextents=3)
> xfs_mod_frextents(-3)
> <blocks>
>                                     xfs_attr_set()
>                                     xfs_bmap_attr_addfork()
>                                     xfs_add_attr2()
>                                     xfs_log_sb()
>                                     xfs_sb_to_disk()
>                                     xfs_trans_commit()
> <log flushed to disk>
> <log goes down>
>
> Note that thread 1 subtracts 3 from sb_frextents even though it never
> commits to using that space.  Thread 2 writes the undercounted value to
> the ondisk superblock and logs it to the xattr transaction, which is
> then flushed to disk.  At next mount, log recovery will find the logged
> superblock and write that back into the filesystem.  At the end of log
> recovery, we reread the superblock and install the recovered
> undercounted frextents value into the incore superblock.  From that
> point on, we've effectively leaked thread 1's transaction reservation.
>
> The correct fix for this is to separate the incore reservation from the
> ondisk usage, but that's a matter for the next patch.  Because the
> kernel has been logging superblocks with undercounted frextents for a
> very long time and we don't demand that sysadmins run xfs_repair after a
> crash, fix the undercount by recomputing frextents after log recovery.
>
> Gating this on log recovery is a reasonable balance (I think) between
> correcting the problem and slowing down every mount attempt.  Note that
> xfs_repair will fix undercounted frextents.
>
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>

Looks good now!

Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>

-- 
Dave Chinner
david@xxxxxxxxxxxxx
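
The undercount sequence in the quoted commit message, and the fix of recomputing the counter after log recovery, can be sketched as a small user-space model. This is only an illustration under stated assumptions: `struct toy_fs` and every `toy_*` function are hypothetical names invented for this sketch, not kernel code, and the bitmap is a plain byte array in which 1 means "free".

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_RTEXTENTS 16

/* Hypothetical toy model; none of this mirrors real XFS structures. */
struct toy_fs {
	uint8_t  rtbitmap[TOY_NR_RTEXTENTS]; /* 1 = free extent (toy convention) */
	uint64_t incore_frextents;           /* incore counter, includes reservations */
	uint64_t logged_frextents;           /* value last written to the log */
};

/* Start with every extent free and both counters in agreement. */
void toy_init(struct toy_fs *fs)
{
	for (int i = 0; i < TOY_NR_RTEXTENTS; i++)
		fs->rtbitmap[i] = 1;
	fs->incore_frextents = TOY_NR_RTEXTENTS;
	fs->logged_frextents = TOY_NR_RTEXTENTS;
}

/* Count free extents straight from the bitmap (the source of truth here). */
uint64_t toy_count_free(const struct toy_fs *fs)
{
	uint64_t n = 0;

	for (int i = 0; i < TOY_NR_RTEXTENTS; i++)
		n += fs->rtbitmap[i];
	return n;
}

/* Thread 1: reserve extents by subtracting from the shared incore counter. */
void toy_reserve_rtextents(struct toy_fs *fs, uint64_t n)
{
	assert(n <= fs->incore_frextents);
	fs->incore_frextents -= n;
}

/* Thread 2: log the superblock, copying the counter with reservations included. */
void toy_log_superblock(struct toy_fs *fs)
{
	fs->logged_frextents = fs->incore_frextents;
}

/* Next mount: recovery installs the logged (undercounted) value incore. */
void toy_recover_log(struct toy_fs *fs)
{
	fs->incore_frextents = fs->logged_frextents;
}

/* The fix modelled here: after recovery, recompute the counter from the bitmap. */
void toy_recompute_frextents(struct toy_fs *fs)
{
	fs->incore_frextents = toy_count_free(fs);
}
```

In this model, reserving 3 extents, logging the superblock, and then running recovery without thread 1 ever committing leaves the counter at 13 while the bitmap still holds 16 free extents; calling `toy_recompute_frextents()` brings the counter back to 16.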