On Tue, Mar 26, 2024 at 06:50:44PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > While reviewing the online fsck patchset, someone spied the > xfs_swapext_can_use_without_log_assistance function and wondered why we > go through this inverted-bitmask dance to avoid setting the > XFS_SB_FEAT_INCOMPAT_LOG_SWAPEXT feature. > > (The same principles apply to the logged extended attribute update > feature bit in the since-merged LARP series.) > > The reason for this dance is that xfs_add_incompat_log_feature is an > expensive operation -- it forces the log, pushes the AIL, and then if > nobody's beaten us to it, sets the feature bit and issues a synchronous > write of the primary superblock. That could be a one-time cost > amortized over the life of the filesystem, but the log quiesce and cover > operations call xfs_clear_incompat_log_features to remove feature bits > opportunistically. On a moderately loaded filesystem this leads to us > cycling those bits on and off over and over, which hurts performance. > > Why do we clear the log incompat bits? Back in ~2020 I think Dave and I > had a conversation on IRC[2] about what the log incompat bits represent. > IIRC in that conversation we decided that the log incompat bits protect > unrecovered log items so that old kernels won't try to recover them and > barf. Since a clean log has no protected log items, we could clear the > bits at cover/quiesce time. > > As Dave Chinner pointed out in the thread, clearing log incompat bits at > unmount time has positive effects for golden root disk image generator > setups, since the generator could be running a newer kernel than what > gets written to the golden image -- if there are log incompat fields set > in the golden image that was generated by a newer kernel/OS image > builder then the provisioning host cannot mount the filesystem even > though the log is clean and recovery is unnecessary to mount the > filesystem. > > Given that it's expensive to set log incompat bits, we really only want > to do that once per bit per mount. Therefore, I propose that we only > clear log incompat bits as part of writing a clean unmount record. Do > this by adding an operational state flag to the xfs mount that guards > whether or not the feature bit clearing can actually take place. > > This eliminates the l_incompat_users rwsem that we use to protect a log > cleaning operation from clearing a feature bit that a frontend thread is > trying to set -- this lock adds another way to fail w.r.t. locking. For > the swapext series, I shard that into multiple locks just to work around > the lockdep complaints, and that's fugly. > > Link: https://lore.kernel.org/linux-xfs/20240131230043.GA6180@frogsfrogsfrogs/ > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > Reviewed-by: Christoph Hellwig <hch@xxxxxx> > --- > .../filesystems/xfs/xfs-online-fsck-design.rst | 3 - > fs/xfs/xfs_log.c | 28 ------------- > fs/xfs/xfs_log.h | 2 - > fs/xfs/xfs_log_priv.h | 3 - > fs/xfs/xfs_log_recover.c | 15 ------- > fs/xfs/xfs_mount.c | 8 +++- > fs/xfs/xfs_mount.h | 6 ++- > fs/xfs/xfs_xattr.c | 42 +++----------------- > 8 files changed, 19 insertions(+), 88 deletions(-) Looks fine. Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx> -- Dave Chinner david@xxxxxxxxxxxxx