Re: xfs_clear_incompat_log_features considered harmful?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Feb 05, 2024 at 12:09:23PM +1100, Dave Chinner wrote:
> On Wed, Jan 31, 2024 at 03:00:43PM -0800, Darrick J. Wong wrote:
> > Hi everyone,
> > 
> > Christoph spied the xfs_swapext_can_use_without_log_assistance
> > function[0] in the atomic file updates patchset[1] and wondered why we
> > go through this inverted-bitmask dance to avoid setting the
> > XFS_SB_FEAT_INCOMPAT_LOG_SWAPEXT feature.
> > 
> > (The same principles apply to xfs_attri_can_use_without_log_assistance
> > from the since-merged LARP series.)
> > 
> > The reason for this dance is that xfs_add_incompat_log_feature is an
> > expensive operation -- it forces the log, pushes the AIL, and then if
> > nobody's beaten us to it, sets the feature bit and issues a synchronous
> > write of the primary superblock.  That could be a one-time cost
> > amortized over the life of the filesystem, but the log quiesce and cover
> > operations call xfs_clear_incompat_log_features to remove feature bits
> > opportunistically.  On a moderately loaded filesystem this leads to us
> > cycling those bits on and off over and over, which hurts performance.
> > 
> > Why do we clear the log incompat bits?  Back in ~2020 I think Dave and I
> > had a conversation on IRC[2] about what the log incompat bits represent.
> > IIRC in that conversation we decided that the log incompat bits protect
> > unrecovered log items so that old kernels won't try to recover them and
> > barf.  Since a clean log has no protected log items, we could clear the
> > bits at cover/quiesce time.  At the time, I think we decided to go with
> > this idea because we didn't like the idea of reducing the span of
> > kernels that can recover a filesystem over the lifetime of that
> > filesystem.
> 
> I don't think that was the issue - it was a practical concern that
> having unnecessary log incompat fields set would prevent the common
> practice of provisioning VMs with newer kernels than the host is
> running.

aha, thanks for refreshing my memory.

> The issue arises if the host tries to mount the guest VM image to
> configure the clone of a golden image prior to first start. If there
> are log incompat fields set in the golden image that was generated
> by a newer kernel/OS image builder then the provisioning
> host cannot mount the filesystem even though the log is clean and
> recovery is unnecessary to mount the filesystem.
> 
> Hence on unmount we really want the journal contents based log
> incompat bits cleared because there is nothing incompatible in the
> log and so there is no reason to prevent older kernels from
> mounting the filesytsem.

<nod> Will it be a problem if we crash with a log incompat bit set but
none of the actual log items that it protects in the ondisk log?

> > [ISTR Eric pointing out at some point that adding incompat feature bits
> > at runtime could confuse users who crash and try to recover with Ye Olde
> > Knoppix CD because there's now a log incompat bit set that wasn't there
> > at format time, but my memory is a bit hazy.]
> > 
> > Christoph wondered why I don't just set the log incompat bits at mkfs
> > time, especially for filesystems that have some newer feature set (e.g.
> > parent pointers, metadir, rtgroups...) to avoid the runtime cost of
> > adding the feature flag.  I don't bother with that because of the log
> > clearing behavior.  He also noted that _can_use_without_log_assistance
> > is potentially dangerous if distro vendors backport features to old
> > kernels in a different order than they land in upstream.
> 
> This is what incompat feature bits are for, not log incompat bits.
> We don't need log incompat bits for pp, metaddir, rtgroups, etc
> because older kernels won't even get to log recovery as they see
> incompat feature bits set when they first read the superblock.
> 
> IOWs, a log incompat flag should only be necessary to prevent log
> recovery on older kernels if we change how something is logged
> without otherwise changing the on disk format of those items (e.g.
> LARP). There are no incompat feature bits that are set in these
> cases, so we need a log incompat bit to be set whilst the journal
> has those items in it....

Ok.

> > Another problem with this scheme is the l_incompat_users rwsem that we
> > use to protect a log cleaning operation from clearing a feature bit that
> > a frontend thread is trying to set -- this lock adds another way to fail
> > w.r.t. locking.  For the swapext series, I shard that into multiple
> > locks just to work around the lockdep complaints, and that's fugly.
> 
> We can avoid that simply by not clearing the incompat bit at cover
> time, and instead do it only at unmount. Then it only gets set once
> per mount, and only gets cleared when we are running single threaded
> on unmount. No need for locking at this point holding the superblock
> buffer locked will serialise feature bit addition....

Ok.  I can live with that -- clearing the log-incompat bits on a clean
unmount, and leaving them set at cleaning time.  I think this means
l_incompat_users can go away too.

> > Going forward, I'd make mkfs set the log incompat features during a
> > fresh format if any of the currently-undefined feature bits are set,
> > which means that they'll be enabled by default on any filesystem with
> > directory parent pointers and/or metadata directories.  I'd also add
> > mkfs -l options so that sysadmins can turn it on at format time.
> 
> At this point they are just normal incompat feature bits. Why not
> just use those instead? i.e. Why do we need log incompat bits to be
> permanently sticky when we've already got incompat feature bits for
> that?

Without the "clear log incompat on clean" behavior, I don't think we
need anything I wrote about in that paragraph anymore either.

So, if I'm reading you both correctly, I'll:

1. change the log incompat handling code to clear them on umount only;
2. add a mount option to allow admins to permit setting of log incompat
   flags;
3. leave the xfs_{swapext,attri}_can_use_without_log_assistance
   functions as they are in djwong-wtf so that new incompat filesystem
   features will eliminate the need for setting/clearing the log
   incompat flags

Did I get that right?

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux