Re: xfs_clear_incompat_log_features considered harmful?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 31, 2024 at 03:00:43PM -0800, Darrick J. Wong wrote:
> Hi everyone,
> 
> Christoph spied the xfs_swapext_can_use_without_log_assistance
> function[0] in the atomic file updates patchset[1] and wondered why we
> go through this inverted-bitmask dance to avoid setting the
> XFS_SB_FEAT_INCOMPAT_LOG_SWAPEXT feature.
> 
> (The same principles apply to xfs_attri_can_use_without_log_assistance
> from the since-merged LARP series.)
> 
> The reason for this dance is that xfs_add_incompat_log_feature is an
> expensive operation -- it forces the log, pushes the AIL, and then if
> nobody's beaten us to it, sets the feature bit and issues a synchronous
> write of the primary superblock.  That could be a one-time cost
> amortized over the life of the filesystem, but the log quiesce and cover
> operations call xfs_clear_incompat_log_features to remove feature bits
> opportunistically.  On a moderately loaded filesystem this leads to us
> cycling those bits on and off over and over, which hurts performance.
> 
> Why do we clear the log incompat bits?  Back in ~2020 I think Dave and I
> had a conversation on IRC[2] about what the log incompat bits represent.
> IIRC in that conversation we decided that the log incompat bits protect
> unrecovered log items so that old kernels won't try to recover them and
> barf.  Since a clean log has no protected log items, we could clear the
> bits at cover/quiesce time.  At the time, I think we decided to go with
> this idea because we didn't like the idea of reducing the span of
> kernels that can recover a filesystem over the lifetime of that
> filesystem.

I don't think that was the issue - it was a practical concern that
having unnecessary log incompat fields set would prevent the common
practice of provisioning VMs with newer kernels than the host is
running.

The issue arises if the host tries to mount the guest VM image to
configure the clone of a golden image prior to first start. If there
are log incompat fields set in the golden image that was generated
by a newer kernel/OS image builder then the provisioning
host cannot mount the filesystem even though the log is clean and
recovery is unnecessary to mount the filesystem.

Hence on unmount we really want the journal contents based log
incompat bits cleared because there is nothing incompatible in the
log and so there is no reason to prevent older kernels from
mounting the filesytsem.

> 
> [ISTR Eric pointing out at some point that adding incompat feature bits
> at runtime could confuse users who crash and try to recover with Ye Olde
> Knoppix CD because there's now a log incompat bit set that wasn't there
> at format time, but my memory is a bit hazy.]
> 
> Christoph wondered why I don't just set the log incompat bits at mkfs
> time, especially for filesystems that have some newer feature set (e.g.
> parent pointers, metadir, rtgroups...) to avoid the runtime cost of
> adding the feature flag.  I don't bother with that because of the log
> clearing behavior.  He also noted that _can_use_without_log_assistance
> is potentially dangerous if distro vendors backport features to old
> kernels in a different order than they land in upstream.

This is what incompat feature bits are for, not log incompat bits.
We don't need log incompat bits for pp, metaddir, rtgroups, etc
because older kernels won't even get to log recovery as they see
incompat feature bits set when they first read the superblock.

IOWs, a log incompat flag should only be necessary to prevent log
recovery on older kernels if we change how something is logged
without otherwise changing the on disk format of those items (e.g.
LARP). There are no incompat feature bits that are set in these
cases, so we need a log incompat bit to be set whilst the journal
has those items in it....


> Another problem with this scheme is the l_incompat_users rwsem that we
> use to protect a log cleaning operation from clearing a feature bit that
> a frontend thread is trying to set -- this lock adds another way to fail
> w.r.t. locking.  For the swapext series, I shard that into multiple
> locks just to work around the lockdep complaints, and that's fugly.

We can avoid that simply by not clearing the incompat bit at cover
time, and instead do it only at unmount. Then it only gets set once
per mount, and only gets cleared when we are running single threaded
on unmount. No need for locking at this point holding the superblock
buffer locked will serialise feature bit addition....


> Going forward, I'd make mkfs set the log incompat features during a
> fresh format if any of the currently-undefined feature bits are set,
> which means that they'll be enabled by default on any filesystem with
> directory parent pointers and/or metadata directories.  I'd also add
> mkfs -l options so that sysadmins can turn it on at format time.

At this point they are just normal incompat feature bits. Why not
just use those instead? i.e. Why do we need log incompat bits to be
permanently sticky when we've already got incompat feature bits for
that?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux