On Mon, Jun 12, 2023 at 01:08:11PM +0200, Jan Kara wrote: > On Sun 11-06-23 20:15:22, Darrick J. Wong wrote: > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > Userspace can freeze a filesystem using the FIFREEZE ioctl or by > > suspending the block device; this state persists until userspace thaws > > the filesystem with the FITHAW ioctl or resuming the block device. > > Since commit 18e9e5104fcd ("Introduce freeze_super and thaw_super for > > the fsfreeze ioctl") we only allow the first freeze command to succeed. > > > > The kernel may decide that it is necessary to freeze a filesystem for > > its own internal purposes, such as suspends in progress, filesystem fsck > > activities, or quiescing a device prior to removal. Userspace thaw > > commands must never break a kernel freeze, and kernel thaw commands > > shouldn't undo userspace's freeze command. > > > > Introduce a couple of freeze holder flags and wire it into the > > sb_writers state. One kernel and one userspace freeze are allowed to > > coexist at the same time; the filesystem will not thaw until both are > > lifted. > > > > I wonder if the f2fs/gfs2 code should be using a kernel freeze here, but > > for now we'll use FREEZE_HOLDER_USERSPACE to preserve existing > > behaviors. > > > > Cc: mcgrof@xxxxxxxxxx > > Cc: jack@xxxxxxx > > Cc: hch@xxxxxxxxxxxxx > > Cc: ruansy.fnst@xxxxxxxxxxx > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > Thanks Darrick. Some comments below. > > > +static int freeze_frozen_super(struct super_block *sb, enum freeze_holder who) > > +{ > > + /* Someone else already holds this type of freeze */ > > + if (sb->s_writers.freeze_holders & who) > > + return -EBUSY; > > + > > + WARN_ON(sb->s_writers.freeze_holders == 0); > > + > > + sb->s_writers.freeze_holders |= who; > > + return 0; > > +} > > + > > /** > > * freeze_super - lock the filesystem and force it into a consistent state > > * @sb: the super to lock > > + * @who: FREEZE_HOLDER_USERSPACE if userspace wants to freeze the fs; > > + * FREEZE_HOLDER_KERNEL if the kernel wants to freeze it > > * > > * Syncs the super to make sure the filesystem is consistent and calls the fs's > > - * freeze_fs. Subsequent calls to this without first thawing the fs will return > > + * freeze_fs. Subsequent calls to this without first thawing the fs may return > > * -EBUSY. > > * > > + * The @who argument distinguishes between the kernel and userspace trying to > > + * freeze the filesystem. Although there cannot be multiple kernel freezes or > > + * multiple userspace freezes in effect at any given time, the kernel and > > + * userspace can both hold a filesystem frozen. The filesystem remains frozen > > + * until there are no kernel or userspace freezes in effect. > > + * > > * During this function, sb->s_writers.frozen goes through these values: > > * > > * SB_UNFROZEN: File system is normal, all writes progress as usual. > > @@ -1668,12 +1688,19 @@ static void sb_freeze_unlock(struct super_block *sb, int level) > > * > > * sb->s_writers.frozen is protected by sb->s_umount. > > */ > > -int freeze_super(struct super_block *sb) > > +int freeze_super(struct super_block *sb, enum freeze_holder who) > > { > > int ret; > > > > atomic_inc(&sb->s_active); > > down_write(&sb->s_umount); > > + > > + if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) { > > + ret = freeze_frozen_super(sb, who); > > + deactivate_locked_super(sb); > > + return ret; > > + } > > I find it a little bit odd that the second freeze holder does not get the > active superblock reference. It all looks correct but I'd find it easier to > reason about (and also eventually lift the reference counting out of > freeze_super()) if the rule was: Successful freeze_super() <=> you have > s_active reference. Ok, I'll keep the active ref when a freezer starts sharing a freeze. > > + > > if (sb->s_writers.frozen != SB_UNFROZEN) { > > I still find it strange that: > > Task1 Task2 > > while (1) { while (1) { > ioctl(f, FIFREEZE); freeze_super(sb, FREEZE_HOLDER_KERNEL); > ioctl(f, FITHAW); thaw_super(sb, FREEZE_HOLDER_KERNEL); > } } > > will randomly end up returning EBUSY to Task1 or Task2 although there is no > real conflict. I think it will be much more useful behavior if in case of > this conflict the second holder just waited for freezing procedure to finish > and then report success. Because I don't think either caller can do > anything sensible with this race other than retry but it cannot really > distinguish EBUSY as in "someone other holder of the same type has the sb > already frozen" from "freezing raced with holder of a different type". <nod> I'll copy this justification into the commit message for the second patch. > > > deactivate_locked_super(sb); > > return -EBUSY; > > @@ -1684,8 +1711,10 @@ int freeze_super(struct super_block *sb) > > return 0; /* sic - it's "nothing to do" */ > > } > > > > + > > Why the extra empty line? Whitespace damage. > > if (sb_rdonly(sb)) { > > /* Nothing to do really... */ > > + sb->s_writers.freeze_holders |= who; > > sb->s_writers.frozen = SB_FREEZE_COMPLETE; > > up_write(&sb->s_umount); > > return 0; > > @@ -1731,6 +1760,7 @@ int freeze_super(struct super_block *sb) > > * For debugging purposes so that fs can warn if it sees write activity > > * when frozen is set to SB_FREEZE_COMPLETE, and for thaw_super(). > > */ > > + sb->s_writers.freeze_holders |= who; > > sb->s_writers.frozen = SB_FREEZE_COMPLETE; > > lockdep_sb_freeze_release(sb); > > up_write(&sb->s_umount); > > @@ -1738,16 +1768,47 @@ int freeze_super(struct super_block *sb) > > } > > EXPORT_SYMBOL(freeze_super); > > > > -static int thaw_super_locked(struct super_block *sb) > > +static int try_thaw_shared_super(struct super_block *sb, enum freeze_holder who) > > +{ > > + /* Freeze is not held by this type? */ > > + if (!(sb->s_writers.freeze_holders & who)) > > + return -EINVAL; > > + > > + /* Also frozen for someone else? */ > > + if (sb->s_writers.freeze_holders & ~who) { > > + sb->s_writers.freeze_holders &= ~who; > > + return 0; > > + } > > + > > + /* Magic value to proceed with thaw */ > > + return 1; > > +} > > + > > +/* > > + * Undoes the effect of a freeze_super_locked call. If the filesystem is > > + * frozen both by userspace and the kernel, a thaw call from either source > > + * removes that state without releasing the other state or unlocking the > > + * filesystem. > > + */ > > +static int thaw_super_locked(struct super_block *sb, enum freeze_holder who) > > { > > int error; > > > > + if (sb->s_writers.frozen == SB_FREEZE_COMPLETE) { > > + error = try_thaw_shared_super(sb, who); > > + if (error != 1) { > > + up_write(&sb->s_umount); > > + return error; > > + } > > + } > > + > > if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) { > > up_write(&sb->s_umount); > > return -EINVAL; > > } > > I'd first check for the above condition and then just fold > try_thaw_shared_super() into here. That way you can avoid the odd special > return and the code will be actually more readable. Probably we should grow > out_err label for: > > up_write(&sb->s_umount); > return error; > > and use it for the error returns as well... <shrug> we're only adding one of these, but I'll tack on a fourth patch to clean these up. --D > Honza > -- > Jan Kara <jack@xxxxxxxx> > SUSE Labs, CR