Re: [PATCH 3/6] fs: distinguish between user initiated freeze and kernel initiated freeze

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon 22-05-23 16:42:00, Darrick J. Wong wrote:
> How about this as an alternative patch?  Kernel and userspace freeze
> state are stored in s_writers; each type cannot block the other (though
> you still can't have nested kernel or userspace freezes); and the freeze
> is maintained until /both/ freeze types are dropped.
> 
> AFAICT this should work for the two other usecases (quiescing pagefaults
> for fsdax pmem pre-removal; and freezing fses during suspend) besides
> online fsck for xfs.
> 
> --D
> 
> From: Darrick J. Wong <djwong@xxxxxxxxxx>
> Subject: fs: distinguish between user initiated freeze and kernel initiated freeze
> 
> Userspace can freeze a filesystem using the FIFREEZE ioctl or by
> suspending the block device; this state persists until userspace thaws
> the filesystem with the FITHAW ioctl or resuming the block device.
> Since commit 18e9e5104fcd ("Introduce freeze_super and thaw_super for
> the fsfreeze ioctl") we only allow the first freeze command to succeed.
> 
> The kernel may decide that it is necessary to freeze a filesystem for
> its own internal purposes, such as suspends in progress, filesystem fsck
> activities, or quiescing a device prior to removal.  Userspace thaw
> commands must never break a kernel freeze, and kernel thaw commands
> shouldn't undo userspace's freeze command.
> 
> Introduce a couple of freeze holder flags and wire it into the
> sb_writers state.  One kernel and one userspace freeze are allowed to
> coexist at the same time; the filesystem will not thaw until both are
> lifted.
> 
> Inspired-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>

Yes, this is exactly how I'd imagine it. Thanks for writing the patch!

I'd just note that this would need rebasing on top of Luis' patches 1 and
2. Also:

> +	if (sbw->frozen == SB_FREEZE_COMPLETE) {
> +		switch (who) {
> +		case FREEZE_HOLDER_KERNEL:
> +			if (sbw->freeze_holders & FREEZE_HOLDER_KERNEL) {
> +				/*
> +				 * Kernel freeze already in effect; caller can
> +				 * try again.
> +				 */
> +				deactivate_locked_super(sb);
> +				return -EBUSY;
> +			}
> +			if (sbw->freeze_holders & FREEZE_HOLDER_USERSPACE) {
> +				/*
> +				 * Share the freeze state with the userspace
> +				 * freeze already in effect.
> +				 */
> +				sbw->freeze_holders |= who;
> +				deactivate_locked_super(sb);
> +				return 0;
> +			}
> +			break;
> +		case FREEZE_HOLDER_USERSPACE:
> +			if (sbw->freeze_holders & FREEZE_HOLDER_USERSPACE) {
> +				/*
> +				 * Userspace freeze already in effect; tell
> +				 * the caller we're busy.
> +				 */
> +				deactivate_locked_super(sb);
> +				return -EBUSY;
> +			}
> +			if (sbw->freeze_holders & FREEZE_HOLDER_KERNEL) {
> +				/*
> +				 * Share the freeze state with the kernel
> +				 * freeze already in effect.
> +				 */
> +				sbw->freeze_holders |= who;
> +				deactivate_locked_super(sb);
> +				return 0;
> +			}
> +			break;
> +		default:
> +			BUG();
> +			deactivate_locked_super(sb);
> +			return -EINVAL;
> +		}
> +	}

Can't this be simplified to:

	BUG_ON(who & ~(FREEZE_HOLDER_USERSPACE | FREEZE_HOLDER_KERNEL));
	BUG_ON(!(!(who & FREEZE_HOLDER_USERSPACE) ^
	       !(who & FREEZE_HOLDER_KERNEL)));
retry:
	if (sb->s_writers.freeze_holders & who)
		return -EBUSY;
	/* Already frozen by someone else? */
	if (sb->s_writers.freeze_holders & ~who) {
		sb->s_writers.freeze_holders |= who;
		return 0;
	}

Now the only remaining issue with the code is that the two different
holders can be attempting to freeze the filesystem at once and in that case
one of them has to wait for the other one instead of returning -EBUSY as
would happen currently. This can happen because we temporarily drop
s_umount in freeze_super() due to lock ordering issues. I think we could
do something like:

	if (!sb_unfrozen(sb)) {
		up_write(&sb->s_umount);
		wait_var_event(&sb->s_writers.frozen,
			       sb_unfrozen(sb) || sb_frozen(sb));
		down_write(&sb->s_umount);
		goto retry;
	}

and then sprinkle wake_up_var(&sb->s_writers.frozen) at appropriate places
in freeze_super().

BTW, when reading this code, I've spotted attached cleanup opportunity but
I'll queue that separately so that is JFYI.

> +#define FREEZE_HOLDER_USERSPACE	(1U << 1)	/* userspace froze fs */
> +#define FREEZE_HOLDER_KERNEL	(1U << 2)	/* kernel froze fs */

Why not start from 1U << 0? And bonus points for using BIT() macro :).

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
>From 9fce35f21f9a62470e764463c84373fb013108fd Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@xxxxxxx>
Date: Thu, 25 May 2023 15:56:19 +0200
Subject: [PATCH] fs: Drop wait_unfrozen wait queue

wait_unfrozen waitqueue is used only in quota code to wait for
filesystem to become unfrozen. In that place we can just use
sb_start_write() - sb_end_write() pair to achieve the same. So just
remove the waitqueue.

Signed-off-by: Jan Kara <jack@xxxxxxx>
---
 fs/quota/quota.c   | 5 +++--
 fs/super.c         | 4 ----
 include/linux/fs.h | 1 -
 3 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/fs/quota/quota.c b/fs/quota/quota.c
index 052f143e2e0e..0e41fb84060f 100644
--- a/fs/quota/quota.c
+++ b/fs/quota/quota.c
@@ -895,8 +895,9 @@ static struct super_block *quotactl_block(const char __user *special, int cmd)
 			up_write(&sb->s_umount);
 		else
 			up_read(&sb->s_umount);
-		wait_event(sb->s_writers.wait_unfrozen,
-			   sb->s_writers.frozen == SB_UNFROZEN);
+		/* Wait for sb to unfreeze */
+		sb_start_write(sb);
+		sb_end_write(sb);
 		put_super(sb);
 		goto retry;
 	}
diff --git a/fs/super.c b/fs/super.c
index 34afe411cf2b..6283cea67280 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -236,7 +236,6 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
 					&type->s_writers_key[i]))
 			goto fail;
 	}
-	init_waitqueue_head(&s->s_writers.wait_unfrozen);
 	s->s_bdi = &noop_backing_dev_info;
 	s->s_flags = flags;
 	if (s->s_user_ns != &init_user_ns)
@@ -1706,7 +1705,6 @@ int freeze_super(struct super_block *sb)
 	if (ret) {
 		sb->s_writers.frozen = SB_UNFROZEN;
 		sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
-		wake_up(&sb->s_writers.wait_unfrozen);
 		deactivate_locked_super(sb);
 		return ret;
 	}
@@ -1722,7 +1720,6 @@ int freeze_super(struct super_block *sb)
 				"VFS:Filesystem freeze failed\n");
 			sb->s_writers.frozen = SB_UNFROZEN;
 			sb_freeze_unlock(sb, SB_FREEZE_FS);
-			wake_up(&sb->s_writers.wait_unfrozen);
 			deactivate_locked_super(sb);
 			return ret;
 		}
@@ -1768,7 +1765,6 @@ static int thaw_super_locked(struct super_block *sb)
 	sb->s_writers.frozen = SB_UNFROZEN;
 	sb_freeze_unlock(sb, SB_FREEZE_FS);
 out:
-	wake_up(&sb->s_writers.wait_unfrozen);
 	deactivate_locked_super(sb);
 	return 0;
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 21a981680856..3b65a6194485 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1146,7 +1146,6 @@ enum {
 
 struct sb_writers {
 	int				frozen;		/* Is sb frozen? */
-	wait_queue_head_t		wait_unfrozen;	/* wait for thaw */
 	struct percpu_rw_semaphore	rw_sem[SB_FREEZE_LEVELS];
 };
 
-- 
2.35.3


[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux