On Tue 06-06-23 10:19:56, Darrick J. Wong wrote: > On Thu, May 25, 2023 at 04:14:30PM +0200, Jan Kara wrote: > > On Mon 22-05-23 16:42:00, Darrick J. Wong wrote: > > > How about this as an alternative patch? Kernel and userspace freeze > > > state are stored in s_writers; each type cannot block the other (though > > > you still can't have nested kernel or userspace freezes); and the freeze > > > is maintained until /both/ freeze types are dropped. > > > > > > AFAICT this should work for the two other usecases (quiescing pagefaults > > > for fsdax pmem pre-removal; and freezing fses during suspend) besides > > > online fsck for xfs. > > > > > > --D > > > > > > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > > Subject: fs: distinguish between user initiated freeze and kernel initiated freeze > > > > > > Userspace can freeze a filesystem using the FIFREEZE ioctl or by > > > suspending the block device; this state persists until userspace thaws > > > the filesystem with the FITHAW ioctl or resuming the block device. > > > Since commit 18e9e5104fcd ("Introduce freeze_super and thaw_super for > > > the fsfreeze ioctl") we only allow the first freeze command to succeed. > > > > > > The kernel may decide that it is necessary to freeze a filesystem for > > > its own internal purposes, such as suspends in progress, filesystem fsck > > > activities, or quiescing a device prior to removal. Userspace thaw > > > commands must never break a kernel freeze, and kernel thaw commands > > > shouldn't undo userspace's freeze command. > > > > > > Introduce a couple of freeze holder flags and wire it into the > > > sb_writers state. One kernel and one userspace freeze are allowed to > > > coexist at the same time; the filesystem will not thaw until both are > > > lifted. > > > > > > Inspired-by: Luis Chamberlain <mcgrof@xxxxxxxxxx> > > > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > > > > Yes, this is exactly how I'd imagine it. Thanks for writing the patch! > > > > I'd just note that this would need rebasing on top of Luis' patches 1 and > > 2. Also: > > I started doing that, but I noticed that after patch 1, freeze_super no > longer leaves s_active elevated if the freeze is successful. The > callers drop the s_active ref that they themselves obtained, which > means that we've now changed that behavior, right? ioctl_fsfreeze now > does: > > if (!get_active_super(sb->s_bdev)) > return -ENOTTY; > > (Increase ref) > > /* Freeze */ > if (sb->s_op->freeze_super) > ret = sb->s_op->freeze_super(sb); > ret = freeze_super(sb); > > (Not sure why we can do both here?) > > deactivate_locked_super(sb); > > (Decrease ref; net change to s_active is zero) > > return ret; > > Luis hasn't responded to my question, so I stopped. Right. I kind of like how he's moved the locking out of freeze_super() / thaw_super() but I agree this semantic change is problematic and needs much more thought - e.g. with Luis' version the fs could be unmounted while frozen which is going to spectacularly deadlock. So yeah let's just ignore patch 1 for now. Longer term I believe if the sb is frozen by userspace, we should refuse to unmount it but that's a separate discussion for some other time. > > BTW, when reading this code, I've spotted attached cleanup opportunity but > > I'll queue that separately so that is JFYI. > > > > > +#define FREEZE_HOLDER_USERSPACE (1U << 1) /* userspace froze fs */ > > > +#define FREEZE_HOLDER_KERNEL (1U << 2) /* kernel froze fs */ > > > > Why not start from 1U << 0? And bonus points for using BIT() macro :). > > I didn't think filesystem code was supposed to be using stuff from > vdso.h... Hum, so BIT() macro is quite widely used in include/linux/ (from generic headers e.g. in trace.h). include/linux/bits.h seems to be the right include to use but I'm pretty sure include/linux/fs.h already gets this header through something. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR