Re: [PATCH] fix writing to the filesystem after unmount

Christian Brauner <brauner@xxxxxxxxxx> · Wed, 6 Sep 2023 18:19:01 +0200

On Wed, Sep 06, 2023 at 06:01:06PM +0200, Mikulas Patocka wrote:
> 
> 
> On Wed, 6 Sep 2023, Christian Brauner wrote:
> 
> > > > IOW, you'd also hang on any umount of a bind-mount. IOW, every
> > > > single container making use of this filesystem via bind-mounts would
> > > > hang on umount and shutdown.
> > > 
> > > bind-mount doesn't modify "s->s_writers.frozen", so the patch does nothing 
> > > in this case. I tried unmounting bind-mounts and there was no deadlock.
> > 
> > With your patch what happens if you do the following?
> > 
> > #!/bin/sh -ex
> > modprobe brd rd_size=4194304
> > vgcreate vg /dev/ram0
> > lvcreate -L 16M -n lv vg
> > mkfs.ext4 /dev/vg/lv
> > 
> > mount -t ext4 /dev/vg/lv /mnt/test
> > mount --bind /mnt/test /opt
> > mount --make-private /opt
> > 
> > dmsetup suspend /dev/vg/lv
> > (sleep 1; dmsetup resume /dev/vg/lv) &
> > 
> > umount /opt # I'd expect this to hang
> > 
> > md5sum /dev/vg/lv
> > md5sum /dev/vg/lv
> > dmsetup remove_all
> > rmmod brd
> 
> "umount /opt" doesn't hang. It waits one second (until dmsetup resume is 
> called) and then proceeds.

So unless I'm really misreading the code - entirely possible - with
your patch the umount of the bind-mount now waits until the filesystem
is resumed. And if that's the case that's a bug.

If any umount should wait at all, then only the last one, the one that
destroys the superblock, should wait for the filesystem to become
unfrozen.

A bind-mount shouldn't as there are still active mounts of the
filesystem (e.g., /mnt/test).

So you should see this with (unless I really misread things):

#!/bin/sh -ex
modprobe brd rd_size=4194304
vgcreate vg /dev/ram0
lvcreate -L 16M -n lv vg
mkfs.ext4 /dev/vg/lv

mount -t ext4 /dev/vg/lv /mnt/test
mount --bind /mnt/test /opt
mount --make-private /opt

dmsetup suspend /dev/vg/lv

umount /opt # This will hang with your patch?
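
To make the "last umount" distinction concrete from userspace, here is a
rough sketch that counts how many mounts in the current mount namespace
share a device number, by reading the third field (major:minor) of
/proc/self/mountinfo. The helper names count_mounts and is_last_mount
are made up for illustration. This is only an approximation: the kernel
decides based on the superblock's active references, and mounts in other
namespaces are not visible here.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any existing tool.

# count_mounts MAJ:MIN [MOUNTINFO_FILE]
# Print the number of mountinfo entries whose third field (the
# major:minor of the mounted filesystem) equals MAJ:MIN.
count_mounts() {
    awk -v dev="$1" '$3 == dev { n++ } END { print n + 0 }' \
        "${2:-/proc/self/mountinfo}"
}

# is_last_mount MAJ:MIN [MOUNTINFO_FILE]
# Succeed if at most one mount of that device is visible, i.e. an
# umount would plausibly be the one that tears down the superblock.
is_last_mount() {
    [ "$(count_mounts "$@")" -le 1 ]
}
```

In the reproducer above, /mnt/test and /opt would both show the same
major:minor, so is_last_mount would fail for that device while /opt is
still mounted, which is the case where I'd expect umount not to wait.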

> 
> Then, it fails with "rmmod: ERROR: Module brd is in use" because the 
> script didn't unmount /mnt/test.
> 
> > > BTW. what do you think that unmount of a frozen filesystem should properly 
> > > do? Fail with -EBUSY? Or, unfreeze the filesystem and unmount it? Or 
> > > something else?
> > 
> > In my opinion we should refuse to unmount frozen filesystems and log an
> > error that the filesystem is frozen. Waiting forever isn't a good idea.
> 
> But LVM may freeze filesystems at any time, so umount would then fail
> with errors at random.

So? The alternative is that you might hang at any time.

> 
> > But this is a significant uapi change afaict so this would need to be
> > hidden behind a config option, a sysctl, or it would have to be a new
> > umount2() flag, e.g. MNT_UNFROZEN, which an administrator could pass to
> > refuse unmounting a frozen filesystem.
> 
> The kernel currently distinguishes between kernel-initiated freeze (that 
> is used by the XFS scrub) and userspace-initiated freeze (that is used by 
> the FIFREEZE ioctl and by device-mapper initiated freeze through 
> freeze_bdev).

Yes, I'm aware.

> 
> Perhaps we could distinguish between FIFREEZE-initiated freezes and 
> device-mapper initiated freezes as well. And we could change the logic to 
> return -EBUSY if the freeze was initiated by FIFREEZE and to wait for 
> unfreeze if it was initiated by the device-mapper.

For device mapper initiated freezes you can unfreeze independent of any
filesystem mountpoint via dm ioctls.
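
As a minimal sketch of what I mean, the resume is addressed to the dm
device itself, not to any mountpoint, exactly like the "dmsetup resume"
in the reproducer. The wrapper name resume_dm_device and the DMSETUP
override (so it can be dry-run with echo) are just for illustration.

```shell
#!/bin/sh
# DMSETUP can be overridden, e.g. DMSETUP=echo for a dry run.
DMSETUP=${DMSETUP:-dmsetup}

# resume_dm_device DEVICE
# Resume a suspended device-mapper device by name or path. No
# filesystem mountpoint is involved in the request.
resume_dm_device() {
    "$DMSETUP" resume "$1"
}
```

So "resume_dm_device /dev/vg/lv" works whether or not the filesystem on
top is still reachable through any mount.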