Re: [PATCH v2 2/2] fs: handle freezing from multiple devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat 04-11-23 15:00:13, Christian Brauner wrote:
> Before [1] freezing a filesystems through the block layer only worked
> for the main block device as the owning superblock of additional block
> devices could not be found. Any filesystem that made use of multiple
> block devices would only be freezable via it's main block device.
> 
> For example, consider xfs over device mapper with /dev/dm-0 as main
> block device and /dev/dm-1 as external log device. Two freeze requests
> before [1]:
> 
> (1) dmsetup suspend /dev/dm-0 on the main block device
> 
>     bdev_freeze(dm-0)
>     -> dm-0->bd_fsfreeze_count++
>     -> freeze_super(xfs-sb)
> 
>     The owning superblock is found and the filesystem gets frozen.
>     Returns 0.
> 
> (2) dmsetup suspend /dev/dm-1 on the log device
> 
>     bdev_freeze(dm-1)
>     -> dm-1->bd_fsfreeze_count++
> 
>     The owning superblock isn't found and only the block device freeze
>     count is incremented. Returns 0.
> 
> Two freeze requests after [1]:
> 
> (1') dmsetup suspend /dev/dm-0 on the main block device
> 
>     bdev_freeze(dm-0)
>     -> dm-0->bd_fsfreeze_count++
>     -> freeze_super(xfs-sb)
> 
>     The owning superblock is found and the filesystem gets frozen.
>     Returns 0.
> 
> (2') dmsetup suspend /dev/dm-1 on the log device
> 
>     bdev_freeze(dm-0)
>     -> dm-0->bd_fsfreeze_count++
>     -> freeze_super(xfs-sb)
> 
>     The owning superblock is found and the filesystem gets frozen.
>     Returns -EBUSY.
> 
> When (2') is called we initiate a freeze from another block device of
> the same superblock. So we increment the bd_fsfreeze_count for that
> additional block device. But we now also find the owning superblock for
> additional block devices and call freeze_super() again which reports
> -EBUSY.
> 
> This can be reproduced through xfstests via:
> 
>     mkfs.xfs -f -m crc=1,reflink=1,rmapbt=1, -i sparse=1 -lsize=1g,logdev=/dev/nvme1n1p4 /dev/nvme1n1p3
>     mkfs.xfs -f -m crc=1,reflink=1,rmapbt=1, -i sparse=1 -lsize=1g,logdev=/dev/nvme1n1p6 /dev/nvme1n1p5
> 
>     FSTYP=xfs
>     export TEST_DEV=/dev/nvme1n1p3
>     export TEST_DIR=/mnt/test
>     export TEST_LOGDEV=/dev/nvme1n1p4
>     export SCRATCH_DEV=/dev/nvme1n1p5
>     export SCRATCH_MNT=/mnt/scratch
>     export SCRATCH_LOGDEV=/dev/nvme1n1p6
>     export USE_EXTERNAL=yes
> 
>     sudo ./check generic/311
> 
> Current semantics allow two concurrent freezers: one initiated from
> userspace via FREEZE_HOLDER_USERSPACE and one initiated from the kernel
> via FREEZE_HOLDER_KERNEL. If there are multiple concurrent freeze
> requests from either FREEZE_HOLDER_USERSPACE or FREEZE_HOLDER_KERNEL
> -EBUSY is returned.
> 
> We need to preserve these semantics because as they are uapi via
> FIFREEZE and FITHAW ioctl()s. IOW, freezes don't nest for FIFREEZE and
> FITHAW. Other kernels consumers rely on non-nesting freezes as well.
> 
> With freezes initiated from the block layer freezes need to nest if the
> same superblock is frozen via multiple devices. So we need to start
> counting the number of freeze requests.
> 
> If FREEZE_HOLDER_BDEV is passed alongside FREEZE_HOLDER_KERNEL or
> FREEZE_HOLDER_USERSPACE we allow the caller to nest freeze calls.

FREEZE_HOLDER_BDEV should be FREEZE_MAY_NEST I guess.

> To accommodate the old semantics we split the freeze counter into two
> counting kernel initiated and userspace initiated freezes separately. We
> can then also stop recording FREEZE_HOLDER_* in struct sb_writers.
> 
> We also simplify freezing by making all concurrent freezers share a
> single active superblock reference count instead of having separate
> references for kernel and userspace. I don't see why we would need two
> active reference counts. Neither FREEZE_HOLDER_KERNEL nor
> FREEZE_HOLDER_USERSPACE can put the active reference as long as they are
> concurrent freezers anwyay. That was already true before we allowed
> nesting freezes.
> 
> Survives various fstests runs with different options including the
> reproducer, online scrub, and online repair, fsfreze, and so on. Also
> survives blktests.
> 
> Reported-by: Chandan Babu R <chandanbabu@xxxxxxxxxx>
> Link: https://lore.kernel.org/linux-block/87bkccnwxc.fsf@debian-BULLSEYE-live-builder-AMD64
> Fixes: [1]: bfac4176f2c4 ("bdev: implement freeze and thaw holder operations") # no backport needed
> Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>

Just one more typo fix below. Feel free to add:

Reviewed-by: Jan Kara <jack@xxxxxxx>

> @@ -1930,6 +2008,13 @@ static int wait_for_partially_frozen(struct super_block *sb)
>   * userspace can both hold a filesystem frozen.  The filesystem remains frozen
>   * until there are no kernel or userspace freezes in effect.
>   *
> + * A filesystem may hold multiple devices and thus a filesystems may be
> + * frozen through the block layer via multiple block devices. In this
> + * case the request is marked as being allowed to nest passig
							  ^^ passing

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux