From: Christian Brauner <brauner@xxxxxxxxxx>

commit 2ae4db5647d807efb6a87c09efaa6d1db9c905d7 upstream.

The block device may have been frozen before it was claimed by a
filesystem. Concurrently another process might try to mount that
frozen block device and has temporarily claimed the block device for
that purpose causing a concurrent fs_bdev_thaw() to end up here. The
mounter is already about to abort mounting because they still saw an
elevated bdev->bd_fsfreeze_count so get_bdev_super() will return
NULL in that case.

For example, P1 calls dm_suspend() which calls into bdev_freeze() before
the block device has been claimed by the filesystem. This brings
bdev->bd_fsfreeze_count to 1 and no call into fs_bdev_freeze() is
required.

Now P2 tries to mount that frozen block device. It claims it and checks
bdev->bd_fsfreeze_count. As it's elevated it aborts mounting.

In the meantime P3 called dm_resume(). P3 sees that the block device is
already claimed by a filesystem and calls into fs_bdev_thaw().

P3 takes a passive reference and realizes that the filesystem isn't
ready yet. P3 puts itself to sleep to wait for the filesystem to become
ready.

P2 now puts the last active reference to the filesystem and marks it as
dying. P3 gets woken, sees that the filesystem is dying and
get_bdev_super() fails.

Fixes: 49ef8832fb1a ("bdev: implement freeze and thaw holder operations")
Cc: <stable@xxxxxxxxxxxxxxx>
Reported-by: Theodore Ts'o <tytso@xxxxxxx>
Link: https://lore.kernel.org/r/20240611085210.GA1838544@xxxxxxx
Link: https://lore.kernel.org/r/20240613-lackmantel-einsehen-90f0d727358d@brauner
Reviewed-by: Darrick J. Wong <djwong@xxxxxxxxxx>
Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
 fs/super.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/fs/super.c
+++ b/fs/super.c
@@ -1501,8 +1501,17 @@ static int fs_bdev_thaw(struct block_dev
 
 	lockdep_assert_held(&bdev->bd_fsfreeze_mutex);
 
+	/*
+	 * The block device may have been frozen before it was claimed by a
+	 * filesystem. Concurrently another process might try to mount that
+	 * frozen block device and has temporarily claimed the block device for
+	 * that purpose causing a concurrent fs_bdev_thaw() to end up here. The
+	 * mounter is already about to abort mounting because they still saw an
+	 * elevanted bdev->bd_fsfreeze_count so get_bdev_super() will return
+	 * NULL in that case.
+	 */
 	sb = get_bdev_super(bdev);
-	if (WARN_ON_ONCE(!sb))
+	if (!sb)
 		return -EINVAL;
 
 	if (sb->s_op->thaw_super)


Patches currently in stable-queue which might be from brauner@xxxxxxxxxx are

queue-6.9/selftests-harness-fix-tests-timeout-and-race-condition.patch
queue-6.9/mm-optimize-the-redundant-loop-of-mm_update_owner_next.patch
queue-6.9/filelock-remove-locks-reliably-when-fcntl-close-race-is-detected.patch
queue-6.9/fsnotify-do-not-generate-events-for-o_path-file-descriptors.patch
queue-6.9/fs-don-t-misleadingly-warn-during-thaw-operations.patch
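
The P1/P2/P3 interleaving from the commit message can be replayed in a small userspace model. This is only a sketch, not kernel code: every `model_*` name and struct below is hypothetical, the steps run sequentially rather than concurrently (so P3's sleep and wake-up is represented simply by running its thaw after P2 has aborted), and the -EBUSY returned by the aborted mount is a choice made for this model. It illustrates why a NULL superblock in the thaw path is an expected outcome and not something to warn about.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's types. */
struct model_sb {
	int active_refs;	/* active references held by the mounter */
	bool born;		/* set once the mount has succeeded */
	bool dying;		/* set when the last active reference is dropped */
};

struct model_bdev {
	int fsfreeze_count;	/* stands in for bdev->bd_fsfreeze_count */
	struct model_sb *holder;	/* non-NULL while a filesystem claims the device */
};

/* P1: freeze the device before any filesystem has claimed it. */
static void model_bdev_freeze(struct model_bdev *bdev)
{
	bdev->fsfreeze_count++;
}

/* P2, step 1: temporarily claim the device for mounting. */
static void model_claim(struct model_bdev *bdev, struct model_sb *sb)
{
	sb->active_refs = 1;
	sb->born = false;
	sb->dying = false;
	bdev->holder = sb;
}

/* P2, step 2: abort the mount if the freeze count is elevated. */
static int model_finish_mount(struct model_bdev *bdev)
{
	struct model_sb *sb = bdev->holder;

	if (bdev->fsfreeze_count == 0) {
		sb->born = true;
		return 0;
	}
	/* Drop the last active reference and mark the superblock dying. */
	if (--sb->active_refs == 0)
		sb->dying = true;
	return -EBUSY;	/* model's choice of error code */
}

/* Stand-in for get_bdev_super(): fails for a dying or unborn superblock. */
static struct model_sb *model_get_bdev_super(struct model_bdev *bdev)
{
	struct model_sb *sb = bdev->holder;

	if (!sb || sb->dying || !sb->born)
		return NULL;
	return sb;
}

/* P3: the thaw path after the patch, no warning on a NULL superblock. */
static int model_fs_bdev_thaw(struct model_bdev *bdev)
{
	struct model_sb *sb = model_get_bdev_super(bdev);

	if (!sb)
		return -EINVAL;
	return 0;
}

/* Replays P1, P2, P3 in order and returns what P3's thaw sees. */
static int simulate_thaw_race(void)
{
	struct model_bdev bdev = { 0 };
	struct model_sb sb = { 0 };

	model_bdev_freeze(&bdev);		/* P1 */
	model_claim(&bdev, &sb);		/* P2 claims the device */
	(void)model_finish_mount(&bdev);	/* P2 aborts, sb is dying */
	return model_fs_bdev_thaw(&bdev);	/* P3 woken up */
}
```

Calling `simulate_thaw_race()` from a `main()` returns the error P3 would see. In this model the thaw fails quietly with -EINVAL, which mirrors the behaviour the patch keeps while dropping the WARN_ON_ONCE().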