Re: [PATCH] generic: check if one fs can detect damage at/after fs thaw

Qu Wenruo <quwenruo.btrfs@xxxxxxx> · Thu, 20 Oct 2022 06:52:36 +0800

On 2022/10/19 23:36, Darrick J. Wong wrote:
On Wed, Oct 19, 2022 at 01:29:55PM +0800, Qu Wenruo wrote:
[BACKGROUND]
There is bug report from btrfs mailing list that, hiberation can allow

"hibernation".

one to modify the frozen filesystem unexpectedly (using another OS).
(https://lore.kernel.org/linux-btrfs/83bf3b4b-7f4c-387a-b286-9251e3991e34@xxxxxxxxxxxx/)

Later btrfs adds the check to make sure the fs is not changed
unexpectedly, to prevent corruption from happening.

[TESTCASE]
Here the new test case will create a basic filesystem, fill it with
something by using fsstress, then sync the fs, and finally freeze the fs.

Then corrupt the whole fs by overwriting the block device with 0xcd
(default seed from xfs_io pwrite command).

Finally we thaw the fs, and try if we can create a new file.

for EXT4, it will detect the corruption at touch time, causing -EUCLEAN.

Heh, yikes.  That's pretty scary for ext4 since it still uses buffer
heads from the block device to read/store metadata and older kernels are
known to have crashing problems if (say) the feature bits in the primary
superblock get changed.

I wonder if this should force errors=remount-ro for ext4 since
errors=continue is dangerous and erorrs=panic will crash the test
machine.

For Btrfs, it will detect the corruption at thaw time, marking the
fs RO immediately, and later touch will return -EROFS.

What /does/ btrfs check, specifically?

- Read sb without using cache

- The same mount time sanity checks on the superblock
  Which already implies an fsid check.

- Extra generation check
  To make sure no one has touched out cake.

 Reading this makes me wonder if
xfs shouldn't re-read its primary super on thaw to check that nobody ran
us over with a backhoe, though that wouldn't help us in the hibernation
case.  (Or does it?  Is userspace/systemd finally smart enough to freeze
filesystems?)

I doubt if userspace/systemd is that smart, because the error report is
running not-that-old distro.

Especially for hibernation there is really no way for anyone to know if
our cakes are touched.


For XFS, it will detect the corruption at touch time, return -EUCLEAN.
(Without the cache drop, XFS seems to be very happy using the cache info
to do the work without any error though.)

Yep.

Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
---
  tests/generic/702     | 61 +++++++++++++++++++++++++++++++++++++++++++
  tests/generic/702.out |  2 ++
  2 files changed, 63 insertions(+)
  create mode 100755 tests/generic/702
  create mode 100644 tests/generic/702.out

diff --git a/tests/generic/702 b/tests/generic/702
new file mode 100755
index 00000000..fc3624e1
--- /dev/null
+++ b/tests/generic/702
@@ -0,0 +1,61 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved.
+#
+# FS QA Test 702
+#
+# Test if the filesystem can detect the underlying disk has changed at
+# thaw time.
+#
+. ./common/preamble
+. ./common/filter
+_begin_fstest freeze quick
+
+# real QA test starts here
+
+_supported_fs generic
+_fixed_by_kernel_commit a05d3c915314 \
+	"btrfs: check superblock to ensure the fs was not modified at thaw time"

Hmmm, it's not very useful for a test failure on (say) xfs spitting
out a message about how this "may" get fixed with a btrfs patch.  How
about:

$FSTYP = btrfs && _fixed_by_kernel_commit a05d3c915314 \
	"btrfs: check superbloc..."

That sounds pretty good.


+
+# We will corrupt the device completely, thus should not check it after the test.
+_require_scratch_nocheck
+_require_freeze
+
+# Limit the fs to 512M so we won't waste too much time screwing it up later.
+_scratch_mkfs_sized $((512 * 1024 * 1024)) >> $seqres.full 2>&1
+_scratch_mount
+
+# Populate the fs with something.
+$FSSTRESS_PROG -n 500 -d $SCRATCH_MNT >> $seqres.full
+
+# Sync to make sure no dirty journal
+sync
+
+# Drop all cache, so later write will need to read from disk, increasing
+# the chance of detecting the corruption.
+echo 3 > /proc/sys/vm/drop_caches
+
+$XFS_IO_PROG -x -c "freeze" $SCRATCH_MNT
+
+# Now screw up the block device
+$XFS_IO_PROG -f -c "pwrite 0 512M" -c sync $SCRATCH_DEV >> $seqres.full

directio and a larger buffer size to speed this up? e.g.

$XFS_IO_PROG -d -c 'pwrite -b 1m 0 512M' -c sync $SCRATCH_DEV

I guess no need for directio especially we're doing a sync after the write.
Although larger blocksize may only help a little considering by default
it's already buffered write.


+
+# Thaw the fs, it may or may not report error, we will check it manually later.
+$XFS_IO_PROG -x -c "thaw" $SCRATCH_MNT

I'm a little surprised you don't check for btrfs returning an error
here...?

Great you have asked!

This is the special pitfall related to thaw error handling.

If we return an error for .unfreeze_fs hook, the VFS treats it as we
failed to thaw the fs, and will still consider the fs frozen.

Thus for now, btrfs only output error message into dmesg during thaw,
but always return 0 to workaround it.

We may want a better way for .unfreeze_fs hook to distinguish between
"something really went wrong, but please consider it unfreezed" and
"nope, please keep it frozen".

Thanks,
Qu


+# If the fs detects something wrong, it should trigger error now.
+# We don't use the error message as golden output, as btrfs and ext4 use
+# different error number for different reasons.
+# (btrfs detects the change immediately at thaw time and mark the fs RO, thus
+#  touch returns -EROFS, while ext4 detects the change at journal write time,
+#  returning -EUCLEAN).
+touch $SCRATCH_MNT/foobar >>$seqres.full 2>&1
+if [ $? -eq 0 ]; then
+	echo "Failed to detect corrupted fs"
+else
+	echo "Detected corrupted fs (expected)"
+fi

But otherwise this test looks reasonable so far.

--D

+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/702.out b/tests/generic/702.out
new file mode 100644
index 00000000..c29311ff
--- /dev/null
+++ b/tests/generic/702.out
@@ -0,0 +1,2 @@
+QA output created by 702
+Detected corrupted fs (expected)
--
2.38.0