This test case is not intended to pass with current btrfs code.

The problem of "destructive RMW" has affected btrfs since the very
beginning of btrfs RAID56.

The "destructive RMW" happens by:

- Do some sub-stripe writes into data stripe 1

- Corrupt the above written data stripe 1

- Do some sub-stripe writes into data stripe 2 of the same full stripe
  We need to do RMW to calculate a new P/Q stripe.

However btrfs RAID56 code has no way to determine whether the on-disk
data stripes are correct or not.
It just reads the corrupted data stripe 1, and uses it to calculate the
new P stripe.

Since data stripe 1 is already corrupted, the new P stripe (calculated
with the correct data stripe 2) can not rebuild the original data
stripe 1, making data stripe 1 unrecoverable.

The test case itself intentionally creates such a "destructive RMW" to
check if btrfs can handle it.

Unfortunately current btrfs code can not handle it at all, thus the test
case is always going to fail.

Thus the test case is here mostly to leave a warning sign for now, and
that's why it's only in the "raid" and "repair" groups.

It will only be moved to the "auto" and "quick" groups after the
upstream kernel has a way to solve it.

Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
---
 tests/btrfs/272     | 62 +++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/272.out |  5 ++++
 2 files changed, 67 insertions(+)
 create mode 100755 tests/btrfs/272
 create mode 100644 tests/btrfs/272.out

diff --git a/tests/btrfs/272 b/tests/btrfs/272
new file mode 100755
index 00000000..d4aa7737
--- /dev/null
+++ b/tests/btrfs/272
@@ -0,0 +1,62 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved.
+#
+# FS QA Test 272
+#
+# Test if btrfs RAID5 can detect corrupted data stripes before doing RMW.
+#
+# If not properly detected, this can make btrfs RAID56 use corrupted data
+# stripes to calculate new P/Q stripes, and result in unrecoverable corruption.
+#
+# Unfortunately such error detection is not implemented in btrfs RAID56 yet.
+#
+. ./common/preamble
+_begin_fstest raid repair
+
+# Import common functions.
+. ./common/btrfs
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_require_scratch_dev_pool 3
+_scratch_dev_pool_get 3
+
+# mkfs using RAID5 will cause a WARNING message, so redirect the output.
+_scratch_pool_mkfs "-d raid5 -m raid5" >> $seqres.full 2>&1
+
+_scratch_mount
+
+# All btrfs RAID profiles use 64K as stripe length, so this should
+# fill data stripe 1.
+$XFS_IO_PROG -f -c "pwrite -S 0x11 0 64K" $SCRATCH_MNT/file1 -c sync > /dev/null
+
+echo "=== MD5 before corruption ==="
+_md5_checksum $SCRATCH_MNT/file1
+
+logical=$(_btrfs_get_first_logical $SCRATCH_MNT/file1)
+physical=$(_btrfs_get_physical $logical 1)
+dev=$(_btrfs_get_device_path $logical 1)
+
+echo "=== Data stripe 1, logical=$logical dev=$dev physical=$physical ===" >> $seqres.full
+_scratch_unmount
+
+# Corrupt data stripe 1
+$XFS_IO_PROG -c "pwrite -S 0xff $physical 64K" $dev > /dev/null
+
+_scratch_mount
+
+# Do a new write into data stripe 2. This write will trigger RMW, which will
+# read data stripe 1 (already corrupted) to calculate the P stripe.
+$XFS_IO_PROG -f -c "pwrite -S 0x22 0 64K" $SCRATCH_MNT/file2 -c sync > /dev/null
+
+# Check if file1 (aka, data stripe 1) can still be recovered
+echo "=== MD5 after corruption and RMW ==="
+_md5_checksum $SCRATCH_MNT/file1
+
+_scratch_unmount
+
+# success, all done
+status=0
+exit
diff --git a/tests/btrfs/272.out b/tests/btrfs/272.out
new file mode 100644
index 00000000..b7bb02f4
--- /dev/null
+++ b/tests/btrfs/272.out
@@ -0,0 +1,5 @@
+QA output created by 272
+=== MD5 before corruption ===
+876f4f724f70c185824f120574658786
+=== MD5 after corruption and RMW ===
+876f4f724f70c185824f120574658786
-- 
2.37.2
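
For readers unfamiliar with why the RMW is "destructive", the parity
arithmetic can be sketched with a toy model. This is only an
illustration, not btrfs code: it assumes a 3-device RAID5 layout (two
data stripes plus P), lets a single byte stand in for each 64K stripe,
and reuses the 0x11/0xff/0x22 fill patterns from the test. All variable
names are hypothetical.

```shell
#!/bin/bash
# Toy model of "destructive RMW": one byte stands in for each 64K stripe.
# Assumed layout: data stripe 1 (D1), data stripe 2 (D2), parity (P = D1 ^ D2).

d1=$((0x11))            # D1 as originally written (file1)
d2=$((0x00))            # D2 not written yet
p=$((d1 ^ d2))          # P calculated from the good data

d1_disk=$((0xff))       # D1 gets corrupted on disk, P is untouched

# Before the RMW, D1 can still be rebuilt from D2 and P.
before=$((d2 ^ p))

# Sub-stripe write into D2. RMW reads D1 from disk without verifying it
# and recalculates P from the corrupted value.
d2=$((0x22))
p=$((d1_disk ^ d2))

# After the RMW, rebuilding D1 from D2 and P yields the corrupted value.
after=$((d2 ^ p))

printf 'rebuilt D1 before RMW: 0x%02x\n' "$before"
printf 'rebuilt D1 after RMW:  0x%02x\n' "$after"
```

In this model the rebuild returns 0x11 before the RMW and 0xff after
it, so the last good copy of data stripe 1 is lost once P has been
recalculated.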