Re: [PATCH] btrfs/266: test case enhancement to cover more possible failures






On 07/06/2023 15:39, Qu Wenruo wrote:


On 2023/6/7 09:52, Qu Wenruo wrote:


On 2023/6/7 08:13, Anand Jain wrote:


  It is failing on sectorsize 64k.

That's what I'm investigating.

And the failure is random: if you run the test more times it will pass (the
failure rate is between 1/5 and 1/3 in my case).
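
For anyone trying to reproduce an intermittent failure like this, one rough way to estimate the failure rate is to run the test in a loop and count the non-zero exits. The sketch below is only an illustration: `count_failures` is a hypothetical helper, not part of fstests, and it assumes the exit status of the command reflects pass or fail.

```shell
# Hypothetical helper (not part of fstests): run a command N times and
# print how many of the runs exited with a non-zero status.
count_failures()
{
	runs=$1
	shift
	failed=0
	i=0
	while [ "$i" -lt "$runs" ]; do
		# Discard the output; only the exit status is counted.
		"$@" >/dev/null 2>&1 || failed=$((failed + 1))
		i=$((i + 1))
	done
	echo "$failed"
}

# Example (assumes the current directory is an fstests checkout):
#   count_failures 20 ./check btrfs/266
```

With a failure rate somewhere between 1/5 and 1/3, a few dozen runs should be enough to see the problem several times.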

And to my surprise, this is in fact not a bug in btrfs, but more likely
a bug in drop_caches.

I added several trace printk() calls to __btrfs_submit_bio(),
btrfs_check_read_bio(), and __end_bio_extent_readpage() to follow the
repair workflow.

It turns out that when the test fails, at least one mirror is not read from
disk at all, but served directly from the page cache.
Thus it is no wonder the data is not repaired: that mirror is never
properly read.


The failure is inconsistent on my system too. Does the test fail depending on which mirror becomes the latest_bdev for the metadata?

Thanks, Anand


I'll start a new thread on this particular problem.

Thanks,
Qu

Thanks,
Qu

---------
btrfs/266 2s ... - output mismatch (see /xfstests-dev/results//btrfs/266.out.bad)
     --- tests/btrfs/266.out    2023-06-06 20:02:48.900915702 -0400
     +++ /xfstests-dev/results//btrfs/266.out.bad    2023-06-06 20:02:56.665554779 -0400
     @@ -19,11 +19,11 @@
        Physical offset + 64K:
      XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa  ................
        Physical offset + 128K:
     -XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa  ................
     +XXXXXXXX:  bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
-------

Thanks, Anand


On 06/06/2023 18:30, Qu Wenruo wrote:
[BACKGROUND]
Recently I have been debugging a random failure of btrfs/266 with larger page
sizes (64K page size, with either 64K or 4K sector size).

While testing, I found that the test case itself can be enhanced further
to provide better coverage and make debugging easier.

[ENHANCEMENT]

- Ensure every 64K block only has one good mirror
   The initial layout does not push hard enough: some ranges have 2 good
   mirrors while others have only one.

- Simplify the golden output
   The current golden output contains 512 bytes of output for the beginning
   of each mirror.

   The 512 bytes of output are both redundant and not comprehensive
   enough (see the next item).

   This patch removes the redundancy by outputting only a single
   line for 16 bytes.

- Add extra output for all the 3 64K blocks
   Each 64K of the involved file now has only one good mirror, and they
   are all on different devices.
   Thus only checking the beginning of the first 64K block is not good
   enough.

   This patch improves on this by outputting the first 16 bytes of all
   3 64K blocks on each device.

- Add a final safety net to catch unexpected corruption
   If there is some unexpected corruption after the first 16 bytes of any
   64K block, we can still detect it using "btrfs check
   --check-data-csum", which acts as an offline scrub.
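
The "one good mirror per 64K block" property can be sanity-checked with a little arithmetic. The following is a minimal sketch, not part of the patch: `bad_blocks` and `good_mirrors_for_block` are hypothetical names, and the block numbers are derived by hand from the pwrite offsets and lengths the test uses to corrupt each mirror.

```shell
# Corrupted 64K blocks per mirror: one word per mirror, blocks separated
# by commas. Derived by hand from the pwrite calls in the test:
#   mirror 1: physical1, 128K                              -> blocks 0,1
#   mirror 2: physical2 + 64K, 128K                        -> blocks 1,2
#   mirror 3: physical3, 64K and physical3 + 128K, 64K     -> blocks 0,2
bad_blocks="0,1 1,2 0,2"

# Print how many mirrors still hold good data for the given block.
good_mirrors_for_block()
{
	block=$1
	good=0
	for mirror in $bad_blocks; do
		case ",$mirror," in
		*",$block,"*) ;;                # this mirror is corrupted here
		*) good=$((good + 1)) ;;        # this mirror is intact here
		esac
	done
	echo "$good"
}

for b in 0 1 2; do
	echo "block $b: $(good_mirrors_for_block "$b") good mirror(s)"
done
```

Each of the three blocks should report exactly one good mirror, and the good copy sits on a different device for each block.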

Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
---
  tests/btrfs/266     |  59 ++++++++++++++++++++----
  tests/btrfs/266.out | 109 ++++++++------------------------------------
  2 files changed, 68 insertions(+), 100 deletions(-)

diff --git a/tests/btrfs/266 b/tests/btrfs/266
index 42aff7c0..894c5c6e 100755
--- a/tests/btrfs/266
+++ b/tests/btrfs/266
@@ -25,7 +25,7 @@ _require_odirect
  _require_non_zoned_device "${SCRATCH_DEV}"
  _scratch_dev_pool_get 3
-# step 1, create a raid1 btrfs which contains one 128k file.
+# step 1, create a raid1 btrfs which contains one 192k file.
  echo "step 1......mkfs.btrfs"
  mkfs_opts="-d raid1c3 -b 1G"
@@ -33,7 +33,7 @@ _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
  _scratch_mount
-$XFS_IO_PROG -f -d -c "pwrite -S 0xaa -b 256K 0 256K" \
+$XFS_IO_PROG -f -d -c "pwrite -S 0xaa -b 192K 0 192K" \
      "$SCRATCH_MNT/foobar" | \
      _filter_xfs_io_offset
@@ -56,6 +56,13 @@ devpath3=$(_btrfs_get_device_path ${logical} 3)
  _scratch_unmount
+# We corrupt the mirrors so that every 64K block only has one
+# good mirror. (X = corruption)
+#
+#             0       64K     128K    192K
+# Mirror 1    |XXXXXXXXXXXXXXX|       |
+# Mirror 2    |       |XXXXXXXXXXXXXXX|
+# Mirror 3    |XXXXXXX|       |XXXXXXX|
  $XFS_IO_PROG -d -c "pwrite -S 0xbd -b 64K $physical3 64K" \
      $devpath3 > /dev/null
@@ -65,7 +72,7 @@ $XFS_IO_PROG -d -c "pwrite -S 0xba -b 64K $physical1 128K" \
  $XFS_IO_PROG -d -c "pwrite -S 0xbb -b 64K $((physical2 + 65536)) 128K" \
      $devpath2 > /dev/null
-$XFS_IO_PROG -d -c "pwrite -S 0xbc -b 64K $((physical3 + (2 * 65536))) 128K"  \
+$XFS_IO_PROG -d -c "pwrite -S 0xbc -b 64K $((physical3 + (2 * 65536))) 64K"  \
      $devpath3 > /dev/null
  _scratch_mount
@@ -73,19 +80,53 @@ _scratch_mount
  # step 3, 128k dio read (this read can repair bad copy)
  echo "step 3......repair the bad copy"
-_btrfs_buffered_read_on_mirror 0 3 "$SCRATCH_MNT/foobar" 0 256K
-_btrfs_buffered_read_on_mirror 1 3 "$SCRATCH_MNT/foobar" 0 256K
-_btrfs_buffered_read_on_mirror 2 3 "$SCRATCH_MNT/foobar" 0 256K
+_btrfs_buffered_read_on_mirror 0 3 "$SCRATCH_MNT/foobar" 0 192K
+_btrfs_buffered_read_on_mirror 1 3 "$SCRATCH_MNT/foobar" 0 192K
+_btrfs_buffered_read_on_mirror 2 3 "$SCRATCH_MNT/foobar" 0 192K
  _scratch_unmount
  echo "step 4......check if the repair worked"
-$XFS_IO_PROG -d -c "pread -v -b 512 $physical1 512" $devpath1 |\
+echo "Dev 1:"
+echo "  Physical offset + 0:"
+$XFS_IO_PROG -c "pread -qv $physical1 16" $devpath1 |\
      _filter_xfs_io_offset
-$XFS_IO_PROG -d -c "pread -v -b 512 $physical2 512" $devpath2 |\
+echo "  Physical offset + 64K:"
+$XFS_IO_PROG -c "pread -qv $((physical1 + 65536)) 16" $devpath1 |\
      _filter_xfs_io_offset
-$XFS_IO_PROG -d -c "pread -v -b 512 $physical3 512" $devpath3 |\
+echo "  Physical offset + 128K:"
+$XFS_IO_PROG -c "pread -qv $((physical1 + 131072)) 16" $devpath1 |\
      _filter_xfs_io_offset
+echo
+
+echo "Dev 2:"
+echo "  Physical offset + 0:"
+$XFS_IO_PROG -c "pread -qv $physical2 16" $devpath2 |\
+    _filter_xfs_io_offset
+echo "  Physical offset + 64K:"
+$XFS_IO_PROG -c "pread -qv $((physical2 + 65536)) 16" $devpath2 |\
+    _filter_xfs_io_offset
+echo "  Physical offset + 128K:"
+$XFS_IO_PROG -c "pread -qv $((physical2 + 131072)) 16" $devpath2 |\
+    _filter_xfs_io_offset
+echo
+
+echo "Dev 3:"
+echo "  Physical offset + 0:"
+$XFS_IO_PROG -c "pread -v $physical3 16" $devpath3 |\
+    _filter_xfs_io_offset
+echo "  Physical offset + 64K:"
+$XFS_IO_PROG -c "pread -v $((physical3 + 65536)) 16" $devpath3 |\
+    _filter_xfs_io_offset
+echo "  Physical offset + 128K:"
+$XFS_IO_PROG -c "pread -v $((physical3 + 131072)) 16" $devpath3 |\
+    _filter_xfs_io_offset
+
+# Final step to use btrfs check to verify the csum of all mirrors.
+$BTRFS_UTIL_PROG check --check-data-csum $SCRATCH_DEV >> $seqres.full 2>&1
+if [ $? -ne 0 ]; then
+    echo "btrfs check found some data csum mismatch"
+fi
  _scratch_dev_pool_put
  # success, all done
diff --git a/tests/btrfs/266.out b/tests/btrfs/266.out
index fcf2f5b8..305e9c83 100644
--- a/tests/btrfs/266.out
+++ b/tests/btrfs/266.out
@@ -1,109 +1,36 @@
  QA output created by 266
  step 1......mkfs.btrfs
-wrote 262144/262144 bytes
+wrote 196608/196608 bytes
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  step 2......corrupt file extent
  step 3......repair the bad copy
  step 4......check if the repair worked
+Dev 1:
+  Physical offset + 0:
  XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
+  Physical offset + 64K:
  XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
+  Physical offset + 128K:
  XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
+
+Dev 2:
+  Physical offset + 0:
  XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
+  Physical offset + 64K:
  XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
+  Physical offset + 128K:
  XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
+
+Dev 3:
+  Physical offset + 0:
  XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-read 512/512 bytes
+read 16/16 bytes
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+  Physical offset + 64K:
  XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-read 512/512 bytes
+read 16/16 bytes
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+  Physical offset + 128K:
  XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-XXXXXXXX:  aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa aa
................
-read 512/512 bytes
+read 16/16 bytes
  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
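
The nine per-block checks in step 4 all follow one pattern: physical offset plus a multiple of 65536, reading 16 bytes. As a rough illustration of that arithmetic only, the sketch below prints the xfs_io commands instead of running them. `print_block_checks` is a hypothetical helper, and the offset 1048576 and the device path /dev/sdX are made-up placeholders for the values the test looks up at runtime.

```shell
# Hypothetical helper: print (rather than run) the xfs_io commands that
# dump the first 16 bytes of each 64K block of one mirror.
#   $1 = device index used in the label
#   $2 = physical offset of the extent on that device, in bytes
#   $3 = device path
print_block_checks()
{
	dev_index=$1
	physical=$2
	devpath=$3

	echo "Dev $dev_index:"
	for block in 0 1 2; do
		offset=$((physical + block * 65536))
		label="$((block * 64))K"
		# Use a bare "0" for the first block, as the test output does.
		[ "$block" -eq 0 ] && label=0
		echo "  Physical offset + $label:"
		echo "  xfs_io -c \"pread -qv $offset 16\" $devpath"
	done
}

# Placeholder values, for illustration only.
print_block_checks 1 1048576 /dev/sdX
```

Printing the commands keeps the sketch runnable without scratch devices. A real test would execute them and pipe the output through _filter_xfs_io_offset, as the patch does.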



