On Tue, Aug 27, 2024 at 1:14 AM Qu Wenruo <wqu@xxxxxxxx> wrote:
>
> [BUG]
> There is a bug report that KASAN gets triggered when:
>
> - A read bio needs to be split
>   This can happen for profiles with stripes, including
>   RAID0/RAID10/RAID5/RAID6.
>
> - An error happens before submitting the new split bio
>   This includes:
>   * chunk map lookup failure
>   * data csum lookup failure
>
> Then during the error path of btrfs_submit_chunk(), the original bio is
> fully freed before the submitted range has a chance to call its endio
> function, resulting in a use-after-free bug.
>
> [NEW TEST CASE]
> Introduce a new test case to verify the specific behavior by:
>
> - Create a btrfs with enough csum leaves with data RAID0 profile
>   To bump the csum tree level, use the minimal nodesize possible (4K).
>   Writing 32M data which needs at least 8 leaves for data checksum
>
>   RAID0 profile ensures the data read bios will get split.
>
> - Find the last csum tree leaf and corrupt it
>
> - Read the data many times until we trigger the bug or exit gracefully
>   With an x86_64 VM with KASAN enabled, it can trigger the KASAN report in
>   just 4 iterations (the default iteration number is 32).
>
> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>

Fine now, thanks.

Reviewed-by: Filipe Manana <fdmanana@xxxxxxxx>

> ---
> Changelog:
> v3:
> - Remove the unrelated btrfs/125 references
>   There is nothing specific to RAID56, it's just a coincidence that
>   btrfs/125 leads us to the bug.
>   Since we have a more comprehensive understanding of the bug, there is
>   no need to mention it at all.
>
> - More grammar fixes
> - Use proper _check_btrfs_raid_type() to verify raid0 support
> - Update the title to be more specific about the test case
> - Renumber to btrfs/321 to avoid conflicts with a new test case
> - Remove unnecessary 'sync' which is followed by unmount
> - Use full subcommand name "inspect-internal"
> - Explain why we want to fail early if hitting the bug
> - Remove unnecessary `_require_scratch` which is duplicated to
>   `_require_scratch_nocheck`
>
> v2:
> - Fix the wrong commit hash
>   The proper fix is not yet merged, the old hash is a place holder
>   copied from another test case but forgot to remove.
>
> - Minor wording update
>
> - Add to "dangerous" group
> ---
>  tests/btrfs/321     | 83 +++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/321.out |  2 ++
>  2 files changed, 85 insertions(+)
>  create mode 100755 tests/btrfs/321
>  create mode 100644 tests/btrfs/321.out
>
> diff --git a/tests/btrfs/321 b/tests/btrfs/321
> new file mode 100755
> index 000000000000..e30199daa0d0
> --- /dev/null
> +++ b/tests/btrfs/321
> @@ -0,0 +1,83 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2024 SUSE Linux Products GmbH. All Rights Reserved.
> +#
> +# FS QA Test 321
> +#
> +# Make sure there are no use-after-free, crashes, deadlocks etc, when reading data
> +# which has its data checksums in a corrupted csum tree block.
> +#
> +. ./common/preamble
> +_begin_fstest auto quick raid dangerous
> +
> +_require_scratch_nocheck
> +_require_scratch_dev_pool 2
> +
> +# Use RAID0 for data to get bio split according to stripe boundary.
> +# This is required to trigger the bug.
> +_require_btrfs_raid_type raid0
> +
> +# This test goes 4K sectorsize and 4K nodesize, so that we easily create
> +# higher level of csum tree.
> +_require_btrfs_support_sectorsize 4096
> +_require_btrfs_command inspect-internal dump-tree
> +
> +_fixed_by_kernel_commit xxxxxxxxxxxx \
> +	"btrfs: fix a use-after-free bug when hitting errors inside btrfs_submit_chunk()"
> +
> +# The bug itself has a race window, run this many times to ensure triggering.
> +# On an x86_64 VM with KASAN enabled, normally it is triggered before the 10th run.
> +iterations=32
> +
> +_scratch_pool_mkfs "-d raid0 -m single -n 4k -s 4k" >> $seqres.full 2>&1
> +# This test requires data checksum to trigger the bug.
> +_scratch_mount -o datasum,datacow
> +
> +# For the smallest csum size (CRC32C) it's 4 bytes per 4K, writing 32M data
> +# will need 32K data checksum at minimal, which is at least 8 leaves.
> +_pwrite_byte 0xef 0 32m "$SCRATCH_MNT/foobar" > /dev/null
> +_scratch_unmount
> +
> +
> +# Search for the last leaf of the csum tree, that will be the target to destroy.
> +$BTRFS_UTIL_PROG inspect-internal dump-tree -t 7 $SCRATCH_DEV >> $seqres.full
> +target_bytenr=$($BTRFS_UTIL_PROG inspect-internal dump-tree -t 7 $SCRATCH_DEV | grep "leaf.*flags" | sort | tail -n1 | cut -f2 -d\ )
> +
> +if [ -z "$target_bytenr" ]; then
> +	_fail "unable to locate the last csum tree leaf"
> +fi
> +
> +echo "bytenr of csum tree leaf to corrupt: $target_bytenr" >> $seqres.full
> +
> +# Corrupt that csum tree block.
> +physical=$(_btrfs_get_physical "$target_bytenr" 1)
> +dev=$(_btrfs_get_device_path "$target_bytenr" 1)
> +
> +echo "physical bytenr: $physical" >> $seqres.full
> +echo "physical device: $dev" >> $seqres.full
> +
> +_pwrite_byte 0x00 "$physical" 4 "$dev" > /dev/null
> +
> +for (( i = 0; i < $iterations; i++ )); do
> +	echo "=== run $i/$iterations ===" >> $seqres.full
> +	_scratch_mount -o ro
> +	# Since the data is on RAID0, read bios will be split at the stripe
> +	# (64K sized) boundary.
> +	# If csum lookup failed due to corrupted csum
> +	# tree, there is a race window that can lead to double bio freeing
> +	# (triggering KASAN at least).
> +	cat "$SCRATCH_MNT/foobar" &> /dev/null
> +	_scratch_unmount
> +
> +	# Instead of relying on the final _check_dmesg() to find errors,
> +	# error out immediately if KASAN is triggered.
> +	# As non-triggering runs will generate quite some error messages,
> +	# making investigation much harder.
> +	if _check_dmesg_for "BUG" ; then
> +		_fail "Critical error(s) found in dmesg"
> +	fi
> +done
> +
> +echo "Silence is golden"
> +
> +status=0
> +exit
> diff --git a/tests/btrfs/321.out b/tests/btrfs/321.out
> new file mode 100644
> index 000000000000..290a5eb31312
> --- /dev/null
> +++ b/tests/btrfs/321.out
> @@ -0,0 +1,2 @@
> +QA output created by 321
> +Silence is golden
> --
> 2.46.0
>
>
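
For reference, the "32M data needs at least 8 leaves" estimate in the test
comment can be checked with a few lines of shell arithmetic. This is only a
sketch and not part of the patch: the variable names are made up for
illustration, it assumes CRC32C (4 bytes of checksum per 4K sector), and it
ignores the leaf header and item overhead, so the number it computes is a
lower bound and the real csum tree will need somewhat more leaves.

```bash
#!/bin/bash
# Lower bound on the number of csum tree leaves for the test's data file.
# All names below are illustrative, not fstests or btrfs-progs variables.

data_bytes=$((32 * 1024 * 1024))   # 32M written by the test
sectorsize=4096                    # mkfs option: -s 4k
nodesize=4096                      # mkfs option: -n 4k
csum_size=4                        # assumption: CRC32C, 4 bytes per sector

# One checksum per data sector.
csum_bytes=$((data_bytes / sectorsize * csum_size))

# Round up, pretending a leaf could be filled entirely with checksum
# payload (no header, no item overhead), hence a lower bound only.
min_leaves=$(( (csum_bytes + nodesize - 1) / nodesize ))

echo "csum_bytes=$csum_bytes min_leaves=$min_leaves"
```

With the values above this prints "csum_bytes=32768 min_leaves=8", which
matches the 32K / 8 leaves figures in the comment.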