Re: [PATCH] fstests: btrfs: Test if btrfs will panic when mounting a partially balanced fs

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]




On 2019/5/22 下午5:32, Filipe Manana wrote:
> On Wed, May 22, 2019 at 9:40 AM Qu Wenruo <wqu@xxxxxxxx> wrote:
>>
>> There are two regressions that when mounting a partially balance btrfs
>> after v5.1 kernel:
>> - Kernel NULL pointer dereference at mount time
>> - Kernel BUG_ON() just after mount
>>
>> The kernel fixes are:
>> "btrfs: qgroup: Check if @bg is NULL to avoid NULL pointer
>>  dereference"
>> "btrfs: reloc: Also queue orphan reloc tree for cleanup to
>>  avoid BUG_ON()"
>>
>> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
>> ---
>>  tests/btrfs/188     | 94 +++++++++++++++++++++++++++++++++++++++++++++
>>  tests/btrfs/188.out |  2 +
>>  tests/btrfs/group   |  1 +
>>  3 files changed, 97 insertions(+)
>>  create mode 100755 tests/btrfs/188
>>  create mode 100644 tests/btrfs/188.out
>>
>> diff --git a/tests/btrfs/188 b/tests/btrfs/188
>> new file mode 100755
>> index 00000000..f43be007
>> --- /dev/null
>> +++ b/tests/btrfs/188
>> @@ -0,0 +1,94 @@
>> +#! /bin/bash
>> +# SPDX-License-Identifier: GPL-2.0
>> +# Copyright (c) 2019 SUSE Linux Products GmbH.  All Rights Reserved.
>> +#
>> +# FS QA Test 188
>> +#
>> +# Test if btrfs mount will hit the following bugs when mounting
>> +# a fs going through partial balance:
>> +# - NULL pointer dereference
>> +# - Kernel BUG_ON()
> 
> I would make the description be closer to what the test is - a general
> test to validate that balance and qgroups work correctly when balance
> needs to be resumed on mount.
> You can leave those specific problems in the change log.
> 
>> +#
>> +seq=`basename $0`
>> +seqres=$RESULT_DIR/$seq
>> +echo "QA output created by $seq"
>> +
>> +here=`pwd`
>> +tmp=/tmp/$$
>> +status=1       # failure is the default!
>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>> +
>> +_cleanup()
>> +{
>> +       cd /
>> +       rm -f $tmp.*
>> +}
>> +
>> +# get standard environment, filters and checks
>> +. ./common/rc
>> +. ./common/filter
>> +. ./common/dmlogwrites
>> +
>> +# remove previous $seqres.full before test
>> +rm -f $seqres.full
>> +
>> +# real QA test starts here
>> +
>> +# Modify as appropriate.
>> +_supported_fs btrfs
>> +_supported_os Linux
>> +_require_scratch
>> +# and we need extra device as log device
>> +_require_log_writes
>> +
>> +nr_files=512                           # enough metadata to bump tree height
>> +file_size=2048                         # small enough to be inlined
>> +
>> +_log_writes_init $SCRATCH_DEV
>> +_log_writes_mkfs >> $seqres.full 2>&1
>> +
>> +_log_writes_mount
>> +$BTRFS_UTIL_PROG quota enable $SCRATCH_MNT >> $seqres.full
>> +$BTRFS_UTIL_PROG quota rescan -w $SCRATCH_MNT >> $seqres.full
>> +
>> +# Create enough metadata for later balance
>> +for ((i = 0; i < $nr_files; i++)); do
>> +       _pwrite_byte 0xcd 0 $file_size $SCRATCH_MNT/file_$i > /dev/null
>> +done
>> +
>> +# Ensure we write all data/metadata back to disk so that later
>> +# balance will do real I/O
> 
> I don't understand this. Real I/O? Do we have any fake I/O? What is it?

I mean to avoid things like delayed inode where we only have dirty pages
but haven't written them onto disk.

In this case we want some metadata space to be really used. Dirty page
caches can't do that.

What's the proper expression for this?

> 
>> +sync
>> +
>> +# Balance metadata so we will have at least one transaction committed with
>> +# valid reloc tree, and hopefully an orphan reloc tree.
>> +$BTRFS_UTIL_PROG balance start -f -m $SCRATCH_MNT >> $seqres.full
>> +_log_writes_unmount
>> +_log_writes_remove
>> +
>> +cur=$(_log_writes_find_next_fua 0)
>> +echo "cur=$cur" >> $seqres.full
>> +while [ ! -z "$cur" ]; do
>> +       _log_writes_replay_log_range $cur $SCRATCH_DEV >> $seqref.full
>> +
>> +       # If the fs contains valid reloc tree and kernel is not patched,
>> +       # we'll hit a NULL pointer dereference
>> +       # Or if it contains orphan reloc tree and kernel is unpatched,
>> +       # we'll hit a BUG_ON()
> 
> # Test that no crashes happen or any other kind of failure.
> 
>> +       _scratch_mount
>> +       _scratch_unmount
>> +
>> +       # Don't trigger fsck here, as relocation get paused,
>> +       # at that transistent state, qgroup number may differ
>> +       # and cause false alert.
>> +
>> +       prev=$cur
>> +       cur=$(_log_writes_find_next_fua $(($cur + 1)))
>> +       [ -z "$cur" ] && break
>> +done
> 
> After the balance finishes, can we verify that qgroup values are correct?

Isn't  that done automatically when the test is finished?
As i'm using _require_scratch, not _require_scratch_no_check.

> 
>> +
>> +echo "Silence is golden"
>> +
>> +# success, all done
>> +status=0
>> +exit
>> diff --git a/tests/btrfs/188.out b/tests/btrfs/188.out
>> new file mode 100644
>> index 00000000..6f23fda0
>> --- /dev/null
>> +++ b/tests/btrfs/188.out
>> @@ -0,0 +1,2 @@
>> +QA output created by 188
>> +Silence is golden
>> diff --git a/tests/btrfs/group b/tests/btrfs/group
>> index 44ee0dd9..16a7c31e 100644
>> --- a/tests/btrfs/group
>> +++ b/tests/btrfs/group
>> @@ -190,3 +190,4 @@
>>  185 volume
>>  186 auto quick send volume
>>  187 auto send dedupe clone balance
>> +188 auto quick replay
> 
> "balance" and "qgroup" groups as well

Forgot that.

Thanks for comment,
Qu

> 
> Thanks.
>> --
>> 2.21.0
>>
> 
> 

Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux