Re: [PATCH] btrfs/011: handle finished replace properly

Filipe Manana <fdmanana@xxxxxxxxxx> · Mon, 10 Jan 2022 13:54:07 +0000

On Mon, Jan 10, 2022 at 1:02 PM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
>
>
>
> On 2022/1/10 20:52, Filipe Manana wrote:
> > On Mon, Jan 10, 2022 at 12:44 PM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
> >>
> >>
> >>
> >> On 2022/1/10 19:57, Filipe Manana wrote:
> >>> On Mon, Jan 10, 2022 at 07:28:48PM +0800, Qu Wenruo wrote:
> >>>> [BUG]
> >>>> When running btrfs/011 inside VM which has unsafe cache set for its
> >>>> devices, and the host have enough memory to cache all the IO:
> >>>
> >>> Btw, I use the same setup and the test never fails for me.
> >>> It's quite of a strech assuming that using unsafe caching and having
> >>> enough memory are the only reasons for the failure.
> >>>
> >>> I use a debug kernel with plenty of heavy debug features enabled,
> >>> so that may be one reason why I can't trigger it.
> >>
> >> I guess that's the reason.
> >>
> >> For most cases I only enable light-weight debug features for regular runs.
> >> The heavy ones like KASAN/kmemleak are reserved for more tricky cases.
> >>
> >>>
> >>>>
> >>>> btrfs/011 98s ... [failed, exit status 1]- output mismatch
> >>>>       --- tests/btrfs/011.out 2019-10-22 15:18:13.962298674 +0800
> >>>>       +++ /xfstests-dev/results//btrfs/011.out.bad    2022-01-10 19:12:14.683333251 +0800
> >>>>       @@ -1,3 +1,4 @@
> >>>>        QA output created by 011
> >>>>        *** test btrfs replace
> >>>>       -*** done
> >>>>       +failed: '/usr/bin/btrfs replace cancel /mnt/scratch'
> >>>>       +(see /xfstests-dev/results//btrfs/011.full for details)
> >>>>       ...
> >>>> Ran: btrfs/011
> >>>> Failures: btrfs/011
> >>>> Failed 1 of 1 tests
> >>>>
> >>>> [CAUSE]
> >>>> Although commit fa85aa64 ("btrfs/011: Fill the fs to ensure we
> >>>> have enough data for dev-replace") tries to address the problem by
> >>>> filling the fs with extra content, there is still no guarantee that 2
> >>>> seconds of IO still needs 2 seconds to finish.
> >>>>
> >>>> Thus even we tried our best to make sure the replace will take 2
> >>>> seconds, it can still finish faster than 2 seconds.
> >>>>
> >>>> And just to mention how fast the test finishes, after the fix, the test
> >>>> takes around 90~100 seconds to finish.
> >>>> While on real-hardware it can take over 1000 seconds.
> >>>>
> >>>> [FIX]
> >>>> Instead of further enlarging the IO, here we just accept the fact that
> >>>> replace can finish faster than our expectation, and continue the test.
> >>>
> >>> If I'm reading this, and the code, correctly this means we end up never
> >>> testing that the replace cancel feature ends up being exercised and
> >>> while a replace is in progress. That's far from ideal, losing test
> >>> coverage.
> >>
> >> Yep, you're completely right.
> >>
> >> In fact for my environment, only around 10% runs really utilized the
> >> cancel path, the remaining 90% go finished routine.
> >>
> >> But...
> >>
> >>>
> >>> Josef sent a patch for this last month:
> >>>
> >>> https://lore.kernel.org/linux-btrfs/01796d6bcec40ae80b5af3269e60a66cd4b89262.1638382763.git.josef@xxxxxxxxxxxxxx/
> >>>
> >>> Don't know why it wasn't merged. I agree that changing timings in the
> >>> test is not ideal and always prone to still fail on some setups, but
> >>> seems more acceptable to me rather than losing test coverage.
> >>
> >> My concern is, it's just going to be another whac-a-mole game.
> >>
> >> With more RAM in the host, and further optimization, I'm not sure if 10
> >> seconds is enough.
> >> Furthermore 10 seconds may be too long for certain environment, to fill
> >> the whole test filesystem and cause other false alerts.
> >>
> >> As a safenet, we have dedicated cancel workload test in btrfs/212, and
> >
> > btrfs/212 is about balance, not device replace.
>
> Oh, my bad.
>
> Then I guess it's time for us to have dedicated replace and cancel test
> case.
>
> >
> >> IIRC for most (if not all) bugs exposed in btrfs/011, it's in the
> >> regular replace path, not the cancel path.
> >
> > Even if we have had few bugs in the replace cancel patch, compared to
> > the regular replace path,
> > that's not an excuse to remove or reduce replace cancel coverage.
> >
> >>
> >> Thus I want to end the whac-a-hole game once and for all, even it will
> >> drop the coverage for super fast setup.
> >
> > If everyone starts having a setup that is faster, then we will end up
> > not getting replace cancel covered in the long run,
> > and increasing the chances of finding out regressions only after
> > kernels are released, by users.
> >
> > As I said earlier, I agree that the whac-a-mole style of change sent
> > by Josef is far from ideal, but at least it doesn't lose test
> > coverage.
>
> It still doesn't solve the possible false alerts, it's doing masking,
> just a different way.
>
> But that masking will eventually hit ENOSPC and cause a different false
> alerts.
>
>
> If we have dedicated replace cancel tests, can we remove the false-alert
> prone cancel test in btrfs/011?

In the end it's the same problem however, making sure that by the time
the cancel is requested, the replace operation is still in progress.

I don't see how making that in a separate test case will be more
reliable, unless you're considering something like dm delay (and in
that case why can't it be made in btrfs/011).

Thanks.

>
> Thanks,
> Qu
>
> >
> > Thanks.
> >
> >
> >
> >>
> >> Thanks,
> >> Qu
> >>
> >>>
> >>> Thanks.
> >>>
> >>>>
> >>>> One thing to notice is, since the replace finished, we need to replace
> >>>> back the device, or later fsck will be executed on blank device, and
> >>>> cause false alert.
> >>>>
> >>>> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
> >>>> ---
> >>>>    tests/btrfs/011 | 15 +++++++++++++--
> >>>>    1 file changed, 13 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/tests/btrfs/011 b/tests/btrfs/011
> >>>> index b4673341..aae89696 100755
> >>>> --- a/tests/btrfs/011
> >>>> +++ b/tests/btrfs/011
> >>>> @@ -171,13 +171,24 @@ btrfs_replace_test()
> >>>>               # background the replace operation (no '-B' option given)
> >>>>               _run_btrfs_util_prog replace start -f $replace_options $source_dev $target_dev $SCRATCH_MNT
> >>>>               sleep $wait_time
> >>>> -            _run_btrfs_util_prog replace cancel $SCRATCH_MNT
> >>>> +            $BTRFS_UTIL_PROG replace cancel $SCRATCH_MNT 2>&1 >> $seqres.full
> >>>>
> >>>>               # 'replace status' waits for the replace operation to finish
> >>>>               # before the status is printed
> >>>>               $BTRFS_UTIL_PROG replace status $SCRATCH_MNT > $tmp.tmp 2>&1
> >>>>               cat $tmp.tmp >> $seqres.full
> >>>> -            grep -q canceled $tmp.tmp || _fail "btrfs replace status (canceled) failed"
> >>>> +
> >>>> +            # There is no guarantee we canceled the replace, it can finish
> >>>> +            if grep -q 'finished' $tmp.tmp ; then
> >>>> +                    # The replace finished, we need to replace it back or
> >>>> +                    # later fsck will report error as $SCRATCH_DEV is now
> >>>> +                    # blank
> >>>> +                    $BTRFS_UTIL_PROG replace start -Bf $target_dev \
> >>>> +                            $source_dev $SCRATCH_MNT > /dev/null
> >>>> +            else
> >>>> +                    grep -q 'canceled' $tmp.tmp || _fail \
> >>>> +                            "btrfs replace status (canceled ) failed"
> >>>> +            fi
> >>>>       else
> >>>>               if [ "${quick}Q" = "thoroughQ" ]; then
> >>>>                       # The thorough test runs around 2 * $wait_time seconds.
> >>>> --
> >>>> 2.34.1
> >>>>