On 2020/2/8 5:06 PM, Anand Jain wrote:
>
>
> On 2/8/20 7:28 AM, Qu Wenruo wrote:
>>
>>
>> On 2020/2/7 11:59 PM, Anand Jain wrote:
>>>
>>>
>>> On 7/2/20 8:15 PM, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2020/2/7 8:01 PM, Anand Jain wrote:
>>>>> On some systems btrfs/179 fails as the check finds that there is
>>>>> difference in the qgroup counts.
>>>>>
>>>>> By the async nature of qgroup tree scan, the latest qgroup counts
>>>>> at the time of umount might not be upto date,
>>>>
>>>> Yes, so far so good.
>>>>
>>>>> if it isn't then the check will report the difference in count.
>>>>> The difference in qgroup counts are anyway updated in the following
>>>>> mount, so it is not a real issue that this test case is trying to
>>>>> verify.
>>>>
>>>> No problem either.
>>>>
>>>>> So make sure the qgroup counts are updated before unmount happens
>>>>> and make the check happy.
>>>>
>>>> But the solution doesn't look correct to me.
>>>>
>>>> We should either make btrfs-check to handle such half-dropped case
>>>> better,
>>>
>>> Check is ok. The count as check counts matches with the count after
>>> the mount. So what is recorded in the qgroup is not upto date.
>>
>> Nope. Qgroup records what's in commit tree. For unmounted fs, there is
>> no difference in commit tree and current tree.
>>
>> Thus the qgroup scan in btrfs-progs is different from kernel.
>> Please go check how the btrfs-progs code to see how the difference comes.
>>
>>>
>>>> or find a way to wait for all subvolume drop to be finished in
>>>> test case.
>>>
>>> Yes this is one way. Just wait for few seconds will do, test passes.
>>> Do you know any better way?
>>
>> I didn't remember when, but it looks like `btrfs fi sync` used to wait
>> for snapshot drop.
>> But not now. If we have a way to wait for cleaner to finish, we can
>> solve it pretty easily.
>
> A sleep at the end of the test case also makes it count consistent.
> As the intention of the test case is to test for the hang, so sleep 5
> at the end of the test case is reasonable.

That looks like a valid workaround, although a hard-coded 5 seconds is
not that generic across test environments.

I really hope to find a stable way to wait for all subvolume drops to
finish, rather than relying on some hard-coded number.
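One possibility, if we want to avoid both the rescan and the sleep,
might be something along these lines. This is only a rough sketch, not
tested against this case; it assumes the installed btrfs-progs provides
`btrfs subvolume sync`, which is meant to block until deleted
subvolumes have been fully cleaned by the cleaner thread:

	# Sketch only: wait for every queued subvolume deletion to be
	# fully cleaned before the fs is unmounted and fscked, so the
	# qgroup numbers are stable.  With no subvolume id given,
	# "subvolume sync" waits for all currently deleted subvolumes.
	$BTRFS_UTIL_PROG subvolume sync $SCRATCH_MNT >> $seqres.full 2>&1

If that turns out not to cover everything, polling `btrfs subvolume
list -d` until it stops reporting deleted-but-not-yet-cleaned
subvolumes would be another option that does not depend on a magic
sleep value.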

Thanks,
Qu

>
> Thanks, Anand
>
>> Thanks,
>> Qu
>>
>>>
>>> Thanks, Anand
>>>
>>>> Papering the test by rescan is not a good idea at all.
>>>> If one day we really hit some qgroup accounting problem, this papering
>>>> way could hugely reduce the coverage.
>>>>
>>>
>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>
>>>>> Signed-off-by: Anand Jain <anand.jain@xxxxxxxxxx>
>>>>> ---
>>>>>   tests/btrfs/179 | 8 ++++++++
>>>>>   1 file changed, 8 insertions(+)
>>>>>
>>>>> diff --git a/tests/btrfs/179 b/tests/btrfs/179
>>>>> index 4a24ea419a7e..74e91841eaa6 100755
>>>>> --- a/tests/btrfs/179
>>>>> +++ b/tests/btrfs/179
>>>>> @@ -109,6 +109,14 @@ wait $snapshot_pid
>>>>>   kill $delete_pid
>>>>>   wait $delete_pid
>>>>> +# By the async nature of qgroup tree scan, the latest qgroup counts at the time
>>>>> +# of umount might not be upto date, if it isn't then the check will report the
>>>>> +# difference in count. The difference in qgroup counts are anyway updated in the
>>>>> +# following mount, so it is not a real issue that this test case is trying to
>>>>> +# verify. So make sure the qgroup counts are updated before unmount happens.
>>>>> +
>>>>> +$BTRFS_UTIL_PROG quota rescan -w $SCRATCH_MNT >> $seqres.full
>>>>> +
>>>>>   # success, all done
>>>>>   echo "Silence is golden"
>>>>>
>>>>
>>