Re: [PATCH v2 3/3] generic/470: add syncfs test

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> 
> 在 2017年12月7日,下午4:17,Amir Goldstein <amir73il@xxxxxxxxx> 写道:
> 
> On Thu, Dec 7, 2017 at 9:42 AM, Chengguang Xu <cgxu519@xxxxxxxxxx> wrote:
>> 在 2017年12月7日,下午3:13,Eryu Guan <eguan@xxxxxxxxxx> 写道:
>>> 
>>> On Thu, Dec 07, 2017 at 02:20:26PM +0800, Chengguang Xu wrote:
>>>>> 
>>>>> 在 2017年12月7日,下午1:44,Eryu Guan <eguan@xxxxxxxxxx> 写道:
>>>>> 
>>>>> On Thu, Dec 07, 2017 at 10:22:07AM +0800, Chengguang Xu wrote:
>>>>>> Inspired by syncfs bug of overlayfs which does not sync dirtyinodes in
>>>>>> underlying filesystem.
>>>>>> Run syncfs and shutdown filesystem(or underlying filesystem of overlayfs)
>>>>>> to check syncfs result.
>>>>>> 
>>>>>> Signed-off-by: Chengguang Xu <cgxu519@xxxxxxxxxx>
>>>>>> ---
>>>>>> 
>>>>>> Changes since v1:
>>>>>> Use fs shutdown and fssum to check syncfs result instead of
>>>>>> checking delalloc state of extents.
>>>>>> 
>>>>>> tests/generic/470     | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>> tests/generic/470.out |  2 ++
>>>>>> tests/generic/group   |  1 +
>>>>>> 3 files changed, 91 insertions(+)
>>>>>> create mode 100755 tests/generic/470
>>>>>> create mode 100644 tests/generic/470.out
>>>>>> 
>>>>>> diff --git a/tests/generic/470 b/tests/generic/470
>>>>>> new file mode 100755
>>>>>> index 0000000..b488747
>>>>>> --- /dev/null
>>>>>> +++ b/tests/generic/470
>>>>>> @@ -0,0 +1,88 @@
>>>>>> +#! /bin/bash
>>>>>> +# FS QA Test 470
>>>>>> +#
>>>>>> +# Inspired by syncfs bug of overlayfs which does not sync dirty inodes in
>>>>>> +# underlying filesystem.
>>>>> 
>>>>> Trailing whitespace in above line.
>>>>> 
>>>>>> +#
>>>>>> +# Run syncfs and shutdown filesystem(or underlying filesystem of overlayfs)
>>>>>> +# to check syncfs result.
>>>>>> +#
>>>>>> +# Test will be skipped if filesystem(or underlying filesystem of overlayfs)
>>>>>> +# does not support shutdown.
>>>>>> +#
>>>>>> +#-----------------------------------------------------------------------
>>>>>> +# Copyright (c) 2017 Chengguang Xu <cgxu519@xxxxxxxxxx>
>>>>>> +# All Rights Reserved.
>>>>>> +#
>>>>>> +# This program is free software; you can redistribute it and/or
>>>>>> +# modify it under the terms of the GNU General Public License as
>>>>>> +# published by the Free Software Foundation.
>>>>>> +#
>>>>>> +# This program is distributed in the hope that it would be useful,
>>>>>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>>>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>>>>> +# GNU General Public License for more details.
>>>>>> +#
>>>>>> +# You should have received a copy of the GNU General Public License
>>>>>> +# along with this program; if not, write the Free Software Foundation,
>>>>>> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
>>>>>> +#-----------------------------------------------------------------------
>>>>>> +#
>>>>>> +
>>>>>> +seq=`basename $0`
>>>>>> +seqres=$RESULT_DIR/$seq
>>>>>> +echo "QA output created by $seq"
>>>>>> +
>>>>>> +here=`pwd`
>>>>>> +tmp=/tmp/$$
>>>>>> +status=1
>>>>>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>>>>>> +
>>>>>> +_cleanup()
>>>>>> +{
>>>>>> +  cd /
>>>>>> +  rm -f $tmp.*
>>>>>> +}
>>>>>> +
>>>>>> +# get standard environment, filters and checks
>>>>>> +. ./common/rc
>>>>>> +. ./common/filter
>>>>>> +
>>>>>> +# remove previous $seqres.full before test
>>>>>> +rm -f $seqres.full
>>>>>> +
>>>>>> +# real QA test starts here
>>>>>> +
>>>>>> +_supported_fs generic
>>>>>> +_supported_os Linux
>>>>>> +_require_test
>>>>>> +_require_fssum
>>>>>> +_require_scratch
>>>>>> +_require_scratch_shutdown
>>>>>> +_require_xfs_io_command "syncfs"
>>>>>> +
>>>>>> +
>>>>>> +FCNT=1000
>>>>>> +
>>>>>> +_scratch_mkfs >/dev/null 2>&1
>>>>>> +_scratch_mount
>>>>>> +
>>>>>> +# In order to mitigate interference of write-back,
>>>>>> +# create many files for test.
>>>>> 
>>>>> Sorry, I still don't understand how writeback could interfere this test
>>>>> from this comment, what happens if we don't create such files? Why
>>>>> writing files starting from offset 1k?
>>>> 
>>>> There is no explicit explanation how writeback interferes this case,
>>>> also there are many triggers make writeback starts syncing work.
>>>> I just want to increase hit ratio of failure by make many test files,
>>>> as many as possible, but it’s also limited by time and other resource.
>>>> 
>>>> The reason of offset 1k is same as above, compare to test a normal file,
>>>> I think file with hole can increase failure ratio sometimes.
>>> 
>>> Yeah, increasing the reproducibility would be a good reason too. Do you
>>> happen to tune the number of files to see if 1000 is a good fit? e.g.
>>> with 100 files test reproduced the overlay bug 20% of times, with 1000
>>> files the reproducibility increased to 80%, etc. And the hole in the
>>> beginning too, what's the actual impact on the reproducibility?
>>> 
>>> And you're right about the test time, usually we want to balance between
>>> test time and reproducibility too, so we need to tune and measure the
>>> numbers like test files, loop counts etc.
>>> 
>>> I think these are all good comments for test :)
>> 
>> I didn’t do much accurate testing about reproducibility, and also don’t have
>> a plan to do that. Actually in my testing ENV, the BUG is always reproducible
>> even for only a few of test files.
> 
> What is the underlying fs you are testing with?

I usually use xfs as underlying.

> 
>> 1000 is just my definition for
>> quite many files, and it’s really hard and meaningless to guess what number is
>> best suit for variety of testing ENVs. So if you think the number 1000 is
>> improper number, then I can modify it to right number that you think. Hole is
>> the same.
>> 
> 
> I agree with Eryu that you should not invent numbers, unless test completes
> in a few seconds and reproduces reliably - then you can invent numbers...
> 
> But this got me thinking about the details of overlayfs syncfs bug.
> Overlayfs syncfs *will* actually call underlying fs syncfs (I fixed that)
> but *will not* flush dirty inode data. What that means depends on
> the underlying fs. For ext4 with default jounal=ordered, syncfs will
> commit uncommitted journal transactions to disk, that will force data writeback
> for all inodes, whose *metadata* is modified in uncommitted transactions.
> 
> Ted, please correct me if I am wrong.
> 
> For xfs, situation is a bit different, create operations are also delayed,
> so your test could fail on overlayfs over xfs more easily.
> 
> I think that means that if you create the files and write them in the same
> transaction, syncfs *will* actually sync on inodes data, so it is anyway
> only the very last files that you write that won't be flushed no matter how
> many files you will write.
> What you should try to do to increase the changes of the bug on more fs:
> - create new files and truncate them to final size but leaving them sparse
> - sync
> - buffered write to all files
> - syncfs
> - shutdown
> - cycle_mount
> 
> The rules for choosing the right amount of files/data should be:
> - After first sync, all inodes in the system is not dirty
> - During the time it takes to write all files, flusher thread may kick in
>  (default 30 seconds), flush data of your tests files and interfere with
>  the test. Lets rule out another user doing sync, because most test
>  machines are VMs that just run the tests
> - If one loop iteration of the test above takes less than X second for a
>  valid fs on a slow disk, then if flusher thread does interfere, its
>  interference will be over in less then X second
>  (because we are the only ones dirtying data on the system).
> - If X < 10 seconds, then running 2 or 3 loops of the iteration should
>  be enough to guaranty that we run at least one iteration without
>  interference.

I did more detail tests for three different data modes of ext4 and found
the overlayfs syncfs bug is reproducible on data=ordered and data=writeback,
but on data=journal mode, data is flushed and correct. I only wrote only a few words
to a single file and the bug is always reproducible on my test environment. 

For writeback interferences, AFAIK, from dirty ratio and period.
If we drop all dirty caches & sync before the test, I think we can
avoid interference from it.

So if we don’t have anything else to interference test result, 
I just want to modify to write a small single file as test target.

Am I missing anything?

> 
> So I think if you actually write as little files and data as possible,
> but loop several times.
> IMO, tune the parameters, so an iteration takes ~1 second on slow disk
> and run 3 iterations.
> I have a test setup with spinning disk I can run your test on xfs/ext4 if
> you like more samples then your own test environment.
> 
> And please write comments about those heuristics...
> 
> Thanks,
> Amir.

--
To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux