Re: [PATCH] btrfs/276: allow a slight increase in the number of extents

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]



On Thu, Aug 3, 2023 at 10:02 AM Filipe Manana <fdmanana@xxxxxxxxxx> wrote:
>
> On Thu, Aug 3, 2023 at 2:30 AM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
> >
> >
> >
> > On 2023/8/3 01:28, Filipe Manana wrote:
> > > On Wed, Aug 2, 2023 at 12:18 PM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
> > >>
> > >>
> > >>
> > >> On 2023/8/2 18:36, Qu Wenruo wrote:
> > >>>
> > >>>
> > >>> On 2023/8/2 18:23, Filipe Manana wrote:
> > >>>> On Tue, Aug 1, 2023 at 8:16 AM Qu Wenruo <wqu@xxxxxxxx> wrote:
> > >>>>>
> > >>>>> [BUG]
> > >>>>> Sometimes test case btrfs/276 would fail with extra number of extents:
> > >>>>>
> > >>>>>       - output mismatch (see /opt/xfstests/results//btrfs/276.out.bad)
> > >>>>>       --- tests/btrfs/276.out     2023-07-19 07:24:07.000000000 +0000
> > >>>>>       +++ /opt/xfstests/results//btrfs/276.out.bad        2023-07-28
> > >>>>> 04:15:06.223985372 +0000
> > >>>>>       @@ -1,16 +1,16 @@
> > >>>>>        QA output created by 276
> > >>>>>        wrote 17179869184/17179869184 bytes at offset 0
> > >>>>>        XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > >>>>>       -Number of non-shared extents in the whole file: 131072
> > >>>>>       +Number of non-shared extents in the whole file: 131082
> > >>>>>        Create a snapshot of 'SCRATCH_MNT' in 'SCRATCH_MNT/snap'
> > >>>>>       -Number of shared extents in the whole file: 131072
> > >>>>>       ...
> > >>>>>       (Run 'diff -u /opt/xfstests/tests/btrfs/276.out
> > >>>>> /opt/xfstests/results//btrfs/276.out.bad'  to see the entire diff)
> > >>>>>
> > >>>>> [CAUSE]
> > >>>>> The test case uses golden output to record the number of total extents
> > >>>>> of a 16G file.
> > >>>>>
> > >>>>> This is not reliable as we can have writeback happen halfway, resulting
> > >>>>> smaller extents thus slightly more extents.
> > >>>>>
> > >>>>> With a VM with 4G memory, I have a chance around 1/10 hitting this
> > >>>>> false alert.
> > >>>>>
> > >>>>> [FIX]
> > >>>>> Instead of using golden output, we allow a slight (5%) float in the
> > >>>>> number of extents, and move the 131072 (and 131072 - 16) from golden
> > >>>>> output, so even if we have a slightly more extents, we can still pass
> > >>>>> the test.
> > >>>>>
> > >>>>> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
> > >>>>> ---
> > >>>>>    tests/btrfs/276     | 41 ++++++++++++++++++++++++++++++++++++-----
> > >>>>>    tests/btrfs/276.out |  4 ----
> > >>>>>    2 files changed, 36 insertions(+), 9 deletions(-)
> > >>>>>
> > >>>>> diff --git a/tests/btrfs/276 b/tests/btrfs/276
> > >>>>> index 944b0c8f..a63b28bb 100755
> > >>>>> --- a/tests/btrfs/276
> > >>>>> +++ b/tests/btrfs/276
> > >>>>> @@ -65,10 +65,17 @@ count_not_shared_extents()
> > >>>>>
> > >>>>>    # Create a 16G file as that results in 131072 extents, all with a
> > >>>>> size of 128K
> > >>>>>    # (due to compression), and a fs tree with a height of 3 (root node
> > >>>>> at level 2).
> > >>>>> +#
> > >>>>> +# But due to writeback can happen halfway, we may have slightly more
> > >>>>> extents
> > >>>>> +# than 128K, so we allow 5% increase in the number of extents.
> > >>>>> +#
> > >>>>>    # We want to verify later that fiemap correctly reports the
> > >>>>> sharedness of each
> > >>>>>    # extent, even when it needs to switch from one leaf to the next
> > >>>>> one and from a
> > >>>>>    # node at level 1 to the next node at level 1.
> > >>>>>    #
> > >>>>> +nr_extents_lower=$((128 * 1024))
> > >>>>> +nr_extents_upper=$((128 * 1024 + 128 * 1024 / 20))
> > >>>>> +
> > >>>>>    $XFS_IO_PROG -f -c "pwrite -b 8M 0 16G" $SCRATCH_MNT/foo |
> > >>>>> _filter_xfs_io
> > >>>>
> > >>>> Does adding '-s' (fsync after every write) to the $XFS_IO_PROG fixes
> > >>>> the issue?
> > >>>> On my test vm, it doesn't increase runtime by that much (16 to 23
> > >>>> seconds).
> > >>>
> > >>> This indeed is much better, without dirty pages pressure we can go the
> > >>> old golden output.
> > >>
> > >> Unfortunately I still have a very low chance (~1/20) to hit 1~3 more
> > >> extent than the golden output.
> > >>
> > >> There are still extra things like the commit intervals to let us to
> > >> writeback halfway.
> > >>
> > >> The best situation would be direct IO, but unfortunately direct IO
> > >> doesn't support compression, thus the resulted file would lead to merged
> > >> fiemap results.
> > >
> > > The compression is not needed.
> > > I wrote the test to use compression because it makes creating a file with
> > > thousands of extents very fast and using very little space.
> > >
> > > The goal is really to have many file extent items spanning multiple leaves and
> > > to have a root at level 2 (or higher), to verify the sharedness
> > > detection is correct
> > > for subtrees.
> > >
> > >>
> > >> The other solution is to write between two files using direct IO, to
> > >> make each extent inside the same file not continuous with each other.
> > >>
> > >> But that would lead to at least 512M * 2, and we also need to do the
> > >> same interleaved writes again for the 1M writes.
> > >>
> > >> Any ideas would be appreciated.
> > >
> > > See if this works:
> > >
> > > https://pastebin.com/raw/QnNtSrDP
> >
> > This works fine as after 20 runs I still didn't hit any output mismatch.
> >
> > Although you can further improve it by using direct IO and much smaller
> > blocksize (4K), which can further reduce the size and completely
> > eliminate the possibility of writeback halfway.
> > (This would require 4K sector size though)
>
> The 64K was chosen precisely so that it works with any sector size.
> I don't know why you worry about the size - it's a ~250M file, quite small.
>
> >
> > In that case, you can even reuse the old 131072 number, as we only need
> > around 512M (file size would be 1G) for data.
> >
> >
> > Another thing is, if tree level 3 (root node level 2) is needed, we can
> > even further reduce the file size by requiring 4K nodesize, which can
> > further speed up the test, without the need for xattr.
>
> Yes, but I prefer to keep the default node size, so that it works on
> any environment and
> configuration, therefore increasing test coverage.
>
> >
> > >
> > > It accomplishes the same goals and it's still fast (about 23 seconds
> > > on my non-debug test vm, before it was about 16 seconds).
> >
> > The modified version is already fast enough even with a KASAN enabled
> > kernel, only 60s compared to the old 120+s.
> >
> > Mind to send the modification as a proper fix? That's a much improved
> > version compared to mine.
>
> I will, thanks.

For cross reference, here it is:

https://lore.kernel.org/linux-btrfs/c54bf70be6bbeefe440ea5b1341495b16803455c.1691058187.git.fdmanana@xxxxxxxx/

>
> >
> > Thanks,
> > Qu
> >
> > >
> > > Thanks.
> > >
> > >>
> > >> Thanks,
> > >> Qu
> > >>>
> > >>> Thanks,
> > >>> Qu
> > >>>>
> > >>>> I'd rather do that so that we can be sure fiemap is working correctly
> > >>>> and not returning more extents than there really are - this approach
> > >>>> of allowing a bit more allows for that type of bug to be unnoticed,
> > >>>> plus that little bit more might not be always enough (depending on
> > >>>> available rm, writeback settings, etc).
> > >>>>
> > >>>> Thanks.
> > >>>>
> > >>>>>
> > >>>>>    # Sync to flush delalloc and commit the current transaction, so
> > >>>>> fiemap will see
> > >>>>> @@ -76,13 +83,22 @@ $XFS_IO_PROG -f -c "pwrite -b 8M 0 16G"
> > >>>>> $SCRATCH_MNT/foo | _filter_xfs_io
> > >>>>>    sync
> > >>>>>
> > >>>>>    # All extents should be reported as non shared (131072 extents).
> > >>>>> -echo "Number of non-shared extents in the whole file:
> > >>>>> $(count_not_shared_extents)"
> > >>>>> +found1=$(count_not_shared_extents)
> > >>>>> +echo "Number of non-shared extents in the whole file: ${found1}" >>
> > >>>>> $seqres.full
> > >>>>> +
> > >>>>> +if [ $found1 -lt $nr_extents_lower -o $found1 -gt $nr_extents_upper
> > >>>>> ]; then
> > >>>>> +       echo "unexpected initial number of extents, has $found1
> > >>>>> expect [$nr_extents_lower, $nr_extents_upper]"
> > >>>>> +fi
> > >>>>>
> > >>>>>    # Creating a snapshot.
> > >>>>>    $BTRFS_UTIL_PROG subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap
> > >>>>> | _filter_scratch
> > >>>>>
> > >>>>>    # We have a snapshot, so now all extents should be reported as shared.
> > >>>>> -echo "Number of shared extents in the whole file:
> > >>>>> $(count_shared_extents)"
> > >>>>> +found2=$(count_shared_extents)
> > >>>>> +echo "Number of shared extents in the whole file: ${found2}" >>
> > >>>>> $seqres.full
> > >>>>> +if [ $found2 -ne $found1 ]; then
> > >>>>> +       echo "unexpected shared extents, has $found2 expect $found1"
> > >>>>> +fi
> > >>>>>
> > >>>>>    # Now COW two file ranges, of 1M each, in the snapshot's file.
> > >>>>>    # So 16 extents should become non-shared after this.
> > >>>>> @@ -97,8 +113,18 @@ sync
> > >>>>>
> > >>>>>    # Now we should have 16 non-shared extents and 131056 (131072 - 16)
> > >>>>> shared
> > >>>>>    # extents.
> > >>>>> -echo "Number of non-shared extents in the whole file:
> > >>>>> $(count_not_shared_extents)"
> > >>>>> -echo "Number of shared extents in the whole file:
> > >>>>> $(count_shared_extents)"
> > >>>>> +found3=$(count_not_shared_extents)
> > >>>>> +found4=$(count_shared_extents)
> > >>>>> +echo "Number of non-shared extents in the whole file: ${found3}"
> > >>>>> +echo "Number of shared extents in the whole file: ${found4}" >>
> > >>>>> $seqres.full
> > >>>>> +
> > >>>>> +if [ $found3 != 16 ]; then
> > >>>>> +       echo "Unexpected number of non-shared extents, has $found3
> > >>>>> expect 16"
> > >>>>> +fi
> > >>>>> +
> > >>>>> +if [ $found4 != $(( $found1 - $found3 )) ]; then
> > >>>>> +       echo "Unexpected number of shared extents, has $found4 expect
> > >>>>> $(( $found1 - $found3 ))"
> > >>>>> +fi
> > >>>>>
> > >>>>>    # Check that the non-shared extents are indeed in the expected file
> > >>>>> ranges (each
> > >>>>>    # with 8 extents).
> > >>>>> @@ -117,7 +143,12 @@ _scratch_remount commit=1
> > >>>>>    sleep 1.1
> > >>>>>
> > >>>>>    # Now all extents should be reported as not shared (131072 extents).
> > >>>>> -echo "Number of non-shared extents in the whole file:
> > >>>>> $(count_not_shared_extents)"
> > >>>>> +found5=$(count_not_shared_extents)
> > >>>>> +echo "Number of non-shared extents in the whole file: ${found5}" >>
> > >>>>> $seqres.full
> > >>>>> +
> > >>>>> +if [ $found5 != $found1 ]; then
> > >>>>> +       echo "Unexpected final number of non-shared extents, has
> > >>>>> $found5 expect $found1"
> > >>>>> +fi
> > >>>>>
> > >>>>>    # success, all done
> > >>>>>    status=0
> > >>>>> diff --git a/tests/btrfs/276.out b/tests/btrfs/276.out
> > >>>>> index 3bf5a5e6..e318c2e9 100644
> > >>>>> --- a/tests/btrfs/276.out
> > >>>>> +++ b/tests/btrfs/276.out
> > >>>>> @@ -1,16 +1,12 @@
> > >>>>>    QA output created by 276
> > >>>>>    wrote 17179869184/17179869184 bytes at offset 0
> > >>>>>    XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > >>>>> -Number of non-shared extents in the whole file: 131072
> > >>>>>    Create a snapshot of 'SCRATCH_MNT' in 'SCRATCH_MNT/snap'
> > >>>>> -Number of shared extents in the whole file: 131072
> > >>>>>    wrote 1048576/1048576 bytes at offset 8388608
> > >>>>>    XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > >>>>>    wrote 1048576/1048576 bytes at offset 12884901888
> > >>>>>    XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > >>>>>    Number of non-shared extents in the whole file: 16
> > >>>>> -Number of shared extents in the whole file: 131056
> > >>>>>    Number of non-shared extents in range [8M, 9M): 8
> > >>>>>    Number of non-shared extents in range [12G, 12G + 1M): 8
> > >>>>>    Delete subvolume (commit): 'SCRATCH_MNT/snap'
> > >>>>> -Number of non-shared extents in the whole file: 131072
> > >>>>> --
> > >>>>> 2.41.0
> > >>>>>




[Index of Archives]     [Linux Filesystems Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux