On Thu, Aug 3, 2023 at 2:30 AM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote: > > > > On 2023/8/3 01:28, Filipe Manana wrote: > > On Wed, Aug 2, 2023 at 12:18 PM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote: > >> > >> > >> > >> On 2023/8/2 18:36, Qu Wenruo wrote: > >>> > >>> > >>> On 2023/8/2 18:23, Filipe Manana wrote: > >>>> On Tue, Aug 1, 2023 at 8:16 AM Qu Wenruo <wqu@xxxxxxxx> wrote: > >>>>> > >>>>> [BUG] > >>>>> Sometimes test case btrfs/276 would fail with extra number of extents: > >>>>> > >>>>> - output mismatch (see /opt/xfstests/results//btrfs/276.out.bad) > >>>>> --- tests/btrfs/276.out 2023-07-19 07:24:07.000000000 +0000 > >>>>> +++ /opt/xfstests/results//btrfs/276.out.bad 2023-07-28 > >>>>> 04:15:06.223985372 +0000 > >>>>> @@ -1,16 +1,16 @@ > >>>>> QA output created by 276 > >>>>> wrote 17179869184/17179869184 bytes at offset 0 > >>>>> XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > >>>>> -Number of non-shared extents in the whole file: 131072 > >>>>> +Number of non-shared extents in the whole file: 131082 > >>>>> Create a snapshot of 'SCRATCH_MNT' in 'SCRATCH_MNT/snap' > >>>>> -Number of shared extents in the whole file: 131072 > >>>>> ... > >>>>> (Run 'diff -u /opt/xfstests/tests/btrfs/276.out > >>>>> /opt/xfstests/results//btrfs/276.out.bad' to see the entire diff) > >>>>> > >>>>> [CAUSE] > >>>>> The test case uses golden output to record the number of total extents > >>>>> of a 16G file. > >>>>> > >>>>> This is not reliable as we can have writeback happen halfway, resulting > >>>>> smaller extents thus slightly more extents. > >>>>> > >>>>> With a VM with 4G memory, I have a chance around 1/10 hitting this > >>>>> false alert. > >>>>> > >>>>> [FIX] > >>>>> Instead of using golden output, we allow a slight (5%) float in the > >>>>> number of extents, and move the 131072 (and 131072 - 16) from golden > >>>>> output, so even if we have a slightly more extents, we can still pass > >>>>> the test. > >>>>> > >>>>> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx> > >>>>> --- > >>>>> tests/btrfs/276 | 41 ++++++++++++++++++++++++++++++++++++----- > >>>>> tests/btrfs/276.out | 4 ---- > >>>>> 2 files changed, 36 insertions(+), 9 deletions(-) > >>>>> > >>>>> diff --git a/tests/btrfs/276 b/tests/btrfs/276 > >>>>> index 944b0c8f..a63b28bb 100755 > >>>>> --- a/tests/btrfs/276 > >>>>> +++ b/tests/btrfs/276 > >>>>> @@ -65,10 +65,17 @@ count_not_shared_extents() > >>>>> > >>>>> # Create a 16G file as that results in 131072 extents, all with a > >>>>> size of 128K > >>>>> # (due to compression), and a fs tree with a height of 3 (root node > >>>>> at level 2). > >>>>> +# > >>>>> +# But due to writeback can happen halfway, we may have slightly more > >>>>> extents > >>>>> +# than 128K, so we allow 5% increase in the number of extents. > >>>>> +# > >>>>> # We want to verify later that fiemap correctly reports the > >>>>> sharedness of each > >>>>> # extent, even when it needs to switch from one leaf to the next > >>>>> one and from a > >>>>> # node at level 1 to the next node at level 1. > >>>>> # > >>>>> +nr_extents_lower=$((128 * 1024)) > >>>>> +nr_extents_upper=$((128 * 1024 + 128 * 1024 / 20)) > >>>>> + > >>>>> $XFS_IO_PROG -f -c "pwrite -b 8M 0 16G" $SCRATCH_MNT/foo | > >>>>> _filter_xfs_io > >>>> > >>>> Does adding '-s' (fsync after every write) to the $XFS_IO_PROG fixes > >>>> the issue? > >>>> On my test vm, it doesn't increase runtime by that much (16 to 23 > >>>> seconds). > >>> > >>> This indeed is much better, without dirty pages pressure we can go the > >>> old golden output. > >> > >> Unfortunately I still have a very low chance (~1/20) to hit 1~3 more > >> extent than the golden output. > >> > >> There are still extra things like the commit intervals to let us to > >> writeback halfway. > >> > >> The best situation would be direct IO, but unfortunately direct IO > >> doesn't support compression, thus the resulted file would lead to merged > >> fiemap results. > > > > The compression is not needed. > > I wrote the test to use compression because it makes creating a file with > > thousands of extents very fast and using very little space. > > > > The goal is really to have many file extent items spanning multiple leaves and > > to have a root at level 2 (or higher), to verify the sharedness > > detection is correct > > for subtrees. > > > >> > >> The other solution is to write between two files using direct IO, to > >> make each extent inside the same file not continuous with each other. > >> > >> But that would lead to at least 512M * 2, and we also need to do the > >> same interleaved writes again for the 1M writes. > >> > >> Any ideas would be appreciated. > > > > See if this works: > > > > https://pastebin.com/raw/QnNtSrDP > > This works fine as after 20 runs I still didn't hit any output mismatch. > > Although you can further improve it by using direct IO and much smaller > blocksize (4K), which can further reduce the size and completely > eliminate the possibility of writeback halfway. > (This would require 4K sector size though) The 64K was chosen precisely so that it works with any sector size. I don't know why you worry about the size - it's a ~250M file, quite small. > > In that case, you can even reuse the old 131072 number, as we only need > around 512M (file size would be 1G) for data. > > > Another thing is, if tree level 3 (root node level 2) is needed, we can > even further reduce the file size by requiring 4K nodesize, which can > further speed up the test, without the need for xattr. Yes, but I prefer to keep the default node size, so that it works on any environment and configuration, therefore increasing test coverage. > > > > > It accomplishes the same goals and it's still fast (about 23 seconds > > on my non-debug test vm, before it was about 16 seconds). > > The modified version is already fast enough even with a KASAN enabled > kernel, only 60s compared to the old 120+s. > > Mind to send the modification as a proper fix? That's a much improved > version compared to mine. I will, thanks. > > Thanks, > Qu > > > > > Thanks. > > > >> > >> Thanks, > >> Qu > >>> > >>> Thanks, > >>> Qu > >>>> > >>>> I'd rather do that so that we can be sure fiemap is working correctly > >>>> and not returning more extents than there really are - this approach > >>>> of allowing a bit more allows for that type of bug to be unnoticed, > >>>> plus that little bit more might not be always enough (depending on > >>>> available rm, writeback settings, etc). > >>>> > >>>> Thanks. > >>>> > >>>>> > >>>>> # Sync to flush delalloc and commit the current transaction, so > >>>>> fiemap will see > >>>>> @@ -76,13 +83,22 @@ $XFS_IO_PROG -f -c "pwrite -b 8M 0 16G" > >>>>> $SCRATCH_MNT/foo | _filter_xfs_io > >>>>> sync > >>>>> > >>>>> # All extents should be reported as non shared (131072 extents). > >>>>> -echo "Number of non-shared extents in the whole file: > >>>>> $(count_not_shared_extents)" > >>>>> +found1=$(count_not_shared_extents) > >>>>> +echo "Number of non-shared extents in the whole file: ${found1}" >> > >>>>> $seqres.full > >>>>> + > >>>>> +if [ $found1 -lt $nr_extents_lower -o $found1 -gt $nr_extents_upper > >>>>> ]; then > >>>>> + echo "unexpected initial number of extents, has $found1 > >>>>> expect [$nr_extents_lower, $nr_extents_upper]" > >>>>> +fi > >>>>> > >>>>> # Creating a snapshot. > >>>>> $BTRFS_UTIL_PROG subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap > >>>>> | _filter_scratch > >>>>> > >>>>> # We have a snapshot, so now all extents should be reported as shared. > >>>>> -echo "Number of shared extents in the whole file: > >>>>> $(count_shared_extents)" > >>>>> +found2=$(count_shared_extents) > >>>>> +echo "Number of shared extents in the whole file: ${found2}" >> > >>>>> $seqres.full > >>>>> +if [ $found2 -ne $found1 ]; then > >>>>> + echo "unexpected shared extents, has $found2 expect $found1" > >>>>> +fi > >>>>> > >>>>> # Now COW two file ranges, of 1M each, in the snapshot's file. > >>>>> # So 16 extents should become non-shared after this. > >>>>> @@ -97,8 +113,18 @@ sync > >>>>> > >>>>> # Now we should have 16 non-shared extents and 131056 (131072 - 16) > >>>>> shared > >>>>> # extents. > >>>>> -echo "Number of non-shared extents in the whole file: > >>>>> $(count_not_shared_extents)" > >>>>> -echo "Number of shared extents in the whole file: > >>>>> $(count_shared_extents)" > >>>>> +found3=$(count_not_shared_extents) > >>>>> +found4=$(count_shared_extents) > >>>>> +echo "Number of non-shared extents in the whole file: ${found3}" > >>>>> +echo "Number of shared extents in the whole file: ${found4}" >> > >>>>> $seqres.full > >>>>> + > >>>>> +if [ $found3 != 16 ]; then > >>>>> + echo "Unexpected number of non-shared extents, has $found3 > >>>>> expect 16" > >>>>> +fi > >>>>> + > >>>>> +if [ $found4 != $(( $found1 - $found3 )) ]; then > >>>>> + echo "Unexpected number of shared extents, has $found4 expect > >>>>> $(( $found1 - $found3 ))" > >>>>> +fi > >>>>> > >>>>> # Check that the non-shared extents are indeed in the expected file > >>>>> ranges (each > >>>>> # with 8 extents). > >>>>> @@ -117,7 +143,12 @@ _scratch_remount commit=1 > >>>>> sleep 1.1 > >>>>> > >>>>> # Now all extents should be reported as not shared (131072 extents). > >>>>> -echo "Number of non-shared extents in the whole file: > >>>>> $(count_not_shared_extents)" > >>>>> +found5=$(count_not_shared_extents) > >>>>> +echo "Number of non-shared extents in the whole file: ${found5}" >> > >>>>> $seqres.full > >>>>> + > >>>>> +if [ $found5 != $found1 ]; then > >>>>> + echo "Unexpected final number of non-shared extents, has > >>>>> $found5 expect $found1" > >>>>> +fi > >>>>> > >>>>> # success, all done > >>>>> status=0 > >>>>> diff --git a/tests/btrfs/276.out b/tests/btrfs/276.out > >>>>> index 3bf5a5e6..e318c2e9 100644 > >>>>> --- a/tests/btrfs/276.out > >>>>> +++ b/tests/btrfs/276.out > >>>>> @@ -1,16 +1,12 @@ > >>>>> QA output created by 276 > >>>>> wrote 17179869184/17179869184 bytes at offset 0 > >>>>> XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > >>>>> -Number of non-shared extents in the whole file: 131072 > >>>>> Create a snapshot of 'SCRATCH_MNT' in 'SCRATCH_MNT/snap' > >>>>> -Number of shared extents in the whole file: 131072 > >>>>> wrote 1048576/1048576 bytes at offset 8388608 > >>>>> XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > >>>>> wrote 1048576/1048576 bytes at offset 12884901888 > >>>>> XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > >>>>> Number of non-shared extents in the whole file: 16 > >>>>> -Number of shared extents in the whole file: 131056 > >>>>> Number of non-shared extents in range [8M, 9M): 8 > >>>>> Number of non-shared extents in range [12G, 12G + 1M): 8 > >>>>> Delete subvolume (commit): 'SCRATCH_MNT/snap' > >>>>> -Number of non-shared extents in the whole file: 131072 > >>>>> -- > >>>>> 2.41.0 > >>>>>