On Thu, Aug 3, 2023 at 11:56 AM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote: > > > > On 2023/8/3 18:25, fdmanana@xxxxxxxxxx wrote: > > From: Filipe Manana <fdmanana@xxxxxxxx> > > > > btrfs/276 creates a 16G file with compression enabled in order to quickly > > and efficiently create a file with many extents and have a fs tree with a > > height of 3 (root node at level 2), so that it can test that fiemap is > > correctly reporting extent sharedness when we have shared subtrees of the > > fs tree due to a snapshot. > > > > Compression results in extents with a maximum size of 128K and the test > > is expecting only extents of 128K, which normally happens if the machine > > has a large amount of RAM and writeback is not triggered before the xfs_io > > command finishes. However if writeback is triggered in the meanwhile, due > > to memory pressure for example, then we can get end up with some extents > > that are smaller than 128K, therefore increasing the total number of > > extents in the test file and make the test fail. > > > > This seems to happen often on test machines with small amounts of RAM, > > such as 4G, as reported by Qu in the following thread: > > > > https://lore.kernel.org/linux-btrfs/20230801065529.50122-1-wqu@xxxxxxxx/ > > > > So to address this create a file with holes and direct IO to make sure we > > always get a specific number of extents in the test file. To speedup the > > test create 2000 64K extents, with holes in between them, so that it works > > on a fs with any sector size, and then create a bunch of files with large > > xattrs to quickly bump the fs tree height to 3 for any node size (4K to > > 64K). This also guarantees that the file extent items are spread over > > multiples leaves, in order to exercise fiemap's correctness when reporting > > shared extents due to shared subtrees. > > > > Reported-by: Qu Wenruo <wqu@xxxxxxxx> > > Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx> > > --- > > tests/btrfs/276 | 85 ++++++++++++++++++++++++++------------------- > > tests/btrfs/276.out | 20 +++++------ > > 2 files changed, 59 insertions(+), 46 deletions(-) > > > > diff --git a/tests/btrfs/276 b/tests/btrfs/276 > > index 944b0c8f..2b5543ad 100755 > > --- a/tests/btrfs/276 > > +++ b/tests/btrfs/276 > > @@ -9,19 +9,19 @@ > > # and when the file's subvolume was snapshoted. > > # > > . ./common/preamble > > -_begin_fstest auto snapshot compress fiemap > > +_begin_fstest auto snapshot fiemap > > > > . ./common/filter > > +. ./common/attr > > > > _supported_fs btrfs > > _require_scratch > > _require_xfs_io_command "fiemap" "ranged" > > +_require_attrs > > +_require_odirect > > > > _scratch_mkfs >> $seqres.full 2>&1 > > -# We use compression because it's a very quick way to create a file with a very > > -# large number of extents (compression limits the maximum extent size to 128K) > > -# and while using very little disk space. > > -_scratch_mount -o compress > > +_scratch_mount > > > > fiemap_test_file() > > { > > @@ -29,8 +29,9 @@ fiemap_test_file() > > local len=$2 > > > > # Skip the first two lines of xfs_io's fiemap output (file path and > > - # header describing the output columns). > > - $XFS_IO_PROG -c "fiemap -v $offset $len" $SCRATCH_MNT/foo | tail -n +3 > > + # header describing the output columns) as well as holes. > > + $XFS_IO_PROG -c "fiemap -v $offset $len" $SCRATCH_MNT/foo | \ > > + grep -v 'hole' | tail -n +3 > > } > > > > # Count the number of shared extents for the whole test file or just for a given > > @@ -63,19 +64,38 @@ count_not_shared_extents() > > --source 'END { print cnt }' > > } > > > > -# Create a 16G file as that results in 131072 extents, all with a size of 128K > > -# (due to compression), and a fs tree with a height of 3 (root node at level 2). > > -# We want to verify later that fiemap correctly reports the sharedness of each > > -# extent, even when it needs to switch from one leaf to the next one and from a > > -# node at level 1 to the next node at level 1. > > -# > > -$XFS_IO_PROG -f -c "pwrite -b 8M 0 16G" $SCRATCH_MNT/foo | _filter_xfs_io > > - > > -# Sync to flush delalloc and commit the current transaction, so fiemap will see > > -# all extents in the fs tree and extent trees and not look at delalloc. > > -sync > > - > > -# All extents should be reported as non shared (131072 extents). > > +# Create a file with 2000 extents, and a fs tree with a height of at least 3 > > +# (root node at level 2). We want to verify later that fiemap correctly reports > > +# the sharedness of each extent, even when it needs to switch from one leaf to > > +# the next one and from a node at level 1 to the next node at level 1. > > +# To speedup creating a fs tree of height >= 3, add several large xattrs. > > +ext_size=$(( 64 * 1024 )) > > +file_size=$(( 2000 * 2 * $ext_size )) # about 250M > > +nr_cpus=$("$here/src/feature" -o) > > +workers=0 > > +for (( i = 0; i < $file_size; i += 2 * $ext_size )); do > > + $XFS_IO_PROG -f -d -c "pwrite -b $ext_size $i $ext_size" \ > > + $SCRATCH_MNT/foo > /dev/null & > > + workers=$(( workers + 1 )) > > + if [ "$workers" -ge "$nr_cpus" ]; then > > + workers=0 > > + wait > > + fi > > +done > > + > > +workers=0 > > +xattr_value=$(printf '%0.sX' $(seq 1 3900)) > > +for (( i = 1; i <= 29000; i++ )); do > > + echo -n > $SCRATCH_MNT/filler_$i > > + $SETFATTR_PROG -n 'user.x1' -v $xattr_value $SCRATCH_MNT/filler_$i & > > + workers=$(( workers + 1 )) > > + if [ "$workers" -ge "$nr_cpus" ]; then > > + workers=0 > > + wait > > + fi > > +done > > + > > +# All extents should be reported as non shared (2000 extents). > > echo "Number of non-shared extents in the whole file: $(count_not_shared_extents)" > > > > # Creating a snapshot. > > @@ -84,26 +104,21 @@ $BTRFS_UTIL_PROG subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap | _filter_scr > > # We have a snapshot, so now all extents should be reported as shared. > > echo "Number of shared extents in the whole file: $(count_shared_extents)" > > > > -# Now COW two file ranges, of 1M each, in the snapshot's file. > > -# So 16 extents should become non-shared after this. > > +# Now COW two file ranges, of 64K each, in the snapshot's file. > > +# So 2 extents should become non-shared after this. Each file extent item is in > > +# different leaf of the snapshot tree. > > # > > -$XFS_IO_PROG -c "pwrite -b 1M 8M 1M" \ > > - -c "pwrite -b 1M 12G 1M" \ > > +$XFS_IO_PROG -d -c "pwrite -b $ext_size 512K $ext_size" \ > > + -d -c "pwrite -b $ext_size 249M $ext_size" \ > > $SCRATCH_MNT/snap/foo | _filter_xfs_io > > > > -# Sync to flush delalloc and commit the current transaction, so fiemap will see > > -# all extents in the fs tree and extent trees and not look at delalloc. > > -sync > > - > > -# Now we should have 16 non-shared extents and 131056 (131072 - 16) shared > > -# extents. > > +# Now we should have 2 non-shared extents and 1998 shared extents. > > echo "Number of non-shared extents in the whole file: $(count_not_shared_extents)" > > Unfortunately I have hit 3 failure with the same failure with the output > mismatch. > > btrfs/276 52s ... - output mismatch (see > /home/adam/xfstests/results//btrfs/276.out.bad) > --- tests/btrfs/276.out 2023-08-03 18:39:19.456666658 +0800 > +++ /home/adam/xfstests/results//btrfs/276.out.bad 2023-08-03 > 18:41:38.849999998 +0800 > @@ -6,7 +6,7 @@ > XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > wrote 65536/65536 bytes at offset 261095424 > XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > -Number of non-shared extents in the whole file: 2 > +Number of non-shared extents in the whole file: 1 > Number of shared extents in the whole file: 1998 > Number of non-shared extents in range [512K, 512K + 64K): 1 > > So far the hit ratio is pretty high, 3/5. > > Furthermore, when the failure hits, it only shows 1 non-shared extents, > but non-shared extents for the two ranges are both 1. > > Thus I added a sync after the "$XFS_IO_PROG -d -c -d -c" line, and no > more reproduce for 16 runs. Ah yes, I removed it and didn't hit it because it's a fast VM. See if the v2 works out. Thanks. > > This looks like a race between the ordered extent finishing code to > insert the extent mapping and the fiemap call, as direct IO only returns > when the data is written, not after the metadata is updated. > (Looks like KASAN slows down the process enough to hit the race) > > > Otherwise the patch looks pretty good, not only solving the problem but > also speed up the test case itself quite a lot. > > Thanks, > Qu > > > echo "Number of shared extents in the whole file: $(count_shared_extents)" > > > > -# Check that the non-shared extents are indeed in the expected file ranges (each > > -# with 8 extents). > > -echo "Number of non-shared extents in range [8M, 9M): $(count_not_shared_extents 8M 1M)" > > -echo "Number of non-shared extents in range [12G, 12G + 1M): $(count_not_shared_extents 12G 1M)" > > +# Check that the non-shared extents are indeed in the expected file ranges. > > +echo "Number of non-shared extents in range [512K, 512K + 64K): $(count_not_shared_extents 512K 64K)" > > +echo "Number of non-shared extents in range [249M, 249M + 64K): $(count_not_shared_extents 249M 64K)" > > > > # Now delete the snapshot. > > $BTRFS_UTIL_PROG subvolume delete -c $SCRATCH_MNT/snap | _filter_scratch > > @@ -116,7 +131,7 @@ $BTRFS_UTIL_PROG subvolume delete -c $SCRATCH_MNT/snap | _filter_scratch > > _scratch_remount commit=1 > > sleep 1.1 > > > > -# Now all extents should be reported as not shared (131072 extents). > > +# Now all extents should be reported as not shared (2000 extents). > > echo "Number of non-shared extents in the whole file: $(count_not_shared_extents)" > > > > # success, all done > > diff --git a/tests/btrfs/276.out b/tests/btrfs/276.out > > index 3bf5a5e6..197d8edc 100644 > > --- a/tests/btrfs/276.out > > +++ b/tests/btrfs/276.out > > @@ -1,16 +1,14 @@ > > QA output created by 276 > > -wrote 17179869184/17179869184 bytes at offset 0 > > -XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > > -Number of non-shared extents in the whole file: 131072 > > +Number of non-shared extents in the whole file: 2000 > > Create a snapshot of 'SCRATCH_MNT' in 'SCRATCH_MNT/snap' > > -Number of shared extents in the whole file: 131072 > > -wrote 1048576/1048576 bytes at offset 8388608 > > +Number of shared extents in the whole file: 2000 > > +wrote 65536/65536 bytes at offset 524288 > > XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > > -wrote 1048576/1048576 bytes at offset 12884901888 > > +wrote 65536/65536 bytes at offset 261095424 > > XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec) > > -Number of non-shared extents in the whole file: 16 > > -Number of shared extents in the whole file: 131056 > > -Number of non-shared extents in range [8M, 9M): 8 > > -Number of non-shared extents in range [12G, 12G + 1M): 8 > > +Number of non-shared extents in the whole file: 2 > > +Number of shared extents in the whole file: 1998 > > +Number of non-shared extents in range [512K, 512K + 64K): 1 > > +Number of non-shared extents in range [249M, 249M + 64K): 1 > > Delete subvolume (commit): 'SCRATCH_MNT/snap' > > -Number of non-shared extents in the whole file: 131072 > > +Number of non-shared extents in the whole file: 2000