Hi,

Since commit 3687fcb0752a ("btrfs: zoned: make auto-reclaim less aggressive"), the reclaim algorithm has been changed to trigger auto-reclaim only once the fs used size is more than a certain threshold. This change breaks test btrfs/237.

I tried to adapt the test by doing the following:
- Write a small file first
- Write a big file that increases the disk usage to more than the reclaim threshold
- Delete the big file to trigger the reclaim threshold
- Ensure the small file is relocated and the space used by the big file is reclaimed

My test case works properly for small ZNS drives but not for bigger drives in QEMU. When I use a drive with a size of 100G, not all zones that were used by the big file are correctly reclaimed. Either I am not setting up the test correctly or there is something wrong with how reclaim works for zoned devices.

I created a simple script to reproduce the scenario instead of running the test. Please adapt $DEV and $big_file_size based on the drive size. As I am setting bg_reclaim_threshold to 51, $big_file_size should be at least 51% of the drive size.

```
DEV=nvme0n3
DEV_PATH=/dev/$DEV
big_file_size=2500M

echo "mq-deadline" > /sys/block/$DEV/queue/scheduler
umount /mnt/scratch
blkzone reset $DEV_PATH
mkfs.btrfs -f -d single -m single $DEV_PATH > /dev/null
mount -t btrfs $DEV_PATH /mnt/scratch

uuid=$(btrfs fi show $DEV_PATH | grep 'uuid' | awk '{print $NF}')
echo "51" > /sys/fs/btrfs/$uuid/bg_reclaim_threshold

fio --filename=/mnt/scratch/test2 --size=1M --rw=write --bs=4k \
    --name=btrfs_zoned > /dev/null
btrfs fi sync /mnt/scratch

echo "Open zones before big file transfer:"
blkzone report $DEV_PATH | grep -v -e em -e nw | wc -l

fio --filename=/mnt/scratch/test1 --size=$big_file_size --rw=write --bs=4k \
    --ioengine=io_uring --name=btrfs_zoned > /dev/null
btrfs fi sync /mnt/scratch

echo "Open zones before removing the file:"
blkzone report $DEV_PATH | grep -v -e em -e nw | wc -l

rm /mnt/scratch/test1
btrfs fi sync /mnt/scratch
echo "Going to sleep. Removed the file"
sleep 30

echo "Open zones after reclaim:"
blkzone report $DEV_PATH | grep -v -e em -e nw | wc -l
```

I am getting the following output in QEMU:

- 5GB ZNS drive with 128MB zone size (and cap), working as expected:

```
Open zones before big file transfer:
4
Open zones before removing the file:
23
Going to sleep. Removed the file
Open zones after reclaim:
4
```

- 100GB ZNS drive with 128MB zone size (and cap), **not working** as expected:

```
Open zones before big file transfer:
4
Open zones before removing the file:
455
Going to sleep. Removed the file
Open zones after reclaim:
411
```

Only partial reclaim is happening for bigger drives. The issue with that is, if I do another fio transfer, the filesystem returns ENOSPC before its actual capacity is reached, as most of the zones have not been reclaimed back and are basically in an unusable state.

Is there a limit on how many bgs can be reclaimed? Let me know if I am doing something wrong in the test or if it is an actual issue.

Pankaj Raghav (1):
  btrfs/237: adapt the test to work with the new reclaim algorithm

 tests/btrfs/237 | 80 +++++++++++++++++++++++++++++++++++--------------
 1 file changed, 57 insertions(+), 23 deletions(-)

--
2.25.1
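
As a side note, instead of hard-coding $big_file_size, it could be derived from the device size and the threshold. This is only a minimal sketch, assuming blockdev(8) is available and the same 51% threshold as above; it ignores metadata overhead and any zone capacity < zone size:

```
# Sketch: derive a file size that lands above the 51% reclaim threshold.
# $DEV_PATH is the zoned device used in the reproduction script above.
dev_bytes=$(blockdev --getsize64 "$DEV_PATH")

# Use 55% of the raw device size to stay safely above the 51% threshold;
# this does not account for metadata overhead or unusable zone space.
big_file_size=$(( dev_bytes * 55 / 100 / 1024 / 1024 ))M
echo "Using big_file_size=$big_file_size"
```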