Cc: linux-xfs

On Wed 19 Aug 2020 07:53:00 PM CEST, Brian Foster wrote:
> In any event, if you're seeing unclear or unexpected performance
> deltas between certain XFS configurations or other fs', I think the
> best thing to do is post a more complete description of the workload,
> filesystem/storage setup, and test results to the linux-xfs mailing
> list (feel free to cc me as well). As it is, aside from the questions
> above, it's not really clear to me what the storage stack looks like
> for this test, if/how qcow2 is involved, what the various
> 'preallocation=' modes actually mean, etc.

(see [1] for a bit of context)

I repeated the tests with a larger (125GB) filesystem. Things are a
bit faster but not radically different, here are the new numbers:

|----------------------+-------+-------|
| preallocation mode   |   xfs |  ext4 |
|----------------------+-------+-------|
| off                  |  8139 | 11688 |
| off (w/o ZERO_RANGE) |  2965 |  2780 |
| metadata             |  7768 |  9132 |
| falloc               |  7742 | 13108 |
| full                 | 41389 | 16351 |
|----------------------+-------+-------|

The numbers are I/O operations per second as reported by fio, running
inside a VM.

The VM is running Debian 9.7 with Linux 4.9.130 and the fio version is
2.16-1. I'm using QEMU 5.1.0.

fio is sending random 4KB write requests to a 25GB virtual drive, this
is the full command line:

    fio --filename=/dev/vdb --direct=1 --randrepeat=1 --eta=always \
        --ioengine=libaio --iodepth=32 --numjobs=1 --name=test \
        --size=25G --io_limit=25G --ramp_time=5 --rw=randwrite \
        --bs=4k --runtime=60

The virtual drive (/dev/vdb) is a freshly created qcow2 file stored on
the host (on an xfs or ext4 filesystem as the table above shows), and
it is attached to QEMU using a virtio-blk-pci device:

    -drive if=virtio,file=image.qcow2,cache=none,l2-cache-size=200M

cache=none means that the image is opened with O_DIRECT and
l2-cache-size is large enough so QEMU is able to cache all the
relevant qcow2 metadata in memory.

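As a rough sanity check on the l2-cache-size value, here is a
back-of-the-envelope calculation in Python. It is only a sketch: it
assumes the standard qcow2 layout with 8-byte L2 entries (no extended
L2 entries) and reads 25G as 25 GiB. The function name is made up for
illustration.

```python
GiB = 1024 ** 3
MiB = 1024 ** 2


def l2_cache_bytes_needed(disk_bytes, cluster_bytes=64 * 1024, l2_entry_bytes=8):
    """Bytes of L2 tables needed to map every cluster of the virtual disk.

    Assumes one fixed-size L2 entry per cluster (8 bytes by default).
    """
    clusters = -(-disk_bytes // cluster_bytes)  # ceiling division
    return clusters * l2_entry_bytes


needed = l2_cache_bytes_needed(25 * GiB)
print(f"L2 metadata for a 25 GiB image: {needed / MiB:.3f} MiB")
print(f"fits in a 200 MiB cache: {needed <= 200 * MiB}")
```

Under those assumptions the whole image needs only a few MiB of L2
tables, so 200M leaves plenty of headroom.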
The host is running Linux 4.19.132 and has an SSD drive.

About the preallocation modes: a qcow2 file is divided into clusters
of the same size (64KB in this case). That is the minimum unit of
allocation, so when writing 4KB to an unallocated cluster QEMU needs
to fill the other 60KB with zeroes. So here's what happens with the
different modes:

1) off: for every write request QEMU initializes the cluster (64KB)
   with fallocate(ZERO_RANGE) and then writes the 4KB of data.

2) off w/o ZERO_RANGE: QEMU writes the 4KB of data and fills the rest
   of the cluster with zeroes.

3) metadata: all clusters were allocated when the image was created
   but they are sparse, QEMU only writes the 4KB of data.

4) falloc: all clusters were allocated with fallocate() when the image
   was created, QEMU only writes 4KB of data.

5) full: all clusters were allocated by writing zeroes to all of them
   when the image was created, QEMU only writes 4KB of data.

As I said in a previous message I'm not familiar with xfs, but the
parts that I don't understand are

   - Why is (4) slower than (1)?
   - Why is (5) so much faster than everything else?

I hope I didn't forget anything, tell me if you have questions.

Berto

[1] https://lists.gnu.org/archive/html/qemu-block/2020-08/msg00481.html
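
PS: in case it helps to compare the five preallocation modes side by
side, this is a small Python model of what QEMU sends to the host file
for the first 4KB write that lands in a given cluster, following the
descriptions above. It is a simplified sketch, not QEMU code: the
names are hypothetical, and it ignores qcow2 metadata updates and
anything the host filesystem does internally.

```python
CLUSTER_SIZE = 64 * 1024   # qcow2 cluster size used in these tests
REQUEST_SIZE = 4 * 1024    # fio block size (--bs=4k)

# mode -> (is fallocate(ZERO_RANGE) issued?, bytes written to the host file)
FIRST_WRITE_COST = {
    "off": (True, REQUEST_SIZE),                    # zero the cluster, then write 4KB
    "off (w/o ZERO_RANGE)": (False, CLUSTER_SIZE),  # 4KB of data + 60KB of zeroes
    "metadata": (False, REQUEST_SIZE),              # cluster already allocated (sparse)
    "falloc": (False, REQUEST_SIZE),                # cluster already fallocate()d
    "full": (False, REQUEST_SIZE),                  # cluster already written with zeroes
}


def first_write_cost(mode):
    """Return (uses_zero_range, bytes_written) for the given mode."""
    return FIRST_WRITE_COST[mode]


for mode, (zero_range, written) in FIRST_WRITE_COST.items():
    call = "fallocate(ZERO_RANGE) + " if zero_range else ""
    print(f"{mode:22s} {call}write of {written // 1024}KB")
```

In this model modes (3), (4) and (5) look identical from QEMU's side,
which is why the gap between them in the table is the part I cannot
explain.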