On Sun 23 Aug 2020 11:59:07 PM CEST, Dave Chinner wrote:
>> >> Option 4 is described above as initial file preallocation whereas
>> >> option 1 is per 64k cluster prealloc. Prealloc mode mixup aside, Berto
>> >> is reporting that the initial file preallocation mode is slower than
>> >> the per cluster prealloc mode. Berto, am I following that right?
>>
>> After looking more closely at the data I can see that there is a peak of
>> ~30K IOPS during the first 5 or 6 seconds and then it suddenly drops to
>> ~7K for the rest of the test.
>
> How big is the filesystem, how big is the log? (xfs_info output,
> please!)

The size of the filesystem is 126GB and here's the output of xfs_info:

meta-data=/dev/vg/test           isize=512    agcount=4, agsize=8248576 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=32994304, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=16110, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

>> I was running fio with --ramp_time=5 which ignores the first 5 seconds
>> of data in order to let performance settle, but if I remove that I can
>> see the effect more clearly. I can observe it with raw files (in 'off'
>> and 'prealloc' modes) and qcow2 files in 'prealloc' mode. With qcow2 and
>> preallocation=off the performance is stable during the whole test.
>
> What does "preallocation=off" mean again? Is that using
> fallocate(ZERO_RANGE) prior to the data write rather than
> preallocating the metadata/entire file?

Exactly, it means that. One fallocate() call before each data write
(unless the area has been allocated by a previous write).

> If so, I would expect the limiting factor is the rate at which IO can
> be issued because of the fallocate() triggered pipeline bubbles. That
> leaves idle device time so you're not pushing the limits of the
> hardware and hence none of the behaviours above will be evident...

The thing is that with raw (i.e. non-qcow2) images the number of IOPS is
similar, but in that case there are no fallocate() calls, only the data
writes.

Berto
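
For reference, the filesystem and log sizes asked about can be read straight
off the xfs_info output. The following is only a small arithmetic sketch; the
variable names are made up for illustration, and the numbers are copied from
the output above. It suggests a data section of roughly 125.9 GiB (matching
the "126GB" figure) and an internal log of about 63 MiB.

```python
# Values copied from the xfs_info output quoted in this message.
# The variable names are illustrative, not part of any tool's output.
BLOCK_SIZE = 4096           # "bsize" of the data and log sections, in bytes
DATA_BLOCKS = 32994304      # "blocks" in the data section
AG_COUNT = 4                # "agcount"
AG_SIZE_BLOCKS = 8248576    # "agsize", in filesystem blocks
LOG_BLOCKS = 16110          # "blocks" in the internal log section

# Total sizes in bytes.
fs_bytes = DATA_BLOCKS * BLOCK_SIZE
log_bytes = LOG_BLOCKS * BLOCK_SIZE

# Same sizes in binary units, which is what "126GB" appears to refer to.
fs_gib = fs_bytes / 2**30
log_mib = log_bytes / 2**20

# Sanity check: the allocation groups should add up to the data section.
ags_cover_data = (AG_COUNT * AG_SIZE_BLOCKS == DATA_BLOCKS)

print(f"filesystem: {fs_gib:.2f} GiB, log: {log_mib:.2f} MiB, "
      f"AGs cover data section: {ags_cover_data}")
```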