Hi Darrick,

Thank you for the response. I have replied inline.

-Suyash

On Tue, Dec 13, 2022 at 09:18, Darrick J. Wong <djwong@xxxxxxxxxx> wrote:
>
> [ugh, your email never made it to the list. I bet the email security
> standards have been tightened again. <insert rant about dkim and dmarc
> silent failures here>] :(
>
> On Sat, Dec 10, 2022 at 09:28:36PM -0800, Suyash Mahar wrote:
> > Hi all!
> >
> > While using XFS's ioctl(FICLONE), we found that XFS seems to have
> > poor performance (ioctl takes milliseconds for sparse files) and the
> > overhead increases with every call.
> >
> > For the demo, we are using an Optane DC-PMM configured as a
> > block device (fsdax) and running XFS (Linux v5.18.13).
>
> How are you using fsdax and reflink on a 5.18 kernel? That combination
> of features wasn't supported until 6.0, and the data corruption problems
> won't get fixed until a pull request that's about to happen for 6.2.

We did not enable the dax option. The Optane DIMMs are configured to
appear as a block device.

$ mount | grep xfs
/dev/pmem0p4 on /mnt/pmem0p4 type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota)

Regardless of the block device (the plot includes results for Optane and
RamFS), it seems like the ioctl(FICLONE) call is slow.

> > We create a 1 GiB dense file, then repeatedly modify a tiny random
> > fraction of it and make a clone via ioctl(FICLONE).
>
> Yay, random cow writes, that will slowly increase the number of space
> mapping records in the file metadata.
>
> > The time required for the ioctl() calls increases from large to insane
> > over the course of ~250 iterations: From roughly a millisecond for the
> > first iteration or two (which seems high, given that this is on
> > Optane and the code doesn't fsync or msync anywhere at all, ever) to 20
> > milliseconds (which seems crazy).
>
> Does the system call runtime increase with O(number_extents)?
> You might record the number of extents in the file you're cloning by
> running this periodically:
>
>     xfs_io -c stat $path | grep fsxattr.nextents

The extent count does increase linearly (just like the ioctl() call
latency). I used the xfs_bmap tool; let me know if this is not the
right way. If it is not, I'll update the microbenchmark to run xfs_io.

> FICLONE (at least on XFS) persists dirty pagecache data to disk, and
> then duplicates all written-space mapping records from the source file to
> the destination file. It skips preallocated mappings created with
> fallocate.
>
> So yes, the plot is exactly what I was expecting.
>
> --D
>
> > The plot is attached to this email.
> >
> > A cursory look at the extent map suggests that it gets increasingly
> > complicated resulting in the complexity.
> >
> > The enclosed tarball contains our code, our results, and some other info
> > like a flame graph that might shed light on where the ioctl is spending
> > its time.
> >
> > - Suyash & Terence
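
For readers without the tarball, the loop discussed in this thread (dirty a few random blocks, then clone with FICLONE and time the ioctl) can be sketched roughly as below. This is not the code from the tarball. The helper names, the 4 KiB block size, and the 64 writes per iteration are assumptions made for illustration; only the ~250 iterations come from the thread. The FICLONE value is the hand-encoded `_IOW(0x94, 9, int)` from `<linux/fs.h>`, which assumes the common ioctl bit layout (x86, arm64) and may differ on other architectures.

```python
import fcntl
import os
import random
import time

# FICLONE is _IOW(0x94, 9, int) in <linux/fs.h>. Encoded by hand here,
# assuming the generic ioctl layout: dir << 30 | size << 16 | type << 8 | nr.
FICLONE = (1 << 30) | (4 << 16) | (0x94 << 8) | 9


def random_block_offsets(file_size, block_size, count, rng):
    """Pick `count` block-aligned offsets inside a file of `file_size` bytes."""
    nblocks = file_size // block_size
    return [rng.randrange(nblocks) * block_size for _ in range(count)]


def timed_clone(src_fd, dst_path):
    """Clone src_fd into dst_path via FICLONE; return the ioctl time in seconds."""
    dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        start = time.perf_counter()
        fcntl.ioctl(dst_fd, FICLONE, src_fd)  # raises OSError without reflink
        return time.perf_counter() - start
    finally:
        os.close(dst_fd)


def run(src_path, clone_dir, iterations=250, writes_per_iter=64,
        block_size=4096, seed=0):
    """Dirty a few random blocks, clone, and collect per-iteration latencies."""
    rng = random.Random(seed)
    latencies = []
    src_fd = os.open(src_path, os.O_RDWR)
    try:
        size = os.fstat(src_fd).st_size
        for i in range(iterations):
            offsets = random_block_offsets(size, block_size, writes_per_iter, rng)
            for off in offsets:
                os.pwrite(src_fd, os.urandom(block_size), off)
            dst = os.path.join(clone_dir, f"clone.{i}")
            latencies.append(timed_clone(src_fd, dst))
    finally:
        os.close(src_fd)
    return latencies
```

Calling `run()` on a pre-created 1 GiB file on a reflink-enabled XFS mount would return one latency per clone, which could then be compared against the extent count reported by `xfs_io -c stat` after each iteration.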