On Thu, Oct 12, 2023 at 08:02:31AM -0700, Darrick J. Wong wrote:
> Catherine started with this,
> https://lore.kernel.org/linux-xfs/8911B94D-DD29-4D6E-B5BC-32EAF1866245@xxxxxxxxxx/
>
> and the rest of us whittled it down to the single patch you see here.
> Sections 1-2 are still relevant; S3 was the path not taken.

I'd still take the core of that into the actual commit message. This
part, maybe slightly reworded:

"One of our VM cluster management products needs to snapshot KVM image
files so that they can be restored in case of failure. Snapshotting is
done by redirecting VM disk writes to a sidecar file and using reflink
on the disk image, specifically the FICLONE ioctl as used by
"cp --reflink". Reflink locks the source and destination files while it
operates, which means that reads from the main vm disk image are blocked,
causing the vm to stall. When an image file is heavily fragmented, the
copy process could take several minutes. Some of the vm image files have
50-100 million extent records, and duplicating that much metadata locks
the file for 30 minutes or more. Having activities suspended for such
a long time in a cluster node could result in node eviction."