On Tue, Oct 17, 2023 at 01:12:08PM -0700, Catherine Hoang wrote:
> One of our VM cluster management products needs to snapshot KVM image
> files so that they can be restored in case of failure. Snapshotting is
> done by redirecting VM disk writes to a sidecar file and using reflink
> on the disk image, specifically the FICLONE ioctl as used by
> "cp --reflink". Reflink locks the source and destination files while it
> operates, which means that reads from the main vm disk image are blocked,
> causing the vm to stall. When an image file is heavily fragmented, the
> copy process could take several minutes. Some of the vm image files have
> 50-100 million extent records, and duplicating that much metadata locks
> the file for 30 minutes or more. Having activities suspended for such
> a long time in a cluster node could result in node eviction.
>
> Clone operations and read IO do not change any data in the source file,
> so they should be able to run concurrently. Demote the exclusive locks
> taken by FICLONE to shared locks to allow reads while cloning. While a
> clone is in progress, writes will take the IOLOCK_EXCL, so they block
> until the clone completes.

Sorry for being pesky, but do you have some rough numbers on how
much this actually helps with the above workload?

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@xxxxxx>
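
For anyone following the thread without the interface in their head, here is a
minimal userspace sketch of the FICLONE call that "cp --reflink" ends up
issuing. It is an illustration only, not part of the patch. The helper name
try_reflink and the file names in the usage note are hypothetical, and the
numeric FICLONE value is an assumption that holds for the common x86 and arm64
ioctl encoding.

```python
import fcntl
import os

# FICLONE is defined as _IOW(0x94, 9, int) in <linux/fs.h>. 0x40049409 is the
# resulting value with the common ioctl encoding (x86, arm64); architectures
# with a different _IOW layout would need another constant (assumption).
FICLONE = 0x40049409


def try_reflink(src_path, dst_path):
    """Clone src_path into dst_path with FICLONE (hypothetical helper).

    Returns True if the clone succeeded, and False if the kernel or the
    filesystem refused it (for example, no reflink support, or the two
    files live on different filesystems).
    """
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            # The ioctl is issued on the destination fd; the source fd is
            # passed as the integer argument.
            fcntl.ioctl(dst_fd, FICLONE, src_fd)
            return True
        except OSError:
            # Coarse on purpose: any failure is reported as "not cloned".
            return False
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
```

Calling try_reflink("disk.img", "snap.img") on a reflink-capable filesystem
should share the extents of disk.img with snap.img. The time the source file
stays locked inside that one ioctl is what the numbers question above is about.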