On Thu, Apr 21, 2022 at 04:55:53PM +0200, Claudio Fontana wrote:
> a ping on this one;
>
> this change makes sense to me, and makes performance better in specific
> cases, by avoiding --bypass-cache before reaching the "cache trashing"
> state.
>
> However, before, virsh save was guaranteed to go through the iohelper;
> now it is not.
>
> The iohelper was guaranteeing a virFileDataSync before exiting,
> while after the change the data will still be in-flight between
> the kernel and the device as virsh save returns to the prompt.

This is a correctness problem I'm afraid, notably with network
filesystems:

  commit f32e3a2dd686f3692cd2bd3147c03e90f82df987
  Author: Michal Prívozník <mprivozn@xxxxxxxxxx>
  Date:   Tue Oct 30 19:15:48 2012 +0100

      iohelper: fdatasync() at the end

      Currently, when we are doing a (managed) save, we insert the
      iohelper between qemu and the OS. The pipe is created, the writing
      end is passed to qemu and the reading end to the iohelper. It reads
      data and writes them into the given file. However, with write()
      being asynchronous, data may still be in OS caches, and hence in
      some (corner) cases all migration data may have been read and
      written (not physically though). So qemu will report success, as
      will the iohelper. However, with some non-local filesystems, where
      ENOSPC is polled every X time units, we may get into a situation
      where all operations succeeded but the data hasn't reached the
      disk, and in fact never will. Therefore we ought to sync caches to
      make sure the data has reached the block device on the remote host.

It is also a crash-safety issue, i.e. if the host OS crashes after we've
created the save image, the image will still exist on disk but can have
lost arbitrary amounts of data, in a way that might not be detectable
until the restored VM randomly crashes hours/days later.

With regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-   https://www.instagram.com/dberrange :|
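[Editor's note: for readers less familiar with the failure mode discussed above, the point is that a successful write() only means the data reached the kernel page cache; a process that wants the save image to survive a host crash, or to see deferred ENOSPC errors from a network filesystem, must fdatasync() the file descriptor before reporting success. The following is a minimal sketch of that pattern, not the actual libvirt iohelper code; the default file name and buffer size are invented for illustration.]

  /* Sketch: copy stdin to a file and fdatasync() before exiting, so a
   * zero exit status means the data reached the block device, not just
   * the page cache. Not the real iohelper, just the idea behind it. */
  #include <errno.h>
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(int argc, char **argv)
  {
      const char *path = argc > 1 ? argv[1] : "save.img"; /* hypothetical target */
      char buf[64 * 1024];
      ssize_t got;

      int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
      if (fd < 0) {
          perror("open");
          return EXIT_FAILURE;
      }

      while ((got = read(STDIN_FILENO, buf, sizeof(buf))) > 0) {
          char *p = buf;
          while (got > 0) {
              ssize_t done = write(fd, p, got); /* may be a partial write */
              if (done < 0) {
                  if (errno == EINTR)
                      continue;
                  perror("write");
                  return EXIT_FAILURE;
              }
              got -= done;
              p += done;
          }
      }
      if (got < 0) {
          perror("read");
          return EXIT_FAILURE;
      }

      /* Without this, write() success only means "in the page cache"; a
       * host crash or a deferred ENOSPC on a network filesystem can still
       * lose the data after we have already reported success. */
      if (fdatasync(fd) < 0) {
          perror("fdatasync");
          return EXIT_FAILURE;
      }
      if (close(fd) < 0) {
          perror("close");
          return EXIT_FAILURE;
      }
      return EXIT_SUCCESS;
  }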