Anthony Liguori <anthony@xxxxxxxxxxxxx> wrote:
> On 11/09/2011 11:02 AM, Avi Kivity wrote:
>> On 11/09/2011 06:39 PM, Anthony Liguori wrote:
>>>
>>> Migration with qcow2 is not a supported feature for 1.0. Migration is
>>> only supported with raw images using coherent shared storage[1].
>>>
>>> [1] NFS is only coherent with close-to-open which right now is not
>>> good enough for migration.
>>
>> Say what?
>
> Due to block format probing, we read at least the first sector of the
> disk during start up.
>
> Strictly going by what NFS guarantees, since we don't open on the
> destination *after* a close on the source, we aren't guaranteed to
> see what's written by the source.
>
> In practice, because of block format probing, unless we're using
> cache=none, the first sector can be out of sync with the source on the
> destination. If you use cache=none on a Linux client with at least a
> Linux NFS server, you should be relatively safe.

You are not :-(

If you are using a format that "caches" data, like qcow2 with its L1/L2
cache, you are not safe. You need to reopen the image (or discard the
metadata and re-read it). Notice that raw nowadays also has metadata: we
can resize the image on the fly, and we need to reopen to notice that.

About the coherence problem, I just sent the patches that we had on RHEL
to the list. With cache=none, NFS, iSCSI and Fibre Channel are all OK
(modulo the previous problem of metadata).

If you look at the second patch that I sent, it "tries" to flush the
read cache for a block device. The problems with the patch are:

- BLKFLSBUF is Linux-specific.
- BLKFLSBUF only works for "some block devices".
- Christoph just NACKed it for the reasons above.

In summary:

- If we use raw, we don't resize images, and we use a clustered
  filesystem, qemu.git migration works.
- If we change metadata (qcow2, raw resize, ...), we need to re-read
  the metadata (on RHEL we just close + open).
- If we use NFS, we need to use cache=none or rely on close+open
  consistency.
- If we use iSCSI, we need to use cache=none; close+open is not enough
  for consistency.

The ioctl patch that I sent happens to work on Linux, but it is not even
guaranteed to work there. And if our block layer gurus tell us not to
use the ioctl(), I think that we need to do just that.

Later, Juan.
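
For illustration only, the close + open idea and the BLKFLSBUF attempt
discussed above can be sketched in a few lines of Python. This is not
the patch that was sent to the list and not QEMU code: the helper names
reopen_image and read_first_sector are hypothetical, and the request
number 0x1261 is the Linux value of BLKFLSBUF, so the ioctl part is
Linux-only and, as noted above, not guaranteed to work on every block
device.

```python
import fcntl
import os
import stat

# Linux-only request number: BLKFLSBUF is _IO(0x12, 97) in <linux/fs.h>.
BLKFLSBUF = 0x1261


def reopen_image(old_fd, path):
    """Close old_fd and open path again (hypothetical helper).

    Returns (new_fd, flushed), where flushed tells whether the
    BLKFLSBUF ioctl was issued successfully on a block device.
    """
    os.close(old_fd)
    new_fd = os.open(path, os.O_RDONLY)
    flushed = False
    # The ioctl only makes sense for block devices, not regular files.
    if stat.S_ISBLK(os.fstat(new_fd).st_mode):
        try:
            fcntl.ioctl(new_fd, BLKFLSBUF, 0)
            flushed = True
        except OSError:
            # Not every block device supports the request.
            pass
    return new_fd, flushed


def read_first_sector(fd, sector_size=512):
    """Re-read the sector that block format probing looks at."""
    return os.pread(fd, sector_size, 0)
```

On a regular file (the NFS case) only the close + open happens; on a
block device (the iSCSI case) the sketch additionally tries the flush
and reports whether it went through, which is exactly the part that
cannot be relied on.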