On 03/15, Peter Krempa wrote:
> Sorry for my late response ...
>
> On Mon, Feb 22, 2016 at 22:51:55 +0100, Simon wrote:
> > On 02/16, Peter Krempa wrote:
> > > On Fri, Feb 12, 2016 at 13:22:01 -0700, Eric Blake wrote:
> > > > On 12/02/2015 02:08 AM, Simon Kollberg wrote:

[...]

I think most of what has been discussed so far relates to the solution
further down in one way or another, so I cut down the thread to make it
a bit lighter.

> > I get why using <disk> for something that is not going to be visible to
> > the guest is a bad idea. On the other hand, in this case the disk that
> > is not going to be attached to the guest is actually the normal disk
> > (labeled "secondary disk" on the block replication wiki). I don't think
> > it makes sense to turn that into something that is not a <disk>.
>
> It does. The <devices> portion contains stuff that is guest visible. The
> fact that qemu calls it a disk is a qemu implementation detail. For our
> use it's not guest visible, and we are not copying the design of qemu in
> cases where it does not make sense.
>

Just to make sure that we're on the same page: in the secondary VM, when
block replication is enabled, the disk that is actually written to by
the guest is the block replication 'active disk', not the main disk.
That is why I'm hesitant to completely cut the block replication disks
out of the disk XML and turn them into an entirely different entity.

I do see your point though, since the block replication disks are
supposed to be transparent from the guest's point of view.

> > I think many of the issues that have been brought up would be solved
> > by a new disk definition where the replication and the normal disk are
> > combined. But again, I don't think that the quorum and the block
> > replication should be defined the same way, since block replication
> > acts more as a chain while the quorum acts as an array.
> >
> > What do you guys think about this?
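To recap how I understand the stacking on the secondary side (my
reading of the BlockReplication wiki; file names taken from the example
XML in this thread):

    secondary guest writes
             |
             v
    active_disk.qcow2    (guest-visible top of the chain)
             |  backing
             v
    hidden_disk.qcow2    (buffers the secondary disk's original data)
             |  backing
             v
    1.raw                ("secondary disk"; receives the primary's
             ^            writes over NBD)
             |
    NBD server (writes replicated from the primary VM)

So the guest's own writes land in the active disk, while the main disk
keeps tracking the primary.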
> >
> > <disk type='file' device='disk'>
> >   <driver name='qemu' type='raw'/>
> >   <source file='/opt/libvirt/1.raw'/>
> >   <target dev='vda' bus='virtio'/>
> >
> >   <replication type='file' device='disk' mode='secondary'>
> >     <driver name='qemu' type='qcow2'/>
> >     <source file='/opt/libvirt/active_disk.qcow2'/>
> >     <backingStore type='file'>
> >       <format type='qcow2'/>
> >       <source file='/opt/libvirt/hidden_disk.qcow2'/>
> >     </backingStore>
> >   </replication>
>
> This actually looks more reasonable although I have some comments:
>
> - device='disk' doesn't make any sense here, this is not guest visible

Same response as above. I could agree that it might be redundant to keep
device='disk' in both the <disk> and the <replication> element if the
<replication> element is nested under the <disk>. But the fact is, it's
the replication disk that is being attached as a device, not the normal
disk (the normal disk only gets writes from the primary through NBD).
See: http://wiki.qemu.org/Features/BlockReplication

> - the <driver> element corresponds to the format of the image. I think
>   that we should actually call it <format> to avoid confusion.

Yeah, this is a bit confusing, and the same goes for normal disks. In
some places it's called format and in other places it's called driver.
Not sure what the reason is behind this. In qemu they started to replace
format with driver:

http://git.qemu.org/?p=qemu.git;a=commit;h=74fe54f2a1b5c4f4498a8fe521e1dfc936656cd4

Keeping it similar to the disk schema would also let us re-use some XML
definitions that already exist.

> >
> > </disk>
> >
> > This would solve all three issues: referencing, replication driver +
> > file driver in the same definition, and a single target. One downside
> > is that the <replication> element is going to be more or less a <disk>
> > clone.
> >
> > Another option would be to allow for nested <disk> elements and add a
> > bogus replication target.
> >
> > <disk type='file' device='disk'>
> >   <driver name='qemu' type='raw'/>
> >   <source file='/opt/libvirt/1.raw'/>
> >
> >   <target dev='replication' bus='virtio'/>
> >
> >   <disk type='file' device='disk' mode='secondary'>
> >     <driver name='qemu' type='qcow2'/>
> >     <source file='/opt/libvirt/active_disk.qcow2'/>
> >     <target dev='vda' bus='virtio'/>
> >     <backingStore type='file'>
> >       <format type='qcow2'/>
> >       <source file='/opt/libvirt/hidden_disk.qcow2'/>
> >     </backingStore>
> >   </disk>
> > </disk>
>
> No, this doesn't make any sense and looks hackish.
>

Alright, let's discard this one. I prefer option 1 as well. It's just
unfortunate that two such similar things need two different XML
definitions. In one way it's good to differentiate between the two, but
at the same time it's nice not to have to repeat a whole lot.

> >
> > > Additionally, looking through the docs for block replication posted
> > > on qemu devel requires using unstable commands such as
> > > "x-blockdev-change", which, until they are stable, are not an option
> > > in libvirt.
> >
> > I agree that it is still way too early to push this stuff upstream.
> > But I think it's a good thing to start the discussion and experiment
> > with new features before they are marked as stable and the x- tag is
> > removed. That way libvirt is keeping up to date and can also impact
> > the current work done in qemu by doing early testing.
>
> In libvirt we don't have any prior approach for experimental features,
> thus we can't really use anything experimental. By using it we commit to
> supporting it. If you want to add anything experimental we will need to
> formulate a process to add a feature that may vanish later on.
>

Yeah, I'm not proposing that anyone should start pushing experimental
stuff upstream. What I'm saying is that it could be beneficial to start
testing out the experimental features before they are added as stable,
for the reasons stated earlier.
> >
> > > Also for all this to be useful, the rest of the COLO stuff needs to
> > > be merged too, otherwise the block replication code won't be used at
> > > all.
> >
> > From what I've gathered, block replication is supposed to be an
> > individual feature, separate from COLO. Even so, I see your point
> > about not including something that is not being used at the moment.
>
> Okay. Fair enough. In that case though, how does that differ from a
> blockdev-mirror job?

Some of the qemu devs could probably give you a better answer to this.
But I assume the two serve different purposes. I don't think they would
spend time creating something that already existed.

From what I can tell, mirroring/replicating writes from the primary to
the secondary guest is only one part of the block replication
functionality. It also takes care of buffering writes done by the
secondary until they synchronize, etc.

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
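For comparison, a plain mirror job is driven by a single QMP command,
roughly like this (a sketch only; the node names "drive0" and "target0"
are made up, and with blockdev-mirror the target node has to exist
already, e.g. added via blockdev-add):

    { "execute": "blockdev-mirror",
      "arguments": { "device": "drive0",
                     "target": "target0",
                     "sync": "full" } }

Block replication, by contrast, also sets up the active/hidden disk
chain and the write buffering on the secondary, which a mirror job
alone doesn't give you.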