On Thu, 14 Jun 2018 18:57:11 -0700
Siwei Liu <loseweigh@xxxxxxxxx> wrote:

> Thank you for sharing your thoughts, Cornelia. With the questions
> below, I think you raised really good points, some of which I don't
> have an answer to yet and would also like to explore here.
>
> First off, I don't want to push the discussion to the extreme at this
> point, or sell anything about having QEMU manage everything
> automatically. Don't get me wrong, it's not there yet. Let's not
> assume we are tied to a specific or concrete solution. I think the
> key for our discussion might be to define or refine the boundary
> between VMM and guest, e.g. what each layer is expected to control
> and manage exactly.
>
> In my view, there are possibly three different options to represent
> the failover device concept to QEMU and libvirt (or any upper layer
> software):
>
> a. Separate devices: in this model, virtio and passthrough remain
> separate devices, just as today. QEMU exposes the standby feature
> bit for virtio, and publishes status/events around the negotiation
> process of this feature bit for libvirt to react upon. Since libvirt
> has the pairing relationship itself, maybe through the MAC address
> or something else, it can control the presence of the primary by hot
> plugging or unplugging the passthrough device, although it has to
> work tightly with virtio's feature negotiation process, not just for
> migration but also for various corner scenarios (driver/feature ok,
> device reset, reboot, legacy guest, etc.) along virtio's feature
> negotiation.

Yes, that one has obvious tie-ins to virtio's modus operandi.

> b. Coupled devices: in this model, the virtio and passthrough
> devices are weakly coupled using some group ID, i.e. QEMU matches
> the passthrough device to a standby virtio instance by comparing the
> group ID value present behind each device's bridge. Libvirt provides
> QEMU with the group ID for both types of devices, and only deals
> with hot plug for migration, by checking some migration status
> exposed by QEMU (e.g. the feature negotiation status on the virtio
> device). QEMU manages the visibility of the primary in the guest
> along virtio's feature negotiation process.

I'm a bit confused here. What, exactly, ties the two devices together?
If libvirt already has the knowledge that it should manage the two as
a couple, why do we need the group id (or something else for other
architectures)? (Maybe I'm simply missing something because I'm not
that familiar with pci.)

> c. Fully combined device: in this model, the virtio and passthrough
> devices are viewed as a single VM interface altogether. QEMU not
> only controls the visibility of the primary in the guest, but can
> also manage the exposure of the passthrough device for
> migratability. It could be that libvirt supplies the group ID to
> QEMU; or libvirt does not even have to provide a group ID for
> grouping the two devices, if just one single combined device is
> exposed by QEMU. In either case, QEMU manages all aspects of such an
> internal construct, including virtio feature negotiation, presence
> of the primary, and live migration.

Same question as above.

> It looks to me that, in your opinion, you seem to prefer to go with
> (a), while I'm actually okay with either (b) or (c). Do I understand
> your point correctly?

I'm not yet preferring anything, as I'm still trying to understand how
this works :) I hope we can arrive at a model that covers the use case
and that is also flexible enough to be extended to other platforms.
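For concreteness, I suppose (b) would mean libvirt handing both
devices to QEMU with a matching tag, something like the following
sketch (the 'standby' and 'failover-group' properties are made up
here purely for illustration; nothing like that exists today):

  qemu-system-x86_64 ... \
    -device virtio-net-pci,id=virtio0,netdev=net0,mac=52:54:00:12:34:56,standby=on \
    -device vfio-pci,host=0000:04:10.1,id=hostdev0,failover-group=virtio0

QEMU would then keep hostdev0 hidden until the guest driver bound to
virtio0 has negotiated the standby feature. Which is where my question
comes from: if libvirt already wrote that pairing into the command
line, what extra information does the group id carry?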
> The reason that I feel that (a) might not be ideal, just as Michael
> alluded to (quoting below), is that the management stack really
> doesn't need to care about the detailed process of feature
> negotiation (if we view the guest presence of the primary as part of
> feature negotiation at an extended level, not just virtio). All it
> needs to do is hand the required devices to QEMU, and that's all.
> Why do we need to add various hooks and events for whatever happens
> internally within the guest?
>
> ''
> Primary device is added with a special "primary-failover" flag.
> A virtual machine is then initialized with just a standby virtio
> device. Primary is not yet added.
>
> Later QEMU detects that the guest driver set DRIVER_OK on the device.
> It then exposes the primary device to the guest, and triggers
> a device addition event (hot-plug event) for it.
>
> If QEMU detects guest driver removal, it initiates a hot-unplug
> sequence to remove the primary device. In particular, if QEMU
> detects guest re-initialization (e.g. by detecting guest reset) it
> immediately removes the primary device.
> ''
>
> and,
>
> ''
> management just wants to give the primary to the guest and later
> take it back; it really does not care about the details of the
> process, so I don't see what pushing it up the stack buys you.
>
> So I don't think it *needs* to be done in libvirt. It probably can
> be, if you add a bunch of hooks so it knows whenever the vm reboots,
> the driver binds and unbinds from the device, and can check that the
> backup flag was set. If you are pushing for a setup like that please
> get a buy-in from the libvirt maintainers, or better, write a patch.
> ''

This actually seems to mean the opposite to me: we need to know what
the guest is doing and when, as it directly drives what we need to do
with the devices. If we switch to a visibility model rather than a
hotplug model (see the other mail), we might be able to handle that
part within qemu. However, I don't see how you get around needing
libvirt to actually set this up in the first place and to handle
migration per se.
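To make that visibility model concrete, I'd picture roughly the
following flow (the event names are invented; this is a sketch of the
idea, not an existing interface):

  1. libvirt starts QEMU with the standby virtio device visible and
     the primary present but hidden (e.g. tagged with the group id)
  2. the guest driver negotiates the standby feature (e.g.
     VIRTIO_NET_F_STANDBY) and sets DRIVER_OK
  3. QEMU makes the primary visible to the guest and emits a QMP
     event, say
       {"event": "FAILOVER_PRIMARY_EXPOSED", "data": {"id": "hostdev0"}}
  4. on guest reset or driver unbind, QEMU hides the primary again
     and emits a matching FAILOVER_PRIMARY_HIDDEN event
  5. for migration, libvirt still drives things: it asks QEMU to hide
     the primary, waits for the corresponding event, and only then
     starts the actual migration

qemu could plausibly cover steps 2-4 internally; steps 1 and 5 are
where I don't see how libvirt can be taken out of the picture.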