Davide Libenzi wrote:
> On Mon, 22 Jun 2009, Gregory Haskins wrote:
>
>> Michael S. Tsirkin wrote:
>>
>>> On Mon, Jun 22, 2009 at 11:51:42AM -0700, Davide Libenzi wrote:
>>>
>>>> A file* based kernel-to-kernel interface is rather wrong IMO.
>>>
>>> But eventfd_ctx should work fine.
>>
>> Yeah, and I guess we can always just say that qemu can't close the fd or
>> something. Seems hacky, but it might work if Davide insists we need his
>> change.
>
> Continuing here, since there's no reason of having many subthreads talking
> about the same thing.
> Can you make a detailed example of what you're trying to achieve (no Hint
> Mode, please)?
> As it sounds to me, that you need a consumer/producer reference counting,
> to cover your scenario correctly.

Well, one of them was already briefly mentioned (the PCI-passthrough
thing). I am not personally working on this part (yet, anyway).

Another example, and something I am actually working on as we speak,
is this thing we are building called "virtual-bus". It is a way to
build/deploy device models directly in the kernel.

In either of these cases, we have this concept of allowing the guest to
notify the host, or vice versa, that something happened. Typically this
would be in reference to some chunk of shared memory, and the signaling
is telling the other side "I changed something, go look".

Without going into a ton of detail (unless, of course, you want it), we
are generalizing the signaling infrastructure (irqfd and iosignalfd) so
that something like PCI-passthrough or vbus is not directly coupled to
KVM. They communicate with KVM purely in terms of (among other things)
these irqfd/iosignalfd interfaces.

Using vbus as an example (though others are similar): vbus would
primarily exist as a kernel model. However, there would be a small
device model in qemu-kvm userspace to publish something like a PCI
device that declares its resource requirements to the guest.
Some of those requirements would be things like how many interrupts it
needs, what IO ranges it supports, etc. When the guest programs the PCI
space, it maps the resources from its own world into the virtual PCI
resources emulated in qemu.

So up in userspace, the vbus pci-device would have an open reference to
the kvm guest (derived from /dev/kvm) and an open reference to a vbus
(derived from /dev/vbus). Let's call these kvmfd and vbusfd,
respectively.

For something like an interrupt, we would hook the point where the
PCI-MSI interrupt is assigned, and would do the following:

    gsi = kvm_irq_route_gsi();
    fd = eventfd(0, 0);
    ioctl(kvmfd, KVM_IRQFD_ASSIGN, {fd, gsi});
    ioctl(vbusfd, VBUS_SHMSIGNAL_ASSIGN, {sigid, fd});

So userspace orchestrated the assignment of this one eventfd to a KVM
consumer and a VBUS producer. The two subsystems do not care about the
details of the other side of the link, per se. VBUS just knows that it
can eventfd_signal() its memory region to tell whoever is listening
that it changed. Likewise, KVM just knows to inject "gsi" when it gets
signalled. You could equally have given "fd" to a userspace thread for
either the producer or the consumer role, or any other combination.

If we were doing PCI-passthrough, substitute the last SHMSIGNAL_ASSIGN
ioctl call with some PCI_PASSTHROUGH_ASSIGN verb and you get the idea.

The important thing is that once this is established, userspace doesn't
necessarily care about the fd anymore. So now the question is: do we
keep it around for other things? Do we keep it around because we don't
want KVM to see the POLLHUP, or do we address the "release" code so
that it works even if userspace issued close(fd) at this point? I am
not sure what the answer is, but this is the scenario we are concerned
with in this thread.

In the example above, vbus is free to produce events on its eventfd
until it gets a SHMSIGNAL_DEASSIGN request.
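
To make the fd-lifetime question concrete, here is a minimal userspace
sketch of the producer/consumer pattern described above. It is an
illustration only, not code from the irqfd or vbus patches. The helper
names (signal_event, consume_events, demo_close_after_assign) are
hypothetical, and the two dup()'d descriptors are merely a userspace
stand-in for the references that KVM and vbus would hold inside the
kernel. It does not model the POLLHUP/release behavior being debated;
it only shows that an eventfd keeps working after the orchestrator
closes its own descriptor, as long as other references remain.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Producer side: add 1 to the eventfd counter
 * ("I changed something, go look"). Returns 0 on success, -1 on error. */
static int signal_event(int efd)
{
    uint64_t one = 1;
    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Consumer side: read and reset the counter. Returns the number of
 * signals accumulated since the last read, or 0 on error. */
static uint64_t consume_events(int efd)
{
    uint64_t count = 0;
    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return 0;
    return count;
}

/* Hypothetical demo: the orchestrator creates the eventfd, hands one
 * reference to a "producer" and one to a "consumer" (dup() stands in
 * for the in-kernel references), then closes its own descriptor. */
static uint64_t demo_close_after_assign(void)
{
    uint64_t seen = 0;
    int fd = eventfd(0, 0);
    if (fd < 0)
        return 0;

    int producer = dup(fd);
    int consumer = dup(fd);
    close(fd); /* the orchestrator no longer cares about the fd */

    /* Two signals are coalesced into a single counter value. */
    if (producer >= 0 && consumer >= 0 &&
        signal_event(producer) == 0 && signal_event(producer) == 0)
        seen = consume_events(consumer);

    if (producer >= 0)
        close(producer);
    if (consumer >= 0)
        close(consumer);
    return seen;
}
```

On Linux, demo_close_after_assign() should return 2: both signals
arrive even though the original descriptor was closed before any event
was produced. Whether the in-kernel consumer should behave the same way
after userspace calls close(fd) is exactly the open question here.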

-Greg

>
> - Davide