Davide Libenzi wrote: > On Mon, 22 Jun 2009, Gregory Haskins wrote: > > >> I am probably confused or perhaps have the wrong terminology, but isnt >> that "ok". I am concerned about the consumer (the guy getting the >> POLLINs) to be able to detect POLLHUP when the last producer >> (f_ops->write() from userspace, eventfd_signal() from kernel) goes away. >> >> Consider the following sequence: >> >> ------------------- >> >> userspace calls "fd = eventfd()", and gives one to KVM as an irqfd, and >> the other to some PCI-passthrough device. >> >> The kvm/irqfd side acquires a kref, the pci side acquires a file. At >> this moment, userspace has the fd, and the pci device has the file (for >> eventfd_signal()). The fget() count is 2. Userspace closes the fd >> because its done with it, and the count drops to 1. >> >> Some time later, pci does an fput(), and KVM sees the POLLHUP and cleans up. >> >> ------------------- >> >> In this new model, the POLLHUP would have gone out as soon as userspace >> closed the fd, even though the intended producer (the PCI device) and >> the consumer (the KVM guest) are still up and running. This doesnt seem >> right to me. Or am I missing something? >> > > What you're doing there, is setting up a kernel-to-kernel (since > userspace only role is to create the eventfd) communication, using a file* > as accessory. That IMO is plain wrong. > If userspace is either the producer, or the consumer, and you need to > handle userspace leaving the building, you need to: > > file = eventfd_fget(fd); > ctx = eventfd_ctx_get(file); /* Eventually, if producer */ > eventfd_pollcb_register(file, ...); > fput(file); > > In your case of kernel-to-kernel scenario, why would you need eventfd at > all, if userspace role in that model is simply to create it? > The general thesis is for decoupling of the two subsystems. In order to do this, you need some form of polymorphism and an intermediate "handle" mechanism which is userspace friendly. File-descriptors already fit this role neatly, with the "int fd" being the handle, and the f_ops being the polymorphic interface. Eventfd is of course, a subclass of this concept in that it has these same general properties but with signaling semantics (non-blocking collapsible events, etc). Say, for example, you wanted disk IO completion events to generate an interrupt into a guest. One way to do this would, of course, modify all the disk-io code so it knows how to directly inject a KVM guest interrupt. While this would work, someone would undoubtedly get flamed for such a suggestion ;) Another way to do it is to treat the AIO eventfd as the hook point. IIUC AIO already knows how to be an eventfd producer. KVM, by virtue of irqfd, already knows how to be an eventfd consumer. So now kvm can consume AIO, or it can consume userspace events equally well, and without modification. Neither side needs to know about the other per se, other than the details on how to use the eventfd interface. Don't get me wrong: We expect userspace to use all this stuff too. I just expect that we will see all permutations of producer/consumer + userspace/kernel combinations, so I want to retain that "all producers have left" notification feature set. Today eventfd supports producers or consumers in userspace, and producers in the kernel. This new work we are doing adds consumer support in the kernel. Kernel to kernel is just a natural extension of that. -Greg > There are more effective ways to have in kernel communication channels, > than resorting to userspace link facilities like eventfd. > > > > - Davide > > > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html >
Attachment:
signature.asc
Description: OpenPGP digital signature