On Tue, Oct 19, 2010 at 03:33:41PM +0200, Michael S. Tsirkin wrote:

Apologies if you receive this twice; the original message either
disappeared or was delayed somehow.

> My main concern is with the fact that we add more state
> in notifiers that can easily get out of sync with users.
> If we absolutely need this state, let's try to at least
> document the state machine, and make the API
> for state transitions more transparent.

I'll try to describe how it works.  If you're happy with the design in
principle then I can rework the code.  Otherwise we can think about a
different design.

The goal is to use ioeventfd instead of the synchronous pio emulation
path that userspace virtqueues use today.  Both virtio-blk and
virtio-net increase performance with this approach because it does not
block the vcpu from executing guest code while the I/O operation is
initiated.

We want to automatically create an event notifier and set up ioeventfd
for each initialized virtqueue.

Vhost already uses ioeventfd so it is important not to interfere with
devices that have enabled vhost.  If vhost is enabled, then the device's
virtqueues are off-limits and should not be tampered with.

Furthermore, older kernels limit you to 6 ioeventfds per guest.  On such
systems it is risky to automatically use ioeventfd for userspace
virtqueues, since that could take a precious ioeventfd away from another
virtio device using vhost.  Existing guest configurations would break so
it is simplest to avoid using ioeventfd for userspace virtqueues on such
hosts.

The design adds logic into hw/virtio.c to automatically use ioeventfd
for userspace virtqueues.  Specific virtio devices like blk and net
require no modification.  The logic sits below the set_host_notifier()
function that vhost uses.

This design stays in sync because it speaks two interfaces that allow it
to accurately track whether or not to use ioeventfd:

1. virtio_set_host_notifier() is used by vhost.  When vhost enables the
   host notifier we stay out of the way.

2. virtio_reset()/virtio_set_status()/virtio_load() define the device
   life-cycle and transition the state machine appropriately.  Migration
   is supported.

Here is the state machine that tracks a virtqueue:

                    assigned
                   ^ /    \ ^
               e. / / c. g. \ \ b.
                 / /          \ \
                / v     f.     v \      a.
        offlimits ------------> deassigned <-- start
                  <------------
                        d.

a. The virtqueue starts deassigned with no ioeventfd.

b. When the device status becomes VIRTIO_CONFIG_S_DRIVER_OK we try to
   assign an ioeventfd to each virtqueue, except if the 6 ioeventfd
   limitation is present.

c, d. The virtqueue becomes offlimits if vhost enables the host
   notifier.

e. The ioeventfd becomes assigned again when the host notifier is
   disabled by vhost.

f. If the 6 ioeventfd limitation is present, the virtqueue instead
   becomes deassigned when vhost disables the host notifier, because we
   want to avoid using ioeventfd.

g. When the device is reset its virtqueues become deassigned again.

Does this make sense?

Stefan
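
For reference, transitions a. through g. could be written down roughly
as the C sketch below.  This is only an illustration of the state
machine as described in this mail, not code from the patch.  All
identifiers (VqState, VqEvent, vq_next_state, ioeventfd_limited) are
hypothetical.  Two points the description leaves open are filled in as
assumptions and marked in the comments: a reset leaves an offlimits
virtqueue alone, and events that have no arrow in the diagram keep the
current state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; none of these identifiers come from the patch. */
typedef enum {
    VQ_DEASSIGNED,   /* no ioeventfd; the start state (a.) */
    VQ_ASSIGNED,     /* ioeventfd in use for the userspace virtqueue */
    VQ_OFFLIMITS     /* vhost has enabled the host notifier */
} VqState;

typedef enum {
    VQ_EV_DRIVER_OK,      /* status becomes VIRTIO_CONFIG_S_DRIVER_OK */
    VQ_EV_VHOST_ENABLE,   /* vhost enables the host notifier */
    VQ_EV_VHOST_DISABLE,  /* vhost disables the host notifier */
    VQ_EV_RESET           /* device reset */
} VqEvent;

/*
 * Return the state that follows `s` after event `ev`.
 * `ioeventfd_limited` is true on hosts with the 6 ioeventfd limitation.
 */
static VqState vq_next_state(VqState s, VqEvent ev, bool ioeventfd_limited)
{
    switch (ev) {
    case VQ_EV_DRIVER_OK:
        /* b. assign unless the host is limited */
        if (s == VQ_DEASSIGNED && !ioeventfd_limited) {
            return VQ_ASSIGNED;
        }
        return s;
    case VQ_EV_VHOST_ENABLE:
        /* c, d. vhost takes over from either state */
        return VQ_OFFLIMITS;
    case VQ_EV_VHOST_DISABLE:
        if (s != VQ_OFFLIMITS) {
            return s;  /* assumption: no arrow, so no change */
        }
        /* f. on limited hosts, otherwise e. */
        return ioeventfd_limited ? VQ_DEASSIGNED : VQ_ASSIGNED;
    case VQ_EV_RESET:
        /* g. assigned goes back to deassigned */
        if (s == VQ_ASSIGNED) {
            return VQ_DEASSIGNED;
        }
        return s;  /* assumption: offlimits is left to vhost */
    }
    assert(!"unhandled event");
    return s;
}
```

Keeping the transition logic in one pure function like this would make
it easy to check every arrow of the diagram in isolation, with the
actual ioeventfd assign/deassign calls driven by the returned state.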