Support an inter-vm shared memory device that maps a shared-memory
object as a PCI device in the guest. This patch also supports
interrupts between guests by communicating over a unix domain socket.
This patch applies to the qemu-kvm repository.

Changes in this version are the use of the qdev format and the optional
use of MSI and ioeventfd/irqfd.

The non-interrupt version is supported by passing the shm parameter

    -device ivshmem,size=<size in MB>[,shm=<shm_name>]

which will simply map the shm object into a BAR.

Interrupts are supported between multiple VMs by using a shared memory
server that is connected to with a socket character device

    -device ivshmem,size=<size in MB>[,chardev=<chardev name>][,irqfd=on]
            [,msi=on][,nvectors=n]
    -chardev socket,path=<path>,id=<chardev name>

The server passes file descriptors for the shared memory object and
eventfds (our interrupt mechanism) to the respective qemu instances.

When using interrupts, VMs communicate with a shared memory server that
passes the shared memory object file descriptor using SCM_RIGHTS. The
server assigns each VM an ID number and sends this ID number to the Qemu
process along with a series of eventfd file descriptors, one per guest
using the shared memory server. These eventfds will be used to send
interrupts between guests. Each guest listens on the eventfd
corresponding to its ID and may use the others for sending interrupts
to other guests.

    enum ivshmem_registers {
        IntrMask = 0,
        IntrStatus = 4,
        IVPosition = 8,
        Doorbell = 12
    };

The first two registers are the interrupt mask and status registers.
Mask and status are only used with pin-based interrupts. They are
unused with MSI interrupts. The IVPosition register is read-only and
reports the guest's ID number. Interrupts are triggered when a message
is received on the guest's eventfd from another VM. To trigger an
event, a guest must write to another guest's Doorbell. The "Doorbells"
begin at offset 12.
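
As an illustration of the register layout just listed, here is a minimal
sketch of guest-side accessors. It is not taken from the patch. The
helper names (ivshmem_read32, ivshmem_write32, ivshmem_my_id) are
hypothetical, and it assumes that each register is accessed as a 32-bit
word (the offsets are 4 bytes apart) and that `regs` already points at
the mapped register region; how that mapping is obtained is outside the
scope of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register offsets, copied from the description. */
enum ivshmem_registers {
    IntrMask   = 0,
    IntrStatus = 4,
    IVPosition = 8,
    Doorbell   = 12
};

/* Hypothetical helper: read the 32-bit register at byte offset `reg`.
 * `regs` is assumed to point at the start of the mapped register region. */
static inline uint32_t ivshmem_read32(const volatile void *regs, size_t reg)
{
    assert(reg % 4 == 0);   /* assumption: registers are 32-bit aligned */
    return *(const volatile uint32_t *)((const volatile uint8_t *)regs + reg);
}

/* Hypothetical helper: write the 32-bit register at byte offset `reg`. */
static inline void ivshmem_write32(volatile void *regs, size_t reg,
                                   uint32_t val)
{
    assert(reg % 4 == 0);
    *(volatile uint32_t *)((volatile uint8_t *)regs + reg) = val;
}

/* The guest's own ID, as reported by the read-only IVPosition register. */
static inline uint32_t ivshmem_my_id(const volatile void *regs)
{
    return ivshmem_read32(regs, IVPosition);
}
```

A guest would typically read its ID once with ivshmem_my_id() and then
use the IDs of its peers to pick which doorbell to write.
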
A particular guest's doorbell offset in the MMIO region is equal to

    guest_id * 32 + Doorbell

The doorbell register for each guest is 32 bits wide. The
doorbell-per-guest design was motivated by its use with ioeventfd.

The semantics of the value written to the doorbell depend on whether
the device is using MSI or a regular pin-based interrupt.

Regular Interrupts
------------------

If regular interrupts are used (due to either a guest not supporting MSI
or the user specifying not to use them on the command-line) then the
value written to a guest's doorbell is what the guest's status register
will be set to.

A status of (2^32 - 1) indicates that a new guest has joined. Guests
should not send a message of this value for any other reason.

Message Signalled Interrupts
----------------------------

The important thing to remember with MSI is that it is only a signal;
no status is set (since MSI interrupts are not shared). All information
other than the interrupt itself should be communicated via the shared
memory region.

MSI is on by default. It can be turned off by passing msi=off to the
device parameter.

If the device uses MSI then the value written to the doorbell is the
MSI vector that will be raised. Vector 0 is used to notify that a new
guest has joined. Vector 0 cannot be triggered by another guest since a
value of 0 does not trigger an eventfd.

ioeventfd/irqfd
---------------

ioeventfd/irqfd is turned on by passing irqfd=on to the device
parameter (it is off by default). When using ioeventfd/irqfd the only
interrupt value that can be passed to another guest is 1, regardless of
the value written to a guest's Doorbell.
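
The doorbell addressing and the per-mode value semantics described in
the sections above can be summarised in a short sketch. It is only an
illustration, not code from the patch: every helper name is
hypothetical, the multiplier 32 is taken verbatim from the formula in
the text, and the functions model the rules as stated here without
touching real hardware.

```c
#include <assert.h>
#include <stdint.h>

enum { Doorbell = 12 };          /* first doorbell offset, from the register list */
enum { DOORBELL_STRIDE = 32 };   /* multiplier taken verbatim from the formula */

/* Pin-based interrupts: this status value means "a new guest has joined". */
#define IVSHMEM_NEW_GUEST_STATUS UINT32_MAX   /* 2^32 - 1 */

/* Hypothetical helper: byte offset of a guest's doorbell in the MMIO region. */
static inline uint32_t ivshmem_doorbell_offset(uint32_t guest_id)
{
    /* Guard against 32-bit overflow of the offset computation. */
    assert(guest_id <= (UINT32_MAX - Doorbell) / DOORBELL_STRIDE);
    return guest_id * DOORBELL_STRIDE + Doorbell;
}

/* Pin-based: the value written becomes the receiver's status register. */
static inline int ivshmem_status_is_new_guest(uint32_t status)
{
    return status == IVSHMEM_NEW_GUEST_STATUS;
}

/* MSI: the value written is the vector to raise. Vector 0 is reserved
 * for the "new guest" notification and cannot be triggered by a peer. */
static inline int ivshmem_msi_vector_is_sendable(uint32_t vector)
{
    return vector != 0;
}

/* Value the receiver observes: with ioeventfd/irqfd enabled it is always
 * 1, whatever was written; otherwise it is the written value itself. */
static inline uint32_t ivshmem_delivered_value(uint32_t written, int irqfd_on)
{
    return irqfd_on ? 1u : written;
}
```

For example, under the formula as written, guest 3's doorbell sits at
offset 3 * 32 + 12 = 108, and a sender using pin-based interrupts should
avoid writing 2^32 - 1 there unless it is announcing a new guest.
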

Sample programs, init scripts and the shared memory server are available
in a git repo here:

    www.gitorious.org/nahanni

Cam Macdonell (2):
  Support adding a file to qemu's ram allocation
  Inter-VM shared memory PCI device

 Makefile.target |    3 +
 cpu-common.h    |    1 +
 exec.c          |   33 +++
 hw/ivshmem.c    |  622 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 qemu-char.c     |    6 +
 qemu-char.h     |    3 +
 6 files changed, 668 insertions(+), 0 deletions(-)
 create mode 100644 hw/ivshmem.c