Re: Proposal for MMIO/PIO dispatch file descriptors (ioregionfd)

On Tue, Feb 25, 2020 at 12:19:55PM +0000, Felipe Franciosi wrote:
> Hi,
> 
> This looks amazing, Stefan. The lack of such a mechanism troubled us
> during the development of MUSER and resulted in the slow-path we have
> today for MMIO with register semantics (when writes cannot be
> overwritten before the device emulator has a chance to process them).
> 
> I have added some comments inline, but wanted to first link your
> proposal with an idea that I discussed with Maxim Levitsky back in
> Lyon and evolve on it a little bit. IIRC/IIUC Maxim was keen on a VT-x
> extension where a CPU could IPI another to handle events which would
> normally cause a VMEXIT. That is probably more applicable to the
> standard ioeventfd model, but it got me thinking about PML.
> 
> Bear with me. :)
> 
> In addition to an fd, which could be used for notifications only, the
> wire protocol could append "struct ioregionfd_cmd"s (probably renamed
> to drop "fd") to one or more pages (perhaps a ring buffer of sorts).
> 
> That would only work for writes; reads would still be synchronous.
> 
> The device emulator therefore doesn't have to respond to each write
> command. It could process the whole lot whenever it gets around to it.

Yes, the design supports that as follows:

1. Set the KVM_IOREGIONFD_POSTED_WRITES flag on regions that require
   asynchronous writes.
2. Use io_uring IORING_OP_READ with N * sizeof(struct ioregionfd_cmd)
   where N is the maximum number of commands to be processed in a batch.
   Also make sure the socket sndbuf is at least this large if you're
   using a socket fd.
3. Poll the io_uring completion queue (CQ) ring from userspace.  No
   syscalls are required on the fd.  If io_uring submission queue (SQ)
   polling is also enabled then no syscalls may be required at all.
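
The batching in steps 1-2 can be sketched roughly as follows.  This is
only an illustration under stated assumptions: a plain read() stands in
for io_uring IORING_OP_READ so that the example needs no extra library,
the struct layout is a hypothetical placeholder rather than the
proposal's definition of struct ioregionfd_cmd, and the names
ioregionfd_cmd_sketch, BATCH_MAX, ensure_sndbuf() and drain_batch() are
made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical layout, for illustration only.  The real struct
 * ioregionfd_cmd is whatever the proposal defines. */
struct ioregionfd_cmd_sketch {
    uint32_t info;
    uint32_t region_id;
    uint64_t addr;
    uint64_t data;
    uint8_t  pad[8];
};

/* N from step 2: the largest batch processed in one go (assumed value). */
#define BATCH_MAX 16

/* Make sure the writer side's send buffer can hold min_bytes, so that a
 * full batch can be queued before the writer blocks.
 * Returns 0 on success, -1 on error. */
static int ensure_sndbuf(int fd, size_t min_bytes)
{
    int cur = 0;
    socklen_t len = sizeof(cur);

    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &cur, &len) < 0)
        return -1;
    if (cur > 0 && (size_t)cur >= min_bytes)
        return 0;

    int want = (int)min_bytes;
    return setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &want, sizeof(want));
}

/* Read up to max_cmds commands in one go.  A stream fd may return a
 * partial command, so keep reading until the byte count is a whole
 * multiple of the command size.
 * Returns the number of complete commands, or -1 on error or EOF. */
static int drain_batch(int fd, struct ioregionfd_cmd_sketch *cmds,
                       size_t max_cmds)
{
    assert(max_cmds > 0);

    unsigned char *p = (unsigned char *)cmds;
    ssize_t n = read(fd, p, max_cmds * sizeof(*cmds));
    if (n <= 0)
        return -1;

    size_t got = (size_t)n;
    while (got % sizeof(*cmds) != 0) {
        n = read(fd, p + got, sizeof(*cmds) - got % sizeof(*cmds));
        if (n <= 0)
            return -1;
        got += (size_t)n;
    }
    return (int)(got / sizeof(*cmds));
}
```

With io_uring the single read() would instead be submitted as one
IORING_OP_READ of the same size and its completion picked up from the
CQ ring, which is what removes the per-command syscall.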

If the fd ceases to be writeable because the device emulation program is
not reading struct ioregionfd_cmds quickly enough then the vCPU will be
blocked until the fd becomes writeable again.  But this shouldn't be an
issue for a poll mode device emulation program.
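
The backpressure described above can be observed from userspace with a
small sketch.  Assumptions: a non-blocking writer on a socketpair stands
in for the kernel side (the real writer would block the vCPU rather than
see EAGAIN), CMD_SIZE is an assumed command size, and
set_nonblocking(), fill_until_full() and drain_commands() are
hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Assumed size of one command in bytes (hypothetical). */
#define CMD_SIZE 32

/* Put fd into non-blocking mode.  Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Stand-in for the producer: queue fixed-size commands until the fd
 * stops being writeable.  Returns the number of commands queued, or -1
 * on an unexpected error or short write. */
static long fill_until_full(int fd)
{
    char cmd[CMD_SIZE] = {0};
    long queued = 0;

    for (;;) {
        ssize_t n = write(fd, cmd, sizeof(cmd));
        if (n == (ssize_t)sizeof(cmd)) {
            queued++;
            continue;
        }
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return queued;   /* this is where a vCPU would block */
        return -1;
    }
}

/* Stand-in for the device emulator: consume count commands so the fd
 * becomes writeable again.  Returns 0 on success, -1 on error or EOF. */
static int drain_commands(int fd, long count)
{
    char buf[64 * CMD_SIZE];
    size_t left = (size_t)count * CMD_SIZE;

    while (left > 0) {
        size_t want = left < sizeof(buf) ? left : sizeof(buf);
        ssize_t n = read(fd, buf, want);
        if (n <= 0)
            return -1;
        left -= (size_t)n;
    }
    return 0;
}
```

Once the reader has consumed the queued commands, a further write on the
producer side succeeds again, which mirrors the vCPU resuming when the
fd becomes writeable.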

> Most importantly (and linking back to the VT-x extension idea), maybe
> we can avoid the VMEXIT altogether if the CPU could take care of
> appending writes to that buffer. Thoughts?

A VT-x extension would be nice.  Because it would be implemented in
hardware, though, it would probably need much simpler memory region
data structures.  For example, just a single buffer for *all* MMIO/PIO
accesses made by a vCPU.  That becomes difficult to use efficiently
when there are multiple device emulation processes.  It's an
interesting idea though.

Stefan
