Re: [RFC 1/2] KVM: add initial support for KVM_SET_IOREGION

On 2021/1/15 12:16 AM, Stefan Hajnoczi wrote:
On Thu, Jan 14, 2021 at 12:05:00PM +0800, Jason Wang wrote:
On 2021/1/13 11:52 PM, Stefan Hajnoczi wrote:
On Wed, Jan 13, 2021 at 10:38:29AM +0800, Jason Wang wrote:
On 2021/1/8 1:53 AM, Stefan Hajnoczi wrote:
On Thu, Jan 07, 2021 at 11:30:47AM +0800, Jason Wang wrote:
On 2021/1/6 11:05 PM, Stefan Hajnoczi wrote:
On Wed, Jan 06, 2021 at 01:21:43PM +0800, Jason Wang wrote:
On 2021/1/5 6:25 PM, Stefan Hajnoczi wrote:
On Tue, Jan 05, 2021 at 11:53:01AM +0800, Jason Wang wrote:
On 2021/1/5 8:02 AM, Elena Afanasova wrote:
On Mon, 2021-01-04 at 13:34 +0800, Jason Wang wrote:
On 2021/1/4 4:32 AM, Elena Afanasova wrote:
On Thu, 2020-12-31 at 11:45 +0800, Jason Wang wrote:
On 2020/12/29 6:02 PM, Elena Afanasova wrote:
2. If separate userspace threads process the virtqueues, then set up the
      virtio-pci capabilities so the virtqueues have separate notification
      registers:
      https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-1150004
Right. But this works only when the PCI transport is used and the queue index
can be deduced from the register address (separate doorbells).

If we use MMIO, or share one doorbell register among all the virtqueues
(notify_off_multiplier is zero in the above case), it can't work without
datamatch.
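
To make the point concrete: per the virtio 1.1 spec linked above, a queue's notification register sits at cap.offset + queue_notify_off * notify_off_multiplier within the BAR. A minimal sketch, where notify_addr() and the sample values are made up for illustration and are not kernel or QEMU code:

```c
#include <stdint.h>

/*
 * Offset of a virtqueue's notification register within the BAR, following
 * the formula in the virtio 1.1 spec:
 *   cap.offset + queue_notify_off * notify_off_multiplier
 * notify_addr() is an illustrative name, not an existing function.
 */
static uint64_t notify_addr(uint32_t cap_offset, uint16_t queue_notify_off,
                            uint32_t notify_off_multiplier)
{
    return (uint64_t)cap_offset +
           (uint64_t)queue_notify_off * notify_off_multiplier;
}
```

With a multiplier of 4, queues 0 and 2 get distinct addresses, so the address alone identifies the queue. With a multiplier of 0 every queue maps to the same address, and only the written value can tell them apart.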
True. Can you think of an application that needs to dispatch a shared
doorbell register to several threads?

I think it depends on the semantics of the doorbell register. I guess one
example is the virtio-mmio multiqueue device.
Good point. virtio-mmio really needs datamatch if virtqueues are handled
by different threads.

If this is a case that real-world applications need then we should
tackle it. This is where eBPF would be appropriate. I guess the
interface would be something like:

    /*
     * A custom demultiplexer function that returns the index of the <wfd,
     * rfd> pair to use or -1 to produce a KVM_EXIT_IOREGION_FAILURE that
     * userspace must handle.
     */
    int demux(const struct ioregionfd_cmd *cmd);

Userspace can install an eBPF demux function as well as an array of
<wfd, rfd> fd pairs. The demux function gets to look at the cmd in order
to decide which fd pair it is sent to.
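
As an illustration of such a demux function for the virtio-mmio case, here is a minimal sketch in plain C rather than eBPF. struct demo_cmd is a simplified stand-in and not the layout of struct ioregionfd_cmd from the RFC, and NUM_QUEUES is an assumed value; only the QueueNotify offset (0x050) comes from the virtio-mmio register layout.

```c
#include <stdint.h>

#define VIRTIO_MMIO_QUEUE_NOTIFY 0x050 /* QueueNotify register offset */
#define NUM_QUEUES 4                   /* assumed number of <wfd, rfd> pairs */

/* Simplified stand-in for struct ioregionfd_cmd; NOT the RFC's layout. */
struct demo_cmd {
    uint64_t offset;   /* offset of the access within the region */
    uint64_t data;     /* value written by the guest */
    uint8_t  is_write; /* 1 for a write, 0 for a read */
};

/*
 * Return the index of the <wfd, rfd> pair to use, or -1 if the access
 * is not a doorbell write for a known queue.
 */
static int demux(const struct demo_cmd *cmd)
{
    if (cmd->is_write && cmd->offset == VIRTIO_MMIO_QUEUE_NOTIFY &&
        cmd->data < NUM_QUEUES)
        return (int)cmd->data;
    return -1;
}
```

A write of 2 to QueueNotify selects fd pair 2. Any other access returns -1, which in the proposal above would surface to userspace as KVM_EXIT_IOREGION_FAILURE.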

This is how I think eBPF datamatch could work. It's not as general as in
our original discussion where we also talked about custom protocols
(instead of struct ioregionfd_cmd/struct ioregionfd_resp).

Actually, they don't conflict. We can make it an eBPF ioregion, so that it's
the eBPF program that decides:

1) whether or not it needs to do datamatch
2) how many file descriptors it wants to use (storing the fds in a map)
3) what the protocol looks like

But as discussed, it could be an add-on on top of the hard logic of ioregion,
since there could be cases where eBPF is not allowed or not supported. So
adding simple datamatch support as a start might be a good choice.
Let's go further. Can you share pseudo-code for the eBPF program's
function signature (inputs/outputs)?


It could be something like this:

1) The eBPF program context could be defined as ioregion_ctx:

struct ioregion_ctx {
    gpa_t addr;
    int len;
    void *val;
};

2) The eBPF program returns 0 (IOREGION_OK) if it can handle this I/O request; otherwise it returns failure (IOREGION_FAIL).

So to implement datamatch, userspace is required to store the file descriptors for doorbell dispatching in a map (dispatch_map). For a virtio-style doorbell, we can simply:

- find the fd via a bpf map lookup
- build the protocol
- use an eBPF helper to send the command (I haven't checked, but I guess we need to invent new eBPF helpers for reading from and writing to a file)

Like:

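SEC("datamatch")
int datamatch_prog(struct ioregion_ctx *ctx)
{
    int *fd, ret;
    struct customized_protocol protocol;

    /* The written value is the key: look up the fd for this doorbell. */
    fd = bpf_map_lookup_elem(&dispatch_map, ctx->val);
    if (!fd)
        return IOREGION_FAIL;
    /* Build the message in whatever format the fd's reader expects. */
    build_protocol(ctx, &protocol);
    /* bpf_fd_write() does not exist yet; it stands for the new helper. */
    ret = bpf_fd_write(*fd, &protocol, sizeof(protocol));
    if (ret != sizeof(protocol))
        return IOREGION_FAIL;
    return IOREGION_OK;
}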
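
The same lookup/build/write flow can be tried out in ordinary userspace C, which may help when prototyping the protocol before any eBPF helper exists. This is only a sketch under assumptions: struct demo_protocol, dispatch_fds[] and datamatch_dispatch() are hypothetical names, a plain array stands in for the bpf map, write(2) stands in for the not-yet-existing helper, and the value of IOREGION_FAIL is made up (only IOREGION_OK == 0 is given above).

```c
#include <stdint.h>
#include <unistd.h>

#define MAX_QUEUES    4
#define IOREGION_OK   0
#define IOREGION_FAIL 1 /* assumed value; only "non-zero" is implied above */

/* Hypothetical wire format; stands in for struct customized_protocol. */
struct demo_protocol {
    uint64_t addr;  /* guest physical address of the access */
    uint32_t len;   /* access length in bytes */
    uint32_t queue; /* value written to the doorbell */
};

/* Stand-in for dispatch_map: doorbell value -> fd, -1 means "no entry". */
static int dispatch_fds[MAX_QUEUES] = { -1, -1, -1, -1 };

static int datamatch_dispatch(uint64_t addr, uint32_t len, uint32_t val)
{
    struct demo_protocol msg = { .addr = addr, .len = len, .queue = val };

    /* Map lookup: fail if no fd is registered for this doorbell value. */
    if (val >= MAX_QUEUES || dispatch_fds[val] < 0)
        return IOREGION_FAIL;
    /* Send the command; a short write counts as failure. */
    if (write(dispatch_fds[val], &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
        return IOREGION_FAIL;
    return IOREGION_OK;
}
```

Registering the write end of a pipe in dispatch_fds[1] and calling datamatch_dispatch(0x50, 4, 1) delivers one demo_protocol message to the reader of that pipe, while doorbell values with no registered fd return IOREGION_FAIL.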

Thanks



Stefan



