Re: MMIO/PIO dispatch file descriptors (ioregionfd) design discussion

On 2020/12/1 6:35 PM, Stefan Hajnoczi wrote:
On Tue, Dec 01, 2020 at 12:05:04PM +0800, Jason Wang wrote:
On 2020/11/30 8:47 PM, Stefan Hajnoczi wrote:
On Mon, Nov 30, 2020 at 10:14:15AM +0800, Jason Wang wrote:
On 2020/11/27 9:44 PM, Stefan Hajnoczi wrote:
On Fri, Nov 27, 2020 at 11:39:23AM +0800, Jason Wang wrote:
On 2020/11/26 8:36 PM, Stefan Hajnoczi wrote:
On Thu, Nov 26, 2020 at 11:37:30AM +0800, Jason Wang wrote:
On 2020/11/26 3:21 AM, Elena Afanasova wrote:
Or I wonder whether we can attach an eBPF program when trapping MMIO/PIO and
allow it to decide how to proceed?
The eBPF program approach is interesting, but it would probably require
access to guest RAM and additional userspace state (e.g. device-specific
register values). I don't know the current status of Linux eBPF - is it
possible to access user memory (it could be swapped out)?
AFAIK it isn't, but just to make sure I understand: is there any reason
that eBPF needs to access userspace memory here?
Maybe we're thinking of different things. In the past I've thought about
using eBPF to avoid a trip to userspace for request submission and
completion, but that requires virtqueue parsing from eBPF and guest RAM
access.

I see. I've considered something similar, e.g. using an eBPF dataplane in
vhost, but it requires a lot of work. For guest RAM access, we could probably
provide some eBPF helpers to do that, but we would need a strong argument to
convince the eBPF maintainers.


Are you thinking about replacing ioctl(KVM_SET_IOREGION) and all the
necessary kvm.ko code with an ioctl(KVM_SET_IO_PROGRAM), where userspace
can load an eBPF program into kvm.ko that gets executed when an MMIO/PIO
access occurs?

Yes.


   Wouldn't it need to write to userspace memory to store
the ring index that was written to the doorbell register, for example?

The program itself can choose what to do:

1) do datamatch and write/wakeup eventfd

or

2) transport the write via an arbitrary fd, as is done in this
proposal, but with a userspace-defined protocol
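
To make the two options above concrete, here is a minimal userspace sketch
of the decision such a program would make. It is plain C rather than eBPF,
and struct io_access, struct io_policy and dispatch_access() are
hypothetical names invented for illustration; none of them exist in KVM or
in the ioregionfd patches. The notifier would be an eventfd in a real
setup, but any writable fd works for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical description of one trapped guest access (not a KVM struct). */
struct io_access {
    uint64_t addr;     /* guest physical address or port */
    uint64_t data;     /* value written (ignored for reads) */
    uint32_t len;      /* access width in bytes: 1, 2, 4 or 8 */
    bool     is_write;
};

/* Hypothetical per-region policy consulted on every access. */
struct io_policy {
    uint64_t doorbell_addr; /* address whose writes are datamatched */
    uint64_t datamatch;     /* value that triggers the notification */
    int      notify_fd;     /* an eventfd in a real setup */
    int      forward_fd;    /* fd carrying the userspace-defined protocol */
};

enum io_verdict { IO_NOTIFIED, IO_FORWARDED, IO_ERROR };

static enum io_verdict dispatch_access(const struct io_policy *p,
                                       const struct io_access *a)
{
    assert(a->len == 1 || a->len == 2 || a->len == 4 || a->len == 8);

    /* Option 1: datamatch on a doorbell write, then signal the notifier. */
    if (a->is_write && a->addr == p->doorbell_addr &&
        a->data == p->datamatch) {
        uint64_t one = 1; /* an eventfd write adds an 8-byte value */
        return write(p->notify_fd, &one, sizeof(one)) == (ssize_t)sizeof(one)
                   ? IO_NOTIFIED : IO_ERROR;
    }

    /* Option 2: forward the raw access; the wire format is up to userspace. */
    return write(p->forward_fd, a, sizeof(*a)) == (ssize_t)sizeof(*a)
               ? IO_FORWARDED : IO_ERROR;
}
```

Forwarding the in-memory struct as-is is only acceptable in a sketch; a
real protocol would pin down the layout and byte order.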

How would the program communicate with userspace (eventfd isn't enough)
and how can it handle synchronous I/O accesses like reads?

I may be missing something, but can't it behave exactly as proposed in
this patch?
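
For the synchronous read case, a request/response exchange over a
bidirectional fd is one way to picture it. The sketch below shows only the
userspace device side. struct io_request, struct io_response, struct
toy_device, make_channel() and serve_one() are hypothetical names, and the
wire layout is an assumption made for illustration, not the format used by
the ioregionfd patches. In this sketch writes are applied without a reply,
while reads block the sender until the response arrives.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical wire format; not the layout used by the ioregionfd patches. */
struct io_request {
    uint64_t offset;   /* offset of the access within the region */
    uint64_t data;     /* value for writes, unused for reads */
    uint8_t  is_write; /* 1 = write, 0 = read */
    uint8_t  len;      /* access width in bytes */
    uint8_t  pad[6];   /* explicit padding keeps the layout fixed */
};
static_assert(sizeof(struct io_request) == 24, "fixed wire layout");

struct io_response {
    uint64_t data;     /* register value returned for a read */
};

/* Toy device model: a small bank of 64-bit registers. */
struct toy_device {
    uint64_t regs[4];
};

/* Create the bidirectional channel shared by the two sides. */
static int make_channel(int sv[2])
{
    return socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
}

/*
 * Serve one request from fd. A read gets a response carrying the register
 * value; a write is applied without a reply.
 * Returns 0 on success, -1 on a short read/write or an out-of-range offset.
 */
static int serve_one(int fd, struct toy_device *dev)
{
    struct io_request req;
    if (read(fd, &req, sizeof(req)) != (ssize_t)sizeof(req))
        return -1;

    uint64_t idx = req.offset / sizeof(dev->regs[0]);
    if (idx >= sizeof(dev->regs) / sizeof(dev->regs[0]))
        return -1;

    if (req.is_write) {
        dev->regs[idx] = req.data;
        return 0;
    }

    struct io_response resp = { .data = dev->regs[idx] };
    return write(fd, &resp, sizeof(resp)) == (ssize_t)sizeof(resp) ? 0 : -1;
}
```

The sending side would write a struct io_request to its end of the channel
and, for a read, block on the same fd until the struct io_response comes
back.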
I see. This seems to have two possible advantages:
1. Pushing the kvm.ko code into userspace thanks to eBPF. Less kernel
    code.
2. Allowing more flexible I/O dispatch logic (e.g. ioeventfd-style
    datamatch) and communication protocols.

I think #1 is minor because the communication protocol is trivial,
struct kvm_io_device can be reused for dispatch, and eBPF will introduce
some complexity itself.

#2 is more interesting but I'm not sure how to use this extra
flexibility to get a big advantage. Maybe vfio-user applications could
install an eBPF program that speaks the vfio-user protocol instead of
the ioregionfd protocol, making it easier to integrate ioregionfd into
vfio-user programs?


Yes, that could be one. Basically it shifts the policy from the kernel to userspace.



My opinion is that eBPF complicates things and since we lack a strong
use case for that extra flexibility, I would stick to the ioregionfd
proposal.

Elena, Jason: Do you have any opinions on this?


I agree. And we need a way to make it work without eBPF. Let's leave it for future investigation.

Thanks



Stefan



