On Thu, 23 Jul 2020 21:19:30 -0300
Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote:

> On Tue, Jul 21, 2020 at 11:54:49PM +0000, Tian, Kevin wrote:
>
> > In a nutshell, applications don't require raw WQ controllability as guest
> > kernel drivers may expect. Extending DSA user space interface to be another
> > passthrough interface just for virtualization needs is less compelling than
> > leveraging established VFIO/mdev framework (with the major merit that
> > existing user space VMMs just work w/o any change as long as they already
> > support VFIO uAPI).
>
> Sure, but the above is how the cover letter should have summarized
> that discussion, not as "it is not much code difference"
>
> > In last review you said that you didn't hard nak this approach and would
> > like to hear opinion from virtualization guys. In this version we CCed KVM
> > mailing list, Paolo (VFIO/Qemu), Alex (VFIO), Samuel (Rust-VMM/Cloud
> > hypervisor), etc. Let's see how they feel about this approach.
>
> Yes, the VFIO community should decide.
>
> If we are doing emulation tasks in the kernel now, then I can think of
> several nice semi-emulated mdevs to propose.
>
> This will not be some one off, but the start of a widely copied
> pattern.

And that's definitely a concern, there should be a reason for
implementing device emulation in the kernel beyond an easy path to get
a device exposed up through a virtualization stack. The entire idea of
mdev is the mediation of access to a device to make it safe for a user
and to fit within the vfio device API. Mediation, emulation, and
virtualization can be hard to differentiate, and there is some degree of
emulation required to fill out the device API, for vfio-pci itself
included. So I struggle with a specific measure of where to draw the
line, and also whose authority it is to draw that line. I don't think
it's solely mine, that's something we need to decide as a community.

If you see this as an abuse of the framework, then let's identify those
specific issues and come up with a better approach. As we've discussed
before, things like basic PCI config space emulation are acceptable
overhead and low risk (imo) and some degree of register emulation is
well within the territory of an mdev driver. Drivers are accepting
some degree of increased attack surface by each addition of a uAPI and
the complexity of those uAPIs, but it seems largely a decision for
those drivers whether they're willing to take on that responsibility
and burden.

At some point, possibly in the near-ish future, we might have a
vfio-user interface with userspace vfio-over-socket servers that might
be able to consume existing uAPIs and offload some of this complexity
and emulation to userspace while still providing an easy path to insert
devices into the virtualization stack. Hopefully if/when that comes
along, it would provide these sorts of drivers an opportunity to
offload some of the current overhead out to userspace, but I'm not sure
it's worth denying a mainline implementation now.

Thanks,
Alex