On Mon, Sep 25, 2023 at 09:26:07AM -0300, Jason Gunthorpe wrote:
> > > So, as I keep saying, in this scenario the goal is no mediation in the
> > > hypervisor.
> >
> > That's pretty fine, but I don't think trapping + relying is not
> > mediation. Does it really matter what happens after trapping?
>
> It is not mediation in the sense that the kernel driver does not in
> any way make decisions on the behavior of the device. It simply
> transforms an IO operation into a device command and relays it to the
> device. The device still fully controls its own behavior.
>
> VDPA is very different from this. You might call them both mediation,
> sure, but then you need another word to describe the additional
> changes VPDA is doing.

Sorry about hijacking the thread a little bit, but could you
call out some of the changes that are the most problematic
for you?

> > > It is pointless, everything you think you need to do there
> > > is actually already being done in the DPU.
> >
> > Well, migration or even Qemu could be offloaded to DPU as well. If
> > that's the direction that's pretty fine.
>
> That's silly, of course qemu/kvm can't run in the DPU.
>
> However, we can empty qemu and the hypervisor out so all it does is
> run kvm and run vfio. In this model the DPU does all the OVS, storage,
> "VPDA", etc. qemu is just a passive relay of the DPU PCI functions
> into VM's vPCI functions.
>
> So, everything VDPA was doing in the environment is migrated into the
> DPU.
>
> In this model the DPU is an extension of the hypervisor/qemu
> environment and we shift code from x86 side to arm side to increase
> security, save power and increase total system performance.
>
> Jason

I think I begin to understand. On the DPU you have some virtio
devices but also some non-virtio devices. So you have to
use VFIO to talk to the DPU. Reusing VFIO to talk to virtio
devices too, simplifies things for you.
If guests will see vendor-specific devices from the DPU anyway,
it will be impossible to migrate such guests away from the DPU
so the cross-vendor migration capability is less important
in this use-case. Is this a good summary?

-- 
MST