On Wed, Dec 07, 2022 at 09:34:14AM -0400, Jason Gunthorpe wrote:
> The VFIO design assumes that the "vfio migration driver" will talk to
> both functions under the hood, and I don't see a fundamental problem
> with this beyond it being awkward with the driver core.

And while that is a fine concept per se, the current incarnation of
that is fundamentally broken as it is centered around the controlled
VM, which really can't work.

> Even the basic assumption that there would be a controlling/controlled
> relationship is not universally true. The mdev type drivers, and
> SIOV-like devices are unlikely to have that. Once you can use PASID
> the reasons to split things at the HW level go away, and a VF could
> certainly self-migrate.

Even then you need a controlling and a controlled entity.  The
controlling entity even in SIOV remains a PCIe function.  The
controlled entity might just be a bunch of hardware resources and
a PASID.  That makes it important again that all migration is driven
by the controlling entity.

Also the whole concept that only VFIO can do live migration is
a little bogus.  With checkpoint and restart it absolutely does make
sense to live migrate a container, and with that the hardware
interface (e.g. nvme controller) assigned to it.

> So, when you see both Intel and Pensando proposing this kind of
> layered model for NVMe where migration is subsystem-local to VFIO, I
> think this is where the inspiration is coming from. Their native DPU
> drivers already work this way.

Maybe they should have talked to someone not high on their own
supply before designing this.