On Tue, 7 Dec 2021 11:37:43 -0400
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Tue, Dec 07, 2021 at 11:50:47AM +0100, Cornelia Huck wrote:
> > On Mon, Dec 06 2021, Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >
> > > On Fri, Dec 03, 2021 at 11:06:19AM -0700, Alex Williamson wrote:
> > >
> > >> This is exactly the sort of "designed for QEMU implementation"
> > >> inter-operability that I want to avoid. It doesn't take much of a
> > >> crystal ball to guess that gratuitous and redundant device resets
> > >> slow VM instantiation and are a likely target for optimization.
> > >
> > > Sorry, but Linus's "don't break userspace" forces us to this world.
> > >
> > > It does not matter what is written in text files, only what userspace
> > > actually does and the kernel must accommodate existing userspace going
> > > forward. So once released qemu forms some definitive spec and the
> > > guardrails that limit what we can do going forward.
> >
> > But QEMU support is *experimental*, i.e. if it breaks, you get to keep
> > the pieces, things may change in incompatible ways. And it is
> > experimental for good reason!
>
> And we can probably make an breakage exception for this existing
> experimental qemu.
>
> My point was going forward, once we userspace starts to become
> deployed, it doesn't matter what we write in these text files and
> comments. It only matters what deployed userspace actually does.

I think we're losing sight of my concern in designing for QEMU. The
document included a statement that migration driver writers could rely
on userspace performing a device reset prior to entering the RESUMING
device_state because of an unfounded correlation that QEMU resets the
VM on the way to loading device state.

Now, if we say QEMU does this thing and we need to support that usage
model, I'm 100% on board. If we turn it around and say QEMU does this
thing therefore migration drivers can expect exactly this usage model,
full stop, that's the wrong direction.
That is what I'm trying to avoid.

The obvious way to remove any question of breaking userspace is to
simply rev the migration region sub-type. The kernel stops exposing
any v1 sub-types, we don't break any userspace, and userspace needs to
be updated to v2 in order to continue having any functionality.

Thanks,
Alex