Re: [PATCH RFC v2] vfio: Documentation for the migration region

Alex Williamson <alex.williamson@xxxxxxxxxx> · Tue, 7 Dec 2021 09:22:39 -0700

On Tue, 7 Dec 2021 11:37:43 -0400
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Tue, Dec 07, 2021 at 11:50:47AM +0100, Cornelia Huck wrote:
> > On Mon, Dec 06 2021, Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >   
> > > On Fri, Dec 03, 2021 at 11:06:19AM -0700, Alex Williamson wrote:  
> >   
> > >> This is exactly the sort of "designed for QEMU implementation"
> > >> inter-operability that I want to avoid.  It doesn't take much of a
> > >> crystal ball to guess that gratuitous and redundant device resets
> > >> slow VM instantiation and are a likely target for optimization.  
> > >
> > > Sorry, but Linus's "don't break userspace" forces us to this world.
> > >
> > > It does not matter what is written in text files, only what userspace
> > > actually does and the kernel must accommodate existing userspace going
> > > forward. So once released qemu forms some definitive spec and the
> > > guardrails that limit what we can do going forward.  
> > 
> > But QEMU support is *experimental*, i.e. if it breaks, you get to keep
> > the pieces, things may change in incompatible ways. And it is
> > experimental for good reason!  
> 
> And we can probably make a breakage exception for this existing
> experimental qemu.
> 
> My point was going forward, once userspace starts to become
> deployed, it doesn't matter what we write in these text files and
> comments. It only matters what deployed userspace actually does.

I think we're losing sight of my concern in designing for QEMU.  The
document included a statement that migration driver writers could rely
on userspace performing a device reset prior to entering the RESUMING
device_state because of an unfounded correlation that QEMU resets the
VM on the way to loading device state.  Now, if we say QEMU does this
thing and we need to support that usage model, I'm 100% on board.  If
we turn it around and say QEMU does this thing therefore migration
drivers can expect exactly this usage model, full stop, that's the
wrong direction.  That is what I'm trying to avoid.
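
To make the distinction concrete, here is a minimal sketch of a driver
that supports the "userspace already reset the device" usage model
without depending on it.  Everything in it (struct mig_dev, the clean
flag, mig_dev_set_state()) is a hypothetical simplification for
illustration, not the actual vfio uAPI or any existing driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device states; not the real vfio encoding. */
enum mig_state { MIG_RUNNING, MIG_STOP, MIG_RESUMING };

struct mig_dev {
    enum mig_state state;
    bool clean;   /* true if the device has not run since its last reset */
    int resets;   /* resets performed by the driver itself */
};

static void mig_dev_reset(struct mig_dev *d)
{
    d->clean = true;
    d->resets++;
}

/*
 * The driver does not assume userspace reset the device before
 * RESUMING.  If userspace did (as QEMU happens to), nothing extra is
 * done; if it did not, the driver brings the device to a known state.
 */
static int mig_dev_set_state(struct mig_dev *d, enum mig_state next)
{
    assert(d != NULL);
    if (next == MIG_RESUMING && !d->clean)
        mig_dev_reset(d);
    if (next == MIG_RUNNING)
        d->clean = false;   /* device accumulates state once it runs */
    d->state = next;
    return 0;
}
```

A userspace that resets first pays no additional cost, and one that
skips the redundant reset still gets a correctly initialized device.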

The obvious way to remove any question of breaking userspace is to
simply rev the migration region sub-type.  The kernel stops exposing any
v1 sub-types, so we don't break any userspace; userspace needs to be
updated to v2 in order to continue having any functionality.  Thanks,

Alex
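
As a rough illustration of the sub-type rev suggested above, the sketch
below models how userspace might probe for a migration region by
sub-type.  The constants, struct region_desc and
find_migration_region() are placeholders made up for this example; in
particular the v2 sub-type value is hypothetical and nothing here is
taken from the vfio headers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration only. */
#define MIG_REGION_TYPE 3u
#define MIG_SUBTYPE_V1  1u
#define MIG_SUBTYPE_V2  2u   /* hypothetical revised sub-type */

/* Simplified stand-in for a device-specific region's type info. */
struct region_desc {
    uint32_t type;
    uint32_t subtype;
};

/*
 * Return the index of the migration region with the wanted sub-type,
 * or -1 if the kernel does not expose one.
 */
static int find_migration_region(const struct region_desc *regions,
                                 size_t count, uint32_t want_subtype)
{
    assert(regions != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (regions[i].type == MIG_REGION_TYPE &&
            regions[i].subtype == want_subtype)
            return (int)i;
    }
    return -1;
}
```

With a kernel that only exposes the revised sub-type, a v1-only
userspace finds no migration region and simply reports migration as
unsupported, while an updated userspace finds the v2 region.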