RE: [RFC PATCH] vfio: Update/Clarify migration uAPI, add NDMA state

"Tian, Kevin" <kevin.tian@xxxxxxxxx> · Mon, 10 Jan 2022 03:14:44 +0000

> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Saturday, January 8, 2022 1:23 AM
> 
> On Fri, Jan 07, 2022 at 02:01:55AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > Sent: Friday, January 7, 2022 8:30 AM
> > >
> > > On Fri, Jan 07, 2022 at 12:00:13AM +0000, Tian, Kevin wrote:
> > > > > Devices that are poorly designed here will have very long migration
> > > > > downtime latencies and people simply won't want to use them.
> > > >
> > > > Different usages have different latency requirement. Do we just want
> > > > people to decide whether to manage state for a device by
> > > > measurement?
> > >
> > > It doesn't seem unreasonable to allow userspace to set max timer for
> > > NDMA for SLA purposes on devices that have unbounded NDMA times. It
> > > would probably be some new optional ioctl for devices that can
> > > implement it.
> >
> > Yes, that's my point.
> >
> > >
> > > However, this basically gives up on the idea that a VM can be migrated
> > > as any migration can timeout and fail under this philosophy. I think
> > > that is still very poor.
> > >
> > > Optional migration really can't be sane path forward.
> > >
> >
> > How is it different from the scenario where the guest generates a very
> > high dirty rate so the precopy phase can never converge to a pre-defined
> > threshold then abort the migration after certain timeout?
> 
> The hypervisor can halt the VCPU and put a stop to this and complete
> the migration.
> 
> There is a difference between optional migration under a SLA and
> mandatory migration with no SLA - I think both must be supported to be
> sane.
> 
> > IMHO live migration is always a try-and-fail flavor. A previous migration
> > failure doesn't prevent the orchestration stack to retry at a later point.
> 
> An operator might need to emergency migrate a VM without the
> possibility for failure. For instance there is something wrong with
> the base HW. SLA ignored, migration must be done.

How is it done today, when no assigned device supports migration?
If those constraints are tolerable today, I don't see why supporting only
optional migration cannot be accepted: it removes some constraints while
still bearing a subset of them in those deployments. This can be seen as
an intermediate step on the path toward the perfect world where both
optional and mandatory migration are supported.

> 
> IMHO it is completely wrong to view migration as optional, that is a
> terrible standard to design HW to.
> 

I don't want to argue about 'wrong' or 'terrible' here, since different
people certainly have different views based on their own usages.

But based on the whole discussion I hope we are aligned on:

- It's necessary to support existing HW even though it may only support
  optional migration due to the unbounded time needed to stop DMA;

- We should influence IP designers to design HW that allows preempting
  in-flight requests and stopping DMA quickly (also implying the capability
  of aborting/resuming in-flight PRI requests);

- Specific to the device state management uAPI, it should not assume
  a specific usage and should instead allow the user to set a timeout
  value, so that the transition to NDMA fails if the operation cannot be
  completed within the specified timeout. If the user doesn't set one,
  the migration driver could conservatively use a default timeout value
  to gate any potentially unbounded operation.
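
To make the last point concrete, here is a rough userspace model of the
timeout semantics I have in mind. It is only a sketch, not a uAPI proposal:
the names (try_enter_ndma, effective_timeout_ms) and the default value are
hypothetical and do not correspond to any existing ioctl or driver.

```python
from typing import Optional

# Hypothetical conservative default chosen by the migration driver.
# The value is illustrative only, not taken from any real driver.
DEFAULT_NDMA_TIMEOUT_MS = 1000


def effective_timeout_ms(user_timeout_ms: Optional[int]) -> int:
    """Use the user's timeout if one was set, else the driver default."""
    if user_timeout_ms is None:
        return DEFAULT_NDMA_TIMEOUT_MS
    return user_timeout_ms


def try_enter_ndma(quiesce_time_ms: Optional[int],
                   user_timeout_ms: Optional[int] = None) -> bool:
    """Model the transition to NDMA under a timeout.

    quiesce_time_ms is how long the device needs to stop DMA; None models
    a device whose stop time is unbounded. Returns True if the transition
    completes within the timeout and False if it fails.
    """
    limit = effective_timeout_ms(user_timeout_ms)
    if quiesce_time_ms is None:
        # An unbounded operation can never finish inside a finite limit.
        return False
    return quiesce_time_ms <= limit
```

With this model, a user with a strict SLA passes a small timeout and accepts
that migration may fail, while a user who never sets one is still protected
from an unbounded wait by the driver default.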

Thanks
Kevin