Re: [RFC PATCH] vfio: Update/Clarify migration uAPI, add NDMA state

On Tue, Jan 11, 2022 at 03:14:04AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Tuesday, January 11, 2022 2:12 AM
> > 
> > On Mon, Jan 10, 2022 at 07:55:16AM +0000, Tian, Kevin wrote:
> > 
> > > > > {SAVING} -> {RESUMING}
> > > > > 	If not supported, user can achieve this via:
> > > > > 		{SAVING}->{RUNNING}->{RESUMING}
> > > > > 		{SAVING}-RESET->{RUNNING}->{RESUMING}
> > > >
> > > > This can be:
> > > >
> > > > SAVING -> STOP -> RESUMING
> > >
> > > From Alex's original description the default device state is RUNNING.
> > > This is supposed to be the initial state on the dest machine for the
> > > device assigned to Qemu before Qemu resumes the device state.
> > > Then how do we eliminate the RUNNING state in the above flow? Who
> > > makes STOP the initial state on the dest node?
> > 
> > All of this notation should be read with the idea that the
> > device_state is already somehow moved away from RESET. Ie the above
> > notation is about what is possible once qemu has already moved the
> > device to SAVING.
> 
> Qemu moves the device to SAVING on the src node.
> 
> On the dest the device is in RUNNING (after reset) which can be directly
> transitioned to RESUMING. I didn't see the point of adding a STOP here.

Alex is talking about the same-node case, where qemu has put the device
into SAVING and then, for whatever reason, decides it now wants the
device to be in RESUMING.

We are talking about the state space of commands the driver has to
process here. If we can break down things like SAVING -> RESUMING into
two commands:

 SAVING -> STOP
 STOP -> RESUMING

Then the driver has to implement fewer arcs, and the arcs it does
implement are much simpler.

It also resolves the precedence question nicely: we have a core FSM
that is built on the arcs the drivers implement, and that in turn gives
a natural answer to the question of how to transition between any two
states.
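
To make the core FSM idea concrete, here is a rough sketch (in Python
rather than kernel C, purely for illustration). The arc table and the
find_path() helper are hypothetical, not the actual vfio uAPI or any
driver's table; the point is only that a shortest-path search over the
driver's arcs answers "how do I get from A to B" for any pair:

```python
from collections import deque

# Hypothetical arc table: the only single-step transitions the driver
# implements. This is an assumption for illustration, not the vfio table.
DRIVER_ARCS = {
    "RUNNING": ["SAVING", "STOP"],
    "SAVING": ["STOP"],
    "STOP": ["RUNNING", "RESUMING"],
    "RESUMING": ["STOP"],
}

def find_path(src, dst):
    """Breadth-first search for the shortest chain of driver arcs."""
    prev = {src: None}          # state -> state we reached it from
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        if cur == dst:
            # Walk the back-pointers to rebuild the path, then reverse it.
            path = []
            while cur is not None:
                path.append(cur)
                cur = prev[cur]
            return path[::-1]
        for nxt in DRIVER_ARCS.get(cur, []):
            if nxt not in prev:
                prev[nxt] = cur
                queue.append(nxt)
    return None                 # no chain of arcs reaches dst

print(find_path("SAVING", "RESUMING"))
```

With this table the SAVING -> RESUMING request comes out as
SAVING -> STOP -> RESUMING, so the driver never sees the compound arc.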

Eg using the state names I gave earlier we can look at going from
RESUMING -> PRE_COPY_NDMA and decomposing it into these four steps:

  RESUMING -> STOP -> RUNNING -> PRE_COPY -> PRE_COPY_NDMA

In the end the driver needs to implement only about half of the total
arcs and the ones it does need to implement are simpler and have a
more obvious implementation.
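
As a toy illustration of the arc counting (again Python, again only a
sketch): assume the five states above sit on a simple chain and the
driver implements just the arcs between neighbours, in both directions.
That layout, and the decompose() helper, are assumptions made for the
example and not the real state table:

```python
# Hypothetical layout: states form a chain, and the driver implements
# only the arcs between neighbouring states (in both directions).
CHAIN = ["RESUMING", "STOP", "RUNNING", "PRE_COPY", "PRE_COPY_NDMA"]

def decompose(src, dst):
    """Split src -> dst into the single-arc steps the driver would see."""
    i, j = CHAIN.index(src), CHAIN.index(dst)
    step = 1 if j > i else -1
    return [(CHAIN[k], CHAIN[k + step]) for k in range(i, j, step)]

steps = decompose("RESUMING", "PRE_COPY_NDMA")
for cur, nxt in steps:
    print(f"{cur} -> {nxt}")

# Arcs the driver implements vs. every ordered pair of distinct states.
driver_arcs = 2 * (len(CHAIN) - 1)
all_pairs = len(CHAIN) * (len(CHAIN) - 1)
print(f"driver implements {driver_arcs} of {all_pairs} possible arcs")
```

The exact ratio depends entirely on the real arc table; the sketch only
shows that every other pair falls out of composing the simple arcs.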

> Later when supporting hw mdev (with pasid granular isolation in
> iommu), this restriction can be lifted as it doesn't use dma api
> and is pretty much like a pdev with regard to ioas management.

When I say 'mdev' I really mean things that use the vfio pinning
interface, which we don't quite have a proper name for yet (though
'emulated iommu' is perhaps sticking).

Things that use iommu_domain would not be a problem.

Jason


