> From: Jason Gunthorpe <jgg@xxxxxxxxxx> > Sent: Tuesday, February 15, 2022 11:34 PM > > > > +#define VFIO_DEVICE_STATE_V1_STOP (0) > > > +#define VFIO_DEVICE_STATE_V1_RUNNING (1 << 0) > > > +#define VFIO_DEVICE_STATE_V1_SAVING (1 << 1) > > > +#define VFIO_DEVICE_STATE_V1_RESUMING (1 << 2) > > > +#define VFIO_DEVICE_STATE_MASK > (VFIO_DEVICE_STATE_V1_RUNNING > > > | \ > > > + VFIO_DEVICE_STATE_V1_SAVING | \ > > > + VFIO_DEVICE_STATE_V1_RESUMING) > > > > Does it make sense to also add 'V1' to MASK and also following macros > > given their names are general? > > No, the point of this exercise is to avoid trouble for qemu - the > fewest changes we can get away with the better. > > Once qemu is updated we'll delete this old stuff from the kernel. sounds good. > > > > +/* > > > + * Indicates the device can support the migration API. See enum > > > > call it V2? Not necessary to add V2 in code but worthy of a clarification > > in comment. > > We've only called it 'v2' for discussions. > > If you think it is unclear lets say 'support the migration API through > VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE' yes, that's clearer. > > > + * > > > + * STOP -> STOP_COPY > > > + * This arc begin the process of saving the device state and will return a > > > + * new data_fd. > > > + * > > > + * While in the STOP_COPY state the device has the same behavior as > STOP > > > + * with the addition that the data transfers session continues to stream > the > > > + * migration state. End of stream on the FD indicates the entire device > > > + * state has been transferred. > > > + * > > > + * The user should take steps to restrict access to vfio device regions > while > > > + * the device is in STOP_COPY or risk corruption of the device migration > > > data > > > + * stream. > > > > Restricting access has been explained in the to-STOP arcs and it is stated > > that while in STOP_COPY the device has the same behavior as STOP. So > > I think the last paragraph is possibly not required. > > It is not the same, the language in STOP is saying that the device > must tolerate external touches without breaking the kernel > > This language is saying if external touches happen then the device is > free to corrupt the migration stream. > > In both cases we expect good userspace to not have device > touches, the guidance here is for driver authors about what kind of > steps they need to take to protect against hostile userspace. fair enough. > > > > + * STOP -> RESUMING > > > + * Entering the RESUMING state starts a process of restoring the device > > > + * state and will return a new data_fd. The data stream fed into the > > > data_fd > > > + * should be taken from the data transfer output of the saving group > states > > > > No definition of 'group state' (maybe introduced in a later patch?) > > Yes, it was added in the P2P patch > > We can avoid talking about saving group here entirely, it really just > means a single FD. > > * The data stream fed into the data_fd should > * be taken from the data transfer output of a single FD during saving on a > * from a compatible device. > Yes. Thanks Kevin