On Thu, Sep 30, 2021 at 12:34:19PM +0300, Max Gurtovoy wrote:

> > When we add the migration extension this cannot change, so after
> > open_device() the device should be operational.
>
> if it's waiting for incoming migration blob, it is not running.

It cannot be waiting for a migration blob after open_device; that is
not backwards compatible. Just prior to open_device the vfio pci layer
will generate an FLR to the function, so we expect that post
open_device we have a fresh-from-reset, fully running device state.

> > The reported state in the migration region should accurately reflect
> > what the device is currently doing. If the device is operational then
> > it must report running, not stopped.
>
> STOP in migration meaning.

As Alex and I have said several times, STOP means the internal state is
not allowed to change.

> > driver will see RESUMING toggle off so it will trigger a
> > de-serialization
>
> You mean stop serialization ?

No, I mean it will take all the migration data that has been uploaded
through the migration region and de-serialize it into active device
state.

> > driver will see SAVING toggled on so it will serialize the new state
> > (either the pre-copy state or the post-copy state depending on the
> > running bit)
>
> lets leave the bits and how you implement the state numbering aside.

You've missed the point. This isn't a FSM. It is a series of three
control bits, and we have assigned logical meaning to their
combinations.

The algorithm I gave is a control centric algorithm, not a state
centric algorithm, and matches the direction Alex thought this was
being designed for.

> If you finish resuming you can move to a new state (that we should add) =>
> RESUMED.

It is not a state machine.

Once you stop pretending this is implementing a FSM, Alex's position
makes perfect sense.

Jason
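
For illustration only, the control centric handling described in this message could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the actual driver code: the bit names (MIG_*), the action flags (ACT_*), and the function mig_apply_control() are all hypothetical, and the sketch only reacts to bits that toggle between the old and new control values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the three control bits discussed in the thread. */
enum {
    MIG_RUNNING  = 1u << 0,
    MIG_SAVING   = 1u << 1,
    MIG_RESUMING = 1u << 2,
    MIG_ALL_BITS = MIG_RUNNING | MIG_SAVING | MIG_RESUMING,
};

/* Hypothetical driver actions, returned as flags so the logic is testable. */
enum {
    ACT_STOP_DEVICE        = 1u << 0, /* internal state may no longer change */
    ACT_START_DEVICE       = 1u << 1, /* device becomes operational again */
    ACT_DESERIALIZE        = 1u << 2, /* load uploaded data into device state */
    ACT_SERIALIZE_PRECOPY  = 1u << 3, /* SAVING turned on while running */
    ACT_SERIALIZE_POSTCOPY = 1u << 4, /* SAVING turned on while not running */
};

/*
 * Control centric: look at which bits toggled, not at which "state" the
 * combination names. Each toggled bit triggers its own action.
 */
uint32_t mig_apply_control(uint32_t old, uint32_t next)
{
    uint32_t changed = old ^ next;
    uint32_t actions = 0;

    /* Assumption: callers never pass bits outside the three control bits. */
    assert((old & ~(uint32_t)MIG_ALL_BITS) == 0);
    assert((next & ~(uint32_t)MIG_ALL_BITS) == 0);

    /* RUNNING toggled off: freeze the internal state. */
    if ((changed & MIG_RUNNING) && !(next & MIG_RUNNING))
        actions |= ACT_STOP_DEVICE;

    /* RESUMING toggled off: de-serialize the uploaded migration data. */
    if ((changed & MIG_RESUMING) && !(next & MIG_RESUMING))
        actions |= ACT_DESERIALIZE;

    /* SAVING toggled on: serialize, with the kind chosen by the running bit. */
    if ((changed & MIG_SAVING) && (next & MIG_SAVING))
        actions |= (next & MIG_RUNNING) ? ACT_SERIALIZE_PRECOPY
                                        : ACT_SERIALIZE_POSTCOPY;

    /* RUNNING toggled on: let the device run. */
    if ((changed & MIG_RUNNING) && (next & MIG_RUNNING))
        actions |= ACT_START_DEVICE;

    return actions;
}
```

In this sketch, clearing RESUMING and setting RUNNING in a single write yields both the de-serialize and the start actions, without any separate RESUMED state being involved.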