> -----Original Message----- > From: Max Gurtovoy [mailto:mgurtovoy@xxxxxxxxxx] > Sent: 23 September 2021 14:56 > To: Leon Romanovsky <leon@xxxxxxxxxx>; Shameerali Kolothum Thodi > <shameerali.kolothum.thodi@xxxxxxxxxx> > Cc: Doug Ledford <dledford@xxxxxxxxxx>; Jason Gunthorpe > <jgg@xxxxxxxxxx>; Yishai Hadas <yishaih@xxxxxxxxxx>; Alex Williamson > <alex.williamson@xxxxxxxxxx>; Bjorn Helgaas <bhelgaas@xxxxxxxxxx>; David > S. Miller <davem@xxxxxxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>; Kirti > Wankhede <kwankhede@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx; > linux-kernel@xxxxxxxxxxxxxxx; linux-pci@xxxxxxxxxxxxxxx; > linux-rdma@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; Saeed Mahameed > <saeedm@xxxxxxxxxx> > Subject: Re: [PATCH mlx5-next 2/7] vfio: Add an API to check migration state > transition validity > > > On 9/23/2021 2:17 PM, Leon Romanovsky wrote: > > On Thu, Sep 23, 2021 at 10:33:10AM +0000, Shameerali Kolothum Thodi > wrote: > >> > >>> -----Original Message----- > >>> From: Leon Romanovsky [mailto:leon@xxxxxxxxxx] > >>> Sent: 22 September 2021 11:39 > >>> To: Doug Ledford <dledford@xxxxxxxxxx>; Jason Gunthorpe > <jgg@xxxxxxxxxx> > >>> Cc: Yishai Hadas <yishaih@xxxxxxxxxx>; Alex Williamson > >>> <alex.williamson@xxxxxxxxxx>; Bjorn Helgaas <bhelgaas@xxxxxxxxxx>; > David > >>> S. Miller <davem@xxxxxxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>; Kirti > >>> Wankhede <kwankhede@xxxxxxxxxx>; kvm@xxxxxxxxxxxxxxx; > >>> linux-kernel@xxxxxxxxxxxxxxx; linux-pci@xxxxxxxxxxxxxxx; > >>> linux-rdma@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; Saeed Mahameed > >>> <saeedm@xxxxxxxxxx> > >>> Subject: [PATCH mlx5-next 2/7] vfio: Add an API to check migration state > >>> transition validity > >>> > >>> From: Yishai Hadas <yishaih@xxxxxxxxxx> > >>> > >>> Add an API in the core layer to check migration state transition validity > >>> as part of a migration flow. > >>> > >>> The valid transitions follow the expected usage as described in > >>> uapi/vfio.h and triggered by QEMU. > >>> > >>> This ensures that all migration implementations follow a consistent > >>> migration state machine. > >>> > >>> Signed-off-by: Yishai Hadas <yishaih@xxxxxxxxxx> > >>> Reviewed-by: Kirti Wankhede <kwankhede@xxxxxxxxxx> > >>> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx> > >>> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxx> > >>> --- > >>> drivers/vfio/vfio.c | 41 > +++++++++++++++++++++++++++++++++++++++++ > >>> include/linux/vfio.h | 1 + > >>> 2 files changed, 42 insertions(+) > >>> > >>> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > >>> index 3c034fe14ccb..c3ca33e513c8 100644 > >>> --- a/drivers/vfio/vfio.c > >>> +++ b/drivers/vfio/vfio.c > >>> @@ -1664,6 +1664,47 @@ static int vfio_device_fops_release(struct > inode > >>> *inode, struct file *filep) > >>> return 0; > >>> } > >>> > >>> +/** > >>> + * vfio_change_migration_state_allowed - Checks whether a migration > state > >>> + * transition is valid. > >>> + * @new_state: The new state to move to. > >>> + * @old_state: The old state. > >>> + * Return: true if the transition is valid. > >>> + */ > >>> +bool vfio_change_migration_state_allowed(u32 new_state, u32 > old_state) > >>> +{ > >>> + enum { MAX_STATE = VFIO_DEVICE_STATE_RESUMING }; > >>> + static const u8 vfio_from_state_table[MAX_STATE + 1][MAX_STATE + > 1] = { > >>> + [VFIO_DEVICE_STATE_STOP] = { > >>> + [VFIO_DEVICE_STATE_RUNNING] = 1, > >>> + [VFIO_DEVICE_STATE_RESUMING] = 1, > >>> + }, > >>> + [VFIO_DEVICE_STATE_RUNNING] = { > >>> + [VFIO_DEVICE_STATE_STOP] = 1, > >>> + [VFIO_DEVICE_STATE_SAVING] = 1, > >>> + [VFIO_DEVICE_STATE_SAVING | > VFIO_DEVICE_STATE_RUNNING] > >>> = 1, > >> Do we need to allow _RESUMING state here or not? As per the "State > transitions" > >> section from uapi/linux/vfio.h, > > It looks like we missed this state transition. > > > > Thanks > > I'm not sure this state transition is valid. > > Kirti, When we would like to move from RUNNING to RESUMING ? I guess it depends on what you report as your dev default state. For HiSilicon ACC migration driver, we set the default to _RUNNING. And when the migration starts, the destination side Qemu, set the device state to _RESUMING(vfio_load_state()). >From the documentation, it looks like the assumption on default state of the VFIO dev is _RUNNING. " * 001b => Device running, which is the default state " > > Sameerali, can you please re-test and update if you see this transition ? Yes. And if I change the default state to _STOP, then the transition is from _STOP --> _RESUMING. But the documentation on State transitions doesn't have _STOP --> _RESUMING transition as valid. Thanks, Shameer > > > > > >> " * 4. To start the resuming phase, the device state should be transitioned > from > >> * the _RUNNING to the _RESUMING state." > >> > >> IIRC, I have seen that transition happening on the destination dev while > testing the > >> HiSilicon ACC dev migration. > >> > >> Thanks, > >> Shameer > >> > >>> + }, > >>> + [VFIO_DEVICE_STATE_SAVING] = { > >>> + [VFIO_DEVICE_STATE_STOP] = 1, > >>> + [VFIO_DEVICE_STATE_RUNNING] = 1, > >>> + }, > >>> + [VFIO_DEVICE_STATE_SAVING | VFIO_DEVICE_STATE_RUNNING] > = { > >>> + [VFIO_DEVICE_STATE_RUNNING] = 1, > >>> + [VFIO_DEVICE_STATE_SAVING] = 1, > >>> + }, > >>> + [VFIO_DEVICE_STATE_RESUMING] = { > >>> + [VFIO_DEVICE_STATE_RUNNING] = 1, > >>> + [VFIO_DEVICE_STATE_STOP] = 1, > >>> + }, > >>> + }; > >>> + > >>> + if (new_state > MAX_STATE || old_state > MAX_STATE) > >>> + return false; > >>> + > >>> + return vfio_from_state_table[old_state][new_state]; > >>> +} > >>> +EXPORT_SYMBOL_GPL(vfio_change_migration_state_allowed); > >>> + > >>> static long vfio_device_fops_unl_ioctl(struct file *filep, > >>> unsigned int cmd, unsigned long arg) > >>> { > >>> diff --git a/include/linux/vfio.h b/include/linux/vfio.h > >>> index b53a9557884a..e65137a708f1 100644 > >>> --- a/include/linux/vfio.h > >>> +++ b/include/linux/vfio.h > >>> @@ -83,6 +83,7 @@ extern struct vfio_device > >>> *vfio_device_get_from_dev(struct device *dev); > >>> extern void vfio_device_put(struct vfio_device *device); > >>> > >>> int vfio_assign_device_set(struct vfio_device *device, void *set_id); > >>> +bool vfio_change_migration_state_allowed(u32 new_state, u32 > old_state); > >>> > >>> /* events for the backend driver notify callback */ > >>> enum vfio_iommu_notify_type { > >>> -- > >>> 2.31.1