RE: [PATCH V8 mlx5-next 09/15] vfio: Define device migration protocol v2

> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Sent: Wednesday, February 23, 2022 9:10 AM
> > > > + * The kernel migration driver must fully transition the device to the new state
> > > > + * value before the operation returns to the user.
> > >
> > > The above statement certainly doesn't preclude asynchronous
> > > availability of data on the stream FD, but it does demand that the
> > > device state transition itself is synchronous and cannot be
> > > shortcut.  If the state transition itself exceeds migration SLAs, we're
> > > in a pickle.  Thanks,
> >
> > Even if the commands were async, it is not easy to believe a device
> > can instantaneously abort an arc when a timer hits and return to full
> > operation. For instance, mlx5 can't do this.
> >
> > The vCPU cannot be restarted to try to meet the SLA until a command
> > going back to RUNNING returns.
> >
> > If we want to have a SLA feature it feels better to pass in the
> > deadline time as part of the set state ioctl and the driver can then
> > internally do something appropriate and not have to figure out how to
> > juggle an external abort. The driver would be expected to return fully
> > completed from STOP or return back to RUNNING before the deadline.
> >
> > For instance mlx5 could possibly implement this by checking the
> > migration size and doing some maths before deciding if it should
> > commit to its unabortable device command.
> >
> > I have a feeling supporting SLA means devices are going to have to
> > report latencies for various arcs and work in a more classical
> > realtime deadline oriented way overall. Estimating the transfer
> > latency and size is another factor too.
> >
> > Overall, this SLA topic looks quite big to me, and I think a full
> > solution will come with many facets. We are also quite interested in
> > dirty rate limiting, for instance.
> 
> So if/when we were to support this, we might use a different SET_STATE
> feature ioctl that allows the user to specify a deadline and we'd use
> feature probing or a flag on the migration feature for userspace to
> discover this?  I'd be ok with that, I just want to make sure we have
> agreeable options to support it.  Thanks,
> 

Or use a different device_feature ioctl to set a deadline for each
arc before changing the device state, and then reuse the existing
SET_STATE semantics, with the migration driver doing the estimation
internally based on the pre-configured constraints...
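
A rough userspace sketch of that flow, purely for illustration: deadlines
are configured per arc up front, and the set-state path estimates the arc
latency and refuses before committing to an unabortable command. Every
name here (mig_dev, mig_set_deadline, mig_estimate_ns, mig_set_state, the
arc enum, the latency and throughput fields) is hypothetical and not part
of the VFIO uAPI or any driver; the -ETIMEDOUT return is likewise only an
assumption about how a refusal might be reported.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical arcs, not the actual VFIO device state enum. */
enum mig_arc { ARC_RUNNING_TO_STOP, ARC_STOP_TO_STOP_COPY, ARC_COUNT };

/* Hypothetical per-device bookkeeping a migration driver might keep. */
struct mig_dev {
	uint64_t deadline_ns[ARC_COUNT];      /* 0 means no deadline configured */
	uint64_t fixed_latency_ns[ARC_COUNT]; /* assumed size-independent cost */
	uint64_t size_bytes[ARC_COUNT];       /* assumed data moved on this arc */
	uint64_t bytes_per_sec;               /* assumed device throughput */
};

/* Stand-in for a "set deadline" device feature, called before SET_STATE. */
static void mig_set_deadline(struct mig_dev *d, enum mig_arc arc, uint64_t ns)
{
	assert(arc < ARC_COUNT);
	d->deadline_ns[arc] = ns;
}

/* Estimate arc latency as fixed cost plus size / throughput. */
static uint64_t mig_estimate_ns(const struct mig_dev *d, enum mig_arc arc)
{
	const uint64_t ns_per_sec = 1000000000ULL;
	uint64_t xfer_ns = 0;

	if (d->bytes_per_sec) {
		uint64_t bytes = d->size_bytes[arc];

		/* Split whole seconds from the remainder to limit overflow. */
		xfer_ns = (bytes / d->bytes_per_sec) * ns_per_sec +
			  (bytes % d->bytes_per_sec) * ns_per_sec / d->bytes_per_sec;
	}
	return d->fixed_latency_ns[arc] + xfer_ns;
}

/* Unchanged SET_STATE shape: the deadline check happens internally. */
static int mig_set_state(struct mig_dev *d, enum mig_arc arc)
{
	uint64_t deadline = d->deadline_ns[arc];

	if (deadline && mig_estimate_ns(d, arc) > deadline)
		return -ETIMEDOUT; /* refuse early, device stays in its current state */

	/* ... the unabortable device command would be issued here ... */
	return 0;
}
```

With a 1 ms fixed cost, 1 GB of state and 1 GB/s of throughput, the
estimate is about 1.001 s, so a 500 ms deadline on that arc would be
refused up front while a 2 s deadline would let the transition proceed.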

Thanks
Kevin



