On Tue, 26 Oct 2021 09:13:53 -0300
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Tue, Oct 26, 2021 at 09:40:34AM +0100, Dr. David Alan Gilbert wrote:
> > * Jason Gunthorpe (jgg@xxxxxxxxxx) wrote:
> > > On Mon, Oct 25, 2021 at 07:47:29PM +0100, Dr. David Alan Gilbert wrote:
> > >
> > > > It may need some further refinement; for example in that quiesed state
> > > > do counters still tick? will a NIC still respond to packets that don't
> > > > get forwarded to the host?
> > >
> > > At least for the mlx5 NIC the two states are 'able to issue outbound
> > > DMA' and 'all internal memories and state are frozen and unchanging'.
> >
> > Yeh, so my point was just that if you're adding a new state to this
> > process, you need to define the details like that.
>
> We are not planning to propose any patches/uAPI specification for this
> problem until after the mlx5 vfio driver is merged..

I'm not super comfortable with that.  If we're expecting to add a new
bit to define a quiescent state prior to clearing the running flag, and
this is an optional device feature that userspace migration needs to be
aware of, and it's really not clear from the hypervisor's side when p2p
DMA might be in use, then I think that leaves userspace in a pickle as
to how and when it would impose restrictions on assignment with
multiple assigned devices.  It's likely that the majority of initial
use cases wouldn't need this feature, which would make it difficult to
arbitrarily impose later.

OTOH, if we define !_RUNNING as quiescent, and userspace reading
pending_bytes as the point by which the user is responsible for
quiescing all devices and at which the device state becomes stable (or
drivers can generate errors during collection of device state if that
proves otherwise), then I think existing userspace doesn't care about
this issue.  Thanks,

Alex
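
For illustration only, the ordering in the second option can be sketched as a toy model. This is not kernel or QEMU code: `ToyDevice`, `STATE_RUNNING`, `dma_to()`, `stop_and_collect()` and `DeviceNotQuiescedError` are hypothetical names made up for the sketch, and the returned byte count is arbitrary. It assumes that a device with the running bit cleared issues no outbound DMA but can still be written by a peer that is still running, and that reading pending_bytes marks the point after which the state must not change.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical stand-in for the "running" bit of the device state.
STATE_RUNNING = 1 << 0


class DeviceNotQuiescedError(RuntimeError):
    """State changed after pending_bytes was read (the driver-error case)."""


@dataclass
class ToyDevice:
    name: str
    device_state: int = STATE_RUNNING
    internal_state: int = 0
    _snapshot: Optional[int] = None

    def running(self) -> bool:
        return bool(self.device_state & STATE_RUNNING)

    def clear_running(self) -> None:
        # !_RUNNING is treated as quiescent: no more outbound DMA.
        self.device_state &= ~STATE_RUNNING

    def dma_to(self, peer: "ToyDevice", value: int) -> bool:
        # Only a running device may issue a p2p write to a peer.
        if not self.running():
            return False
        peer.internal_state = value
        return True

    def read_pending_bytes(self) -> int:
        # From this point on the user has promised that all devices are
        # quiesced, so the state is recorded as the stable reference.
        self._snapshot = self.internal_state
        return 8  # arbitrary size for the toy model

    def collect_state(self) -> int:
        if self._snapshot is None:
            raise RuntimeError("pending_bytes was never read")
        if self.internal_state != self._snapshot:
            raise DeviceNotQuiescedError(self.name)
        return self._snapshot


def stop_and_collect(devices: List[ToyDevice]) -> Dict[str, int]:
    # Step 1: quiesce every device before touching any pending_bytes.
    for dev in devices:
        dev.clear_running()
    # Step 2: reading pending_bytes marks the state as stable.
    for dev in devices:
        dev.read_pending_bytes()
    # Step 3: collect; a driver may still raise if the promise was broken.
    return {dev.name: dev.collect_state() for dev in devices}
```

In this model, a caller that clears the running bit on every device before reading any pending_bytes never sees the error, which is the sense in which existing userspace would not need to care. A caller that reads pending_bytes on one device while a peer is still running can hit `DeviceNotQuiescedError` during collection.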