> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Wednesday, February 16, 2022 8:14 PM
>
> On Wed, Feb 16, 2022 at 03:17:36AM +0000, Tian, Kevin wrote:
>
> > those requests don't rely on vCPUs but still take time to complete
> > (thus may break SLA) and are invisible to migration driver (directly
> > submitted by the guest thus cannot be estimated). So the only means
> > is for user to wait on a fd with a timeout (based on whatever SLA) and
> > if expires then aborts migration (may retry later).
>
> I think I explained in my other email how this can be implemented
> today with v2 for STOP_COPY without an event fd.
>

I suppose you meant this part:

"It allows RUNNING -> STOP_COPY to be made async because the driver can
return SET_STATE immediately, background the state save and indicate
completion/progress/error via poll(readable) on the data_fd."

Yes, it could work if the user directly requests STOP_COPY as the end
state (with STOP as an implicit/immediate step). In that case polling
on data_fd with a timeout can cover the requirement described for STOP
here.

Thanks
Kevin
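
For illustration, the userspace side of "polling on data_fd with timeout"
could look roughly like the sketch below. It is only a sketch under
assumptions: wait_data_fd() is a hypothetical helper name, the timeout
stands for whatever SLA budget the user has, and mapping a non-readable
wakeup to -EIO is an assumption, since how the driver reports errors on
the data_fd is not spelled out in this thread.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the VFIO uAPI): block until data_fd
 * becomes readable or timeout_ms elapses.
 *
 * Returns 0 when the fd is readable, -ETIMEDOUT when the timeout expires
 * (the caller would then abort the migration and may retry later),
 * -EIO when the fd wakes up without readable data (assumed error case),
 * or -errno when poll() itself fails.
 */
int wait_data_fd(int data_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = data_fd, .events = POLLIN };
    int ret;

    /* Retry when interrupted by a signal; this restarts the full timeout. */
    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret < 0)
        return -errno;
    if (ret == 0)
        return -ETIMEDOUT;
    if (pfd.revents & POLLIN)
        return 0;
    return -EIO; /* POLLERR/POLLHUP/POLLNVAL without readable data */
}
```

The user would request STOP_COPY, call wait_data_fd() with the SLA-derived
timeout, and treat -ETIMEDOUT as the signal to abort the migration.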