On Thu, Jul 18, 2024 at 2:29 AM Steven Sistare <steven.sistare@xxxxxxxxxx> wrote:
>
> On 7/16/2024 1:30 AM, Jason Wang wrote:
> > On Mon, Jul 15, 2024 at 10:29 PM Steven Sistare
> > <steven.sistare@xxxxxxxxxx> wrote:
> >>
> >> On 7/14/2024 10:14 PM, Jason Wang wrote:
> >>> On Fri, Jul 12, 2024 at 9:19 PM Steve Sistare <steven.sistare@xxxxxxxxxx> wrote:
> >>>>
> >>>> Live update is a technique wherein an application saves its state, exec's
> >>>> to an updated version of itself, and restores its state. Clients of the
> >>>> application experience a brief suspension of service, on the order of
> >>>> 100's of milliseconds, but are otherwise unaffected.
> >>>>
> >>>> Define and implement interfaces that allow vdpa devices to be preserved
> >>>> across fork or exec, to support live update for applications such as QEMU.
> >>>> The device must be suspended during the update, but its DMA mappings are
> >>>> preserved, so the suspension is brief.
> >>>>
> >>>> The VHOST_NEW_OWNER ioctl transfers device ownership and pinned memory
> >>>> accounting from one process to another.
> >>>>
> >>>> The VHOST_BACKEND_F_NEW_OWNER backend capability indicates that
> >>>> VHOST_NEW_OWNER is supported.
> >>>>
> >>>> The VHOST_IOTLB_REMAP message type updates a DMA mapping with its userland
> >>>> address in the new process.
> >>>>
> >>>> The VHOST_BACKEND_F_IOTLB_REMAP backend capability indicates that
> >>>> VHOST_IOTLB_REMAP is supported and required. Some devices do not
> >>>> require it, because the userland address of each DMA mapping is discarded
> >>>> after being translated to a physical address.
> >>>>
> >>>> Here is a pseudo-code sequence for performing live update, based on
> >>>> suspend + reset because resume is not yet widely available. The vdpa device
> >>>> descriptor, fd, remains open across the exec.
> >>>>
> >>>>     ioctl(fd, VHOST_VDPA_SUSPEND)
> >>>>     ioctl(fd, VHOST_VDPA_SET_STATUS, 0)
> >>>
> >>> I don't understand why we need a reset after suspend, it looks to me
> >>> the previous suspend became meaningless.
> >>
> >> The suspend guarantees completion of in-progress DMA. At least, that is
> >> my interpretation of why that is done for live migration in QEMU, which
> >> also does suspend + reset + re-create. I am following the live migration
> >> model.
> >
> > Yes, but any reason we need a reset after the suspension?
>
> Probably not. I found it cleanest to call reset and let new qemu configure the
> device as it always does during startup, rather than altering those code paths
> to skip the kernel calls.

If we care about the downtime, I think avoiding a reset should be faster.

> So, consider this to be just one of several possible
> userland algorithms.
>
> - Steve

Thanks

>
> >>>>     exec
> >>>>
> >>>>     ioctl(fd, VHOST_NEW_OWNER)
> >>>>
> >>>>     issue ioctls to re-create vrings
> >>>>
> >>>>     if VHOST_BACKEND_F_IOTLB_REMAP
> >>>
> >>> So the idea is for a device that is using a virtual address, it
> >>> doesn't need VHOST_BACKEND_F_IOTLB_REMAP at all?
> >>
> >> Actually the reverse: if the device translates virtual to physical when
> >> the mappings are created, and discards the virtual, then VHOST_IOTLB_REMAP
> >> is not needed.
> >
> > Ok.
> >
> >>
> >>>>         foreach dma mapping
> >>>>             write(fd, {VHOST_IOTLB_REMAP, new_addr})
> >>>>
> >>>>     ioctl(fd, VHOST_VDPA_SET_STATUS,
> >>>>           ACKNOWLEDGE | DRIVER | FEATURES_OK | DRIVER_OK)
> >>>
> >>> From API level, this seems to be asymmetric as we have suspending but
> >>> not resuming?
> >>
> >> Again, I am just following the path taken by live migration.
> >> I will be happy to use resume when the devices and QEMU support it.
> >> The decision to use reset vs resume should not affect the definition
> >> and use of VHOST_NEW_OWNER and VHOST_IOTLB_REMAP.
> >>
> >> - Steve
> >
> > Thanks
> >
>
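
For reference, the quoted pseudo-code sequence can be sketched in C as an
ordered plan of operations. This is only an illustration of the ordering
discussed in this thread, not the implementation from the patch series:
lu_plan, struct lu_op and the LU_* tags are hypothetical names, and the plan
is built as data rather than issued against a real fd, because
VHOST_NEW_OWNER and VHOST_IOTLB_REMAP are proposed interfaces. The status bit
values are the ones defined by the virtio specification.

```c
#include <stdbool.h>
#include <stddef.h>

/* Virtio device status bits, per the virtio specification. */
#define STATUS_ACKNOWLEDGE 1u
#define STATUS_DRIVER      2u
#define STATUS_DRIVER_OK   4u
#define STATUS_FEATURES_OK 8u

/* Hypothetical step tags for illustration; these are not ioctl numbers. */
enum lu_step {
    LU_SUSPEND,          /* ioctl(fd, VHOST_VDPA_SUSPEND) */
    LU_SET_STATUS,       /* ioctl(fd, VHOST_VDPA_SET_STATUS, arg) */
    LU_EXEC,             /* exec the updated binary; fd stays open */
    LU_NEW_OWNER,        /* ioctl(fd, VHOST_NEW_OWNER), proposed */
    LU_RECREATE_VRINGS,  /* the usual startup ioctls for the vrings */
    LU_IOTLB_REMAP       /* write(fd, {VHOST_IOTLB_REMAP, arg}), proposed */
};

struct lu_op {
    enum lu_step step;
    unsigned long arg;   /* status value or new userland address */
};

/*
 * Fill ops[] with the suspend + reset sequence from the cover letter.
 * needs_remap models the VHOST_BACKEND_F_IOTLB_REMAP capability.
 * Returns the number of operations written, or 0 if ops[] is too small.
 */
size_t lu_plan(struct lu_op *ops, size_t cap, bool needs_remap,
               const unsigned long *new_addrs, size_t n_maps)
{
    size_t remaps = needs_remap ? n_maps : 0;
    size_t n = 0;

    /* Five fixed steps, the remaps, and the final status write. */
    if (cap < 6 + remaps)
        return 0;

    ops[n++] = (struct lu_op){ LU_SUSPEND, 0 };
    ops[n++] = (struct lu_op){ LU_SET_STATUS, 0 };   /* status 0 = reset */
    ops[n++] = (struct lu_op){ LU_EXEC, 0 };
    ops[n++] = (struct lu_op){ LU_NEW_OWNER, 0 };
    ops[n++] = (struct lu_op){ LU_RECREATE_VRINGS, 0 };

    /* One remap per DMA mapping, only if the backend requires it. */
    for (size_t i = 0; i < remaps; i++)
        ops[n++] = (struct lu_op){ LU_IOTLB_REMAP, new_addrs[i] };

    ops[n++] = (struct lu_op){ LU_SET_STATUS,
                               STATUS_ACKNOWLEDGE | STATUS_DRIVER |
                               STATUS_FEATURES_OK | STATUS_DRIVER_OK };
    return n;
}
```

With needs_remap false the remap loop is skipped, which matches devices that
discard the userland address once it has been translated. A variant that
avoids the reset would change this plan, but per the thread it should not
change the LU_NEW_OWNER and LU_IOTLB_REMAP steps.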