On Thu, Jan 18, 2024 at 4:32 AM Steven Sistare <steven.sistare@xxxxxxxxxx> wrote:
>
> On 1/10/2024 9:55 PM, Jason Wang wrote:
> > On Thu, Jan 11, 2024 at 4:40 AM Steve Sistare <steven.sistare@xxxxxxxxxx> wrote:
> >>
> >> Live update is a technique wherein an application saves its state, exec's
> >> to an updated version of itself, and restores its state. Clients of the
> >> application experience a brief suspension of service, on the order of
> >> 100's of milliseconds, but are otherwise unaffected.
> >>
> >> Define and implement interfaces that allow vdpa devices to be preserved
> >> across fork or exec, to support live update for applications such as qemu.
> >> The device must be suspended during the update, but its dma mappings are
> >> preserved, so the suspension is brief.
> >>
> >> The VHOST_NEW_OWNER ioctl transfers device ownership and pinned memory
> >> accounting from one process to another.
> >>
> >> The VHOST_BACKEND_F_NEW_OWNER backend capability indicates that
> >> VHOST_NEW_OWNER is supported.
> >>
> >> The VHOST_IOTLB_REMAP message type updates a dma mapping with its userland
> >> address in the new process.
> >>
> >> The VHOST_BACKEND_F_IOTLB_REMAP backend capability indicates that
> >> VHOST_IOTLB_REMAP is supported and required. Some devices do not
> >> require it, because the userland address of each dma mapping is discarded
> >> after being translated to a physical address.
> >>
> >> Here is a pseudo-code sequence for performing live update, based on
> >> suspend + reset because resume is not yet available. The vdpa device
> >> descriptor, fd, remains open across the exec.
> >>
> >>   ioctl(fd, VHOST_VDPA_SUSPEND)
> >>   ioctl(fd, VHOST_VDPA_SET_STATUS, 0)
> >>   exec
> >
> > Is there a userspace implementation as a reference?
>
> I have working patches for qemu that use these ioctl's, but they depend on other
> qemu cpr patches that are a work in progress, and not posted yet. I'm working on
> that.

Ok.
>
> >>   ioctl(fd, VHOST_NEW_OWNER)
> >>
> >>   issue ioctls to re-create vrings
> >>
> >>   if VHOST_BACKEND_F_IOTLB_REMAP
> >>     foreach dma mapping
> >>       write(fd, {VHOST_IOTLB_REMAP, new_addr})
> >
> > I think I need to understand the advantages of this approach. For
> > example, why it is better than
> >
> >   ioctl(VHOST_RESET_OWNER)
> >   exec
> >
> >   ioctl(VHOST_SET_OWNER)
> >
> >   for each dma mapping
> >     ioctl(VHOST_IOTLB_UPDATE)
>
> That is slower. VHOST_RESET_OWNER unbinds physical pages, and VHOST_IOTLB_UPDATE
> rebinds them. It costs multiple seconds for large memories, and is incurred during the
> virtual machine's pause time during live update. For comparison, the total pause time
> for live update with vfio interfaces is ~100 millis.
>
> However, the interaction with userland is so similar that the same code paths can be used.
> In my qemu prototype, after cpr exec's new qemu:
>   - vhost_vdpa_set_owner() calls VHOST_NEW_OWNER instead of VHOST_SET_OWNER
>   - vhost_vdpa_dma_map() sets type VHOST_IOTLB_REMAP instead of VHOST_IOTLB_UPDATE
>
> - Steve
>

Ok, let's document this in the changelog.

Thanks
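
For the changelog, a toy model may help show why the two sequences differ in
cost. The sketch below is not kernel or qemu code and issues no real ioctls.
Every toy_* name is hypothetical, and deriving the new userland address as
new_base + iova is an arbitrary choice for the demo. The only behavior modeled
is what is stated above: VHOST_RESET_OWNER unbinds pages, VHOST_IOTLB_UPDATE
rebinds them, and VHOST_NEW_OWNER plus VHOST_IOTLB_REMAP leave them bound.

```c
/* Toy model only: counts page pins, touches no real vhost/vdpa device. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_MAPS 8

struct toy_map {
    uint64_t iova;    /* device-visible address */
    uint64_t uaddr;   /* userland address in the owning process */
    uint64_t npages;  /* pages covered by this mapping */
};

struct toy_dev {
    int owner;                          /* owning process id, 0 = none */
    size_t nmaps;
    struct toy_map maps[TOY_MAX_MAPS];
    uint64_t pin_ops;                   /* total pages pinned so far */
    uint64_t unpin_ops;                 /* total pages unpinned so far */
};

/* Models VHOST_SET_OWNER. */
static void toy_set_owner(struct toy_dev *d, int pid) { d->owner = pid; }

/* Models VHOST_RESET_OWNER: every mapping is dropped and its pages unpinned. */
static void toy_reset_owner(struct toy_dev *d)
{
    for (size_t i = 0; i < d->nmaps; i++)
        d->unpin_ops += d->maps[i].npages;
    d->nmaps = 0;
    d->owner = 0;
}

/* Models VHOST_IOTLB_UPDATE: create a mapping and pin its pages. */
static void toy_iotlb_update(struct toy_dev *d, uint64_t iova,
                             uint64_t uaddr, uint64_t npages)
{
    assert(d->nmaps < TOY_MAX_MAPS);
    d->maps[d->nmaps++] = (struct toy_map){ iova, uaddr, npages };
    d->pin_ops += npages;
}

/* Models VHOST_NEW_OWNER: ownership moves, mappings and pins are kept. */
static void toy_new_owner(struct toy_dev *d, int pid) { d->owner = pid; }

/* Models VHOST_IOTLB_REMAP: only the userland address changes, no pinning. */
static int toy_iotlb_remap(struct toy_dev *d, uint64_t iova, uint64_t uaddr)
{
    for (size_t i = 0; i < d->nmaps; i++) {
        if (d->maps[i].iova == iova) {
            d->maps[i].uaddr = uaddr;
            return 0;
        }
    }
    return -1;  /* no mapping at this iova */
}

/* RESET_OWNER, exec, SET_OWNER, IOTLB_UPDATE; returns pages pinned. */
static uint64_t toy_update_via_reset(struct toy_dev *d, int new_pid,
                                     uint64_t new_base)
{
    struct toy_map saved[TOY_MAX_MAPS];
    size_t n = d->nmaps;
    uint64_t before = d->pin_ops;

    memcpy(saved, d->maps, sizeof(saved));  /* userland knows its mappings */
    toy_reset_owner(d);
    /* exec would happen here */
    toy_set_owner(d, new_pid);
    for (size_t i = 0; i < n; i++)
        toy_iotlb_update(d, saved[i].iova, new_base + saved[i].iova,
                         saved[i].npages);
    return d->pin_ops - before;
}

/* exec, NEW_OWNER, IOTLB_REMAP; returns pages pinned. */
static uint64_t toy_update_via_new_owner(struct toy_dev *d, int new_pid,
                                         uint64_t new_base)
{
    uint64_t before = d->pin_ops;

    /* exec would happen here; the device fd stays open */
    toy_new_owner(d, new_pid);
    for (size_t i = 0; i < d->nmaps; i++)
        toy_iotlb_remap(d, d->maps[i].iova, new_base + d->maps[i].iova);
    return d->pin_ops - before;
}
```

With two mappings of 1024 and 512 pages, toy_update_via_reset() reports 1536
pages pinned during the update, while toy_update_via_new_owner() reports 0.
In both cases the device ends up owned by the new process with the same two
mappings, which matches the point that the same userland code paths can be
reused.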