On Thu, Feb 16, 2023 at 11:00 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> On Thu, Feb 16, 2023, Paolo Bonzini wrote:
> > On 2/16/23 19:18, Sean Christopherson wrote:
> > > Depending on why the source VM needs to be cleaned up, one thought would be add
> > > a dedicated ioctl(), e.g. KVM_DISMANTLE_VM, and make that the _only_ ioctl() that's
> > > allowed to operate on a dead VM.  The ioctl() would be defined as a best-effort
> > > mechanism to teardown/free internal state, e.g. destroy KVM_PIO_BUS, KVM_MMIO_BUS,
> > > and memslots, zap all SPTEs, etc...
> >
> > If we have to write the code we might as well do it directly at context-move
> > time, couldn't we?  I like the idea of minimizing the memory cost of the
> > zombie VM.
>
> I thought about that too, but I assume the teardown latency would be non-trivial,
> especially if KVM aggressively purges state.  The VMM can't resume the VM until
> it knows the migration has completed, so immediately doing the cleanup would impact
> the VM's blackout time.
>
> My other thought was to automatically do the cleanup, but to do so asynchronously.
> I actually don't know why I discarded that idea.  I think I got distracted and
> forgot to circle back.  That might be something we could do for any and all
> bugged/dead VMs.

Based on my second experiment, where I always return success on ioctls after the
VM is marked dead, my guess is that the VMM doesn't HAVE to clean up the state
using those ioctls.  Our VMM currently cleans everything up automatically during
the VM's shutdown process.  Changing this behavior might be a bit tricky, but if
it's safer than allowing cleanup ioctls in KVM, we can see if we can change it.

I like Paul's suggestion of ignoring ioctl errors in the VMM if the VM was
migrated.  That might be enough to clean up the source VM's state to the point
that we can safely release it.