Re: Issue with "KVM: SEV: Add support for SEV intra host migration"

On Thu, Feb 16, 2023, Peter Gonda wrote:
> On Mon, Feb 13, 2023 at 1:07 PM Sagi Shahar <sagis@xxxxxxxxxx> wrote:
> >
> > TL;DR
> > Marking an SEV VM as dead after intra-host migration prevents cleanly tearing
> > down said VM.
> >
> > We are testing our POC code for TDX copyless migration and noticed
> > some issues. We are currently using a scheme similar to the one used
> > for SEV, where the VM is marked as dead after the migration is
> > completed, which prevents any other IOCTLs from being issued on the VM.
> >
> > From what we are seeing, there are at least 2 IOCTLs that the VMM
> > issues on the source VM after the migration is completed. The first
> > one is KVM_IOEVENTFD, for unwiring an eventfd used for the NVMe admin
> > queue during the NVMe device unplug sequence. The second IOCTL is
> > KVM_SET_USER_MEMORY_REGION, for removing the memslots during VM
> > destruction. Failing either of these IOCTLs will cause the migration
> > to fail.

Does the VMM _need_ to cleanly teardown the source VM?  If so, why?

> > I can see 3 options:
> >
> > 1) If we want to keep the vm_dead logic as is, we would need to
> > change the VMM code in some pretty hacky ways. We would need to
> > distinguish between regular VM shutdown and VM shutdown after
> > migration. We would also need to make absolutely sure that we don't
> > leave any dangling data in the kernel by skipping some of the cleanup
> > stages.
> >
> > 2) If we want to remove the vm_dead logic, we can simply not mark the
> > VM as dead after migration. It looks like it will just work, but it
> > might create special cases where IOCTLs can be called on a TD which
> > is no longer valid. From what I can tell, some of these code paths
> > are already protected by a check that an hkid is assigned, so it
> > might not be a big issue. I'm not sure how this will work for SEV,
> > but I'm guessing there's a similar mechanism there as well.
> >
> > 3) We can also go halfway and only block certain memory encryption
> > related IOCTLs if the VM got migrated. This will likely require more
> > changes when we try to push this upstream, since it will require
> > adding a new field for vm_mem_enc_dead (or something similar) in
> > addition to the current vm_bugged and vm_dead.
> >
> > Personally, I don't want to go with option (1), since it sounds quite
> > risky to make these kinds of changes without fully understanding all
> > the possible side effects.
> >
> > I prefer either option (2) or (3), but I don't know which one will be
> > more acceptable to the community.
> 
> I agree option 2 or 3 seems preferable. Option two sounds good to me; I
> am not sure why we needed to disable all ioctls on the source VM after
> the migration. I was just acting on feedback from the review.

I don't like #2.  For all intents and purposes, the source VM _is_ dead, or at
least zombified.  It _was_ an SEV guest, but after migration it's no longer an
SEV guest, e.g. doesn't have a dedicated ASID, etc.  But the CPUID state and a
pile of register state won't be coherent, especially on SEV-ES where KVM doesn't
have visibility into guest state.

> We have the ASID in SEV, similar to the HKID. I don't think the code
> paths are already protected the way you mention TDX's are, but that
> seems like a simple enough fix. Or maybe it's better to introduce a new
> VM_MOVED state, like VM_BUGGED and VM_DEAD, which allows most IOCTLs
> but just disables running vCPUs.

I kinda like the idea of a VM_MOVED flag, but I'm a bit leery of it from a
maintenance and ABI perspective.  Defining and documenting what ioctls()
are/aren't allowed would get rather messy.  The beauty of VM_DEAD is that it's
all or nothing.

> What about option 4. Remove the VM_DEAD on migration and do nothing
> else.

I'm strongly against doing nothing.  It _might_ be safe from KVM's perspective,
but I would really prefer not to have to constantly think about whether or not a
given change is safe in the context of a zombified SEV guest.

> Userspace can easily make errors which cause the VM to be unusable. Does this
> error path really need support from KVM?

As above, I'm concerned with KVM's safety, not the guest's.

Depending on why the source VM needs to be cleaned up, one thought would be to
add a dedicated ioctl(), e.g. KVM_DISMANTLE_VM, and make that the _only_ ioctl()
that's allowed to operate on a dead VM.  The ioctl() would be defined as a
best-effort mechanism to teardown/free internal state, e.g. destroy KVM_PIO_BUS,
KVM_MMIO_BUS, and memslots, zap all SPTEs, etc...


