Re: [RFC PATCH v6 00/36] KVM: x86: eVMCS rework

Sean Christopherson <seanjc@xxxxxxxxxx> writes:

> On Thu, Aug 25, 2022, Vitaly Kuznetsov wrote:
>> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
>> 
>> > This is what I ended up with as a way to dig ourselves out of the eVMCS
>> > conundrum.  Not well tested, though KUT and selftests pass.  The enforcement
>> > added by "KVM: nVMX: Enforce unsupported eVMCS in VMX MSRs for host accesses"
>> > is not tested at all (and lacks a changelog).
>> 
>> Trying to enable KVM_CAP_HYPERV_ENLIGHTENED_VMCS2 in its new shape in
>> QEMU so I can test it and I immediately stumble upon
>> 
>> ~/qemu/build/qemu-system-x86_64 -machine q35,accel=kvm,kernel-irqchip=split -cpu host,hv-evmcs-2022,hv-evmcs,hv-vpindex,hv-vapic 
>> qemu-system-x86_64: error: failed to set MSR 0x48d to 0xff00000016
>> qemu-system-x86_64: ../target/i386/kvm/kvm.c:3107: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
>> 
>> Turns out, at least with "-cpu host" QEMU reads VMX feature MSRs first
>> and enables eVMCS after.
>
> Heh, of course there had to be a corner case.
>

Unfortunately, it's not a corner case: named CPU models in QEMU behave
exactly the same (I just forgot to add '+vmx' yesterday). In fact,
it seems QEMU uses the system-wide KVM_GET_MSRS (which ends up in
vmx_get_msr_feature() in our case), which returns unfiltered values. As
it is system-wide, it simply can't filter anything. This happens even
before KVM_CREATE_VCPU is called, so switching to a per-vCPU ioctl is
not an option. What's worse, all the discovered features (including VMX
features) are passed to the upper layers of the virtualization stack,
starting with libvirt, and those layers may want to enable some of the
"available" features explicitly. Teaching everyone what's available
with eVMCS and what's not seems to be a hard task.
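
To make the mismatch concrete, here is a minimal sketch of what
filtering would do to the value from the failed write above. It assumes
that the high 32 bits of a VMX control MSR are the allowed-1 settings,
and that eVMCS v1 lacks the posted-interrupt and preemption-timer pin
controls (bits 7 and 6). The helper name is made up for illustration and
is not the code from the series.

```c
#include <stdint.h>

/* Pin-based control bits assumed to be unavailable with eVMCS v1. */
#define PIN_BASED_VMX_PREEMPTION_TIMER	0x00000040u
#define PIN_BASED_POSTED_INTR		0x00000080u
#define EVMCS1_UNSUPPORTED_PINCTRL \
	(PIN_BASED_POSTED_INTR | PIN_BASED_VMX_PREEMPTION_TIMER)

/*
 * Hypothetical helper: clear unsupported bits from the allowed-1 half
 * (bits 63:32) of a VMX control MSR, leaving the allowed-0 half
 * (bits 31:0) untouched.
 */
static uint64_t evmcs_filter_ctrl_msr(uint64_t msr_val, uint32_t unsupported)
{
	uint32_t allowed0 = (uint32_t)msr_val;
	uint32_t allowed1 = (uint32_t)(msr_val >> 32);

	allowed1 &= ~unsupported;
	return ((uint64_t)allowed1 << 32) | allowed0;
}
```

Under these assumptions, 0xff00000016 for MSR 0x48d filters down to
0x3f00000016, which would explain why writing back the unfiltered
system-wide value is rejected once eVMCS is enabled.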

This use-case can probably be solved by making eVMCS enablement a per-VM
thing (I already did this locally) and creating a per-VM version of
KVM_GET_MSRS which will give us filtered VMX MSRs when eVMCS is
enabled.

Note: silently filtering out features when vCPUs are created is bad as
the list of such features will change over time. This is guaranteed to
break migrations.

Honestly, I'm starting to think the 'evmcs revisions' idea (keeping
the exact list of features in KVM and updating it every couple of years
when a new Hyper-V is released) is easier. It's just a list, it doesn't
require much. The main downside, as was already mentioned, is that the
userspace VMM doesn't see which VMX features are actually passed to the
guest unless it is also taught about these "evmcs revisions" (beyond
just the latest number available). This can, to a certain extent,
probably be solved by the VMM itself doing KVM_GET_MSRS after the vCPU
is created (this won't help much with feature discovery by upper
layers, though). This, however, is a new use-case, unsupported with the
current KVM_CAP_HYPERV_ENLIGHTENED_VMCS implementation.
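
To show how little the "list" approach needs, here is a rough sketch of
a per-revision table. Everything in it is an assumption for the sake of
illustration: the struct, the function name, and in particular the
contents of revision 2, which is purely hypothetical and only shows how
an entry could shrink once a newer Hyper-V supports one more control.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per "eVMCS revision"; all values here are illustrative only. */
struct evmcs_revision {
	unsigned int revision;
	uint32_t unsupported_pinctrl;	/* pin-based controls to hide */
};

static const struct evmcs_revision evmcs_revisions[] = {
	{ .revision = 1, .unsupported_pinctrl = 0x000000c0 },
	/* Hypothetical later revision in which one more control works. */
	{ .revision = 2, .unsupported_pinctrl = 0x00000080 },
};

/* Return the entry for @revision, or NULL if this build doesn't know it. */
static const struct evmcs_revision *evmcs_find_revision(unsigned int revision)
{
	size_t i;

	for (i = 0; i < sizeof(evmcs_revisions) / sizeof(evmcs_revisions[0]); i++) {
		if (evmcs_revisions[i].revision == revision)
			return &evmcs_revisions[i];
	}
	return NULL;
}
```

An unknown revision returns NULL here, so a VMM asking for a revision
newer than what the kernel knows about would fail explicitly rather than
silently getting a different feature set.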

eVMCS seems to be special in that a) it evolves over time and b) it is
mutually exclusive with *some* other features, but the list changes. We
don't seem to have anything like that in KVM/QEMU, hence all the
confusion.

-- 
Vitaly



