> On 23 Nov 2019, at 2:22, Jim Mattson <jmattson@xxxxxxxxxx> wrote: > > On Fri, Nov 22, 2019 at 3:57 PM Liran Alon <liran.alon@xxxxxxxxxx> wrote: >> >> >>> On 23 Nov 2019, at 1:43, Jim Mattson <jmattson@xxxxxxxxxx> wrote: >>> >>> Commit 37e4c997dadf ("KVM: VMX: validate individual bits of guest >>> MSR_IA32_FEATURE_CONTROL") broke the KVM_SET_MSRS ABI by instituting >>> new constraints on the data values that kvm would accept for the guest >>> MSR, IA32_FEATURE_CONTROL. Perhaps these constraints should have been >>> opt-in via a new KVM capability, but they were applied >>> indiscriminately, breaking at least one existing hypervisor. >>> >>> Relax the constraints to allow either or both of >>> FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX and >>> FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX to be set when nVMX is >>> enabled. This change is sufficient to fix the aforementioned breakage. >>> >>> Fixes: 37e4c997dadf ("KVM: VMX: validate individual bits of guest MSR_IA32_FEATURE_CONTROL") >>> Signed-off-by: Jim Mattson <jmattson@xxxxxxxxxx> >> >> I suggest to also add a comment in code to clarify why we allow setting >> FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX even though we expose a vCPU that doesn’t support Intel TXT. >> (I think the compatibility to existing workloads that sets this blindly on boot is a legit reason. Just recommend documenting it.) >> >> In addition, if the nested hypervisor which relies on this is public, please also mention it in commit message for reference. > > It's not an L1 hypervisor that's the problem. It's Google's L0 > hypervisor. We've been incorrectly reporting IA32_FEATURE_CONTROL as 7 > to nested guests for years, and now we have thousands of running VMs > with the bogus value. I've thought about just changing it to 5 on the > fly (on real hardware, one could almost blame it on SMM, but the MSR > is *locked*, after all). If I understand correctly, you are talking about the case a VM is already running and you will perform Live-Migration on it to a new host with a new kernel that it’s KVM don’t allow to set FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX. If we haven’t encountered yet some guest workload that blindly sets this bit in IA32_FEATURE_CONTROL, then it should be sufficient for you to modify vmx_set_msr() to allow setting this bit in case msr_info->host_initiated (i.e. During restore of VM state on destination) but disallow it when WRMSR is initiated from guest. The behaviour of whether vmx_set_msr() allows host to set this MSR to unsupported can even be gated by an opt-in KVM cap that will be set by Google’s userspace VMM. That way, you won’t change change guest runtime semantics (it’s original locked MSR value will be preserved during Live-Migration), and you will disallow newly provisioned guests from setting IA32_FEATURE_CONTROL to an unsupported value. What do you think? -Liran