On 6/27/2023 4:50 AM, Sean Christopherson wrote:
On Mon, Jun 26, 2023, Weijiang Yang wrote:
On 6/24/2023 8:03 AM, Sean Christopherson wrote:
@@ -7322,6 +7331,19 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
kvm_wait_lapic_expire(vcpu);
+ /*
+ * Save host MSR_IA32_S_CET so that it can be reloaded at vm_exit.
+ * No need to save the other two VMCS fields as supervisor SHSTK
+ * is not enabled on Intel platforms yet.
+ */
+ if (IS_ENABLED(CONFIG_X86_KERNEL_IBT) &&
+ (vm_exit_controls_get(vmx) & VM_EXIT_LOAD_CET_STATE)) {
+ u64 msr;
+
+ rdmsrl(MSR_IA32_S_CET, msr);
Reading the MSR on every VM-Enter can't possibly be necessary. At the absolute
minimum, this could be moved outside of the fastpath; if the kernel modifies S_CET
from NMI context, KVM is hosed. And *if* S_CET isn't static post-boot, this can
be done in .prepare_switch_to_guest() so long as S_CET isn't modified from IRQ
context.
Agree with you.
But unless mine eyes deceive me, S_CET is only truly modified during setup_cet(),
i.e. is static post boot, which means it can be read once at KVM load time, e.g.
just like host_efer.
I think S_CET can be handled like host_efer from a usage perspective, given
that only kernel IBT is currently enabled in the kernel. I'll remove this code
and initialize the VMCS field once, like host_efer.
The kernel does save/restore IBT when making BIOS calls, but if KVM is running a
vCPU across a BIOS call then we've got bigger issues.
What's the problem you're referring to?
I was pointing out that S_CET isn't strictly constant, as it's saved/modified/restored
by ibt_save() + ibt_restore(). But KVM should never run between those paired
functions, so from KVM's perspective the host value is effectively constant.
Yeah, so I think host S_CET setup can be handled like host_efer, thanks.
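
For illustration only, the "read once at load time, like host_efer" idea could
look roughly like the userspace sketch below. Every name in it (read_s_cet(),
write_host_s_cet(), hardware_setup_sketch(), and so on) is a hypothetical
stand-in for the kernel's rdmsrl()/vmcs_writel() and KVM's setup paths, and the
MSR value is made up; this is not the code that was eventually posted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins so the sketch builds outside the kernel. */
static uint64_t fake_s_cet_msr = 0x4;   /* illustrative host S_CET value */
static unsigned int msr_reads;          /* counts simulated MSR reads */
static uint64_t fake_vmcs_host_s_cet;   /* simulated HOST_S_CET VMCS field */

static uint64_t read_s_cet(void)
{
	msr_reads++;
	return fake_s_cet_msr;
}

static void write_host_s_cet(uint64_t val)
{
	fake_vmcs_host_s_cet = val;
}

/* Cached once, analogous to how host_efer is treated (assumption). */
static uint64_t host_s_cet;
static bool host_s_cet_valid;

/* Module load time: the only place the MSR is read. */
static void hardware_setup_sketch(void)
{
	host_s_cet = read_s_cet();
	host_s_cet_valid = true;
}

/* Per-vCPU constant host state: write the cached value into the VMCS. */
static void set_constant_host_state_sketch(void)
{
	assert(host_s_cet_valid);	/* load-time read must come first */
	write_host_s_cet(host_s_cet);
}

/* Run path: nothing S_CET-related happens here any more. */
static void vcpu_run_sketch(void)
{
}
```

The sketch relies on the argument made above: ibt_save()/ibt_restore() do
change S_CET, but KVM never runs between them, so a value cached at load time
stays valid for every VM-Enter.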
+ vmcs_writel(HOST_S_CET, msr);
+ }
+
/* The actual VMENTER/EXIT is in the .noinstr.text section. */
vmx_vcpu_enter_exit(vcpu, __vmx_vcpu_run_flags(vmx));
@@ -7735,6 +7757,13 @@ static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
incpt |= !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, incpt);
+
+ /*
+ * If IBT is available to guest, then passthrough S_CET MSR too since
+ * kernel IBT is already in mainline kernel tree.
+ */
+ incpt = !guest_cpuid_has(vcpu, X86_FEATURE_IBT);
+ vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, incpt);
}
static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
@@ -7805,7 +7834,7 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
/* Refresh #PF interception to account for MAXPHYADDR changes. */
vmx_update_exception_bitmap(vcpu);
- if (kvm_cet_user_supported())
+ if (kvm_cet_user_supported() || kvm_cpu_cap_has(X86_FEATURE_IBT))
Yeah, kvm_cet_user_supported() simply looks wrong.
These are preconditions for setting up the CET MSRs for the guest. In
vmx_update_intercept_for_cet_msr(), the actual MSR control is based on the
guest_cpuid_has() results.
I know. My point is that with the below combination,
kvm_cet_user_supported() = true
kvm_cpu_cap_has(X86_FEATURE_IBT) = false
guest_cpuid_has(vcpu, X86_FEATURE_IBT) = true
KVM will passthrough MSR_IA32_S_CET for guest IBT even though IBT isn't supported
on the host.
incpt = !guest_cpuid_has(vcpu, X86_FEATURE_IBT);
vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, incpt);
So either KVM is broken and is passing through S_CET when it shouldn't, or the
check on kvm_cet_user_supported() is redundant, i.e. the above combination is
impossible.
Either way, the code *looks* wrong, which is almost as bad as it being functionally
wrong.
Got your point. I'll refine the related code to make the handling reasonable.
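
To make the problematic combination concrete, here is a small userspace model
of the passthrough decision. The struct and both functions are hypothetical
names invented for this sketch. It also assumes that MSR_IA32_S_CET stays
intercepted whenever vmx_update_intercept_for_cet_msr() is not called. The
"refined" variant is just one possible way to tie the decision to host IBT
support, not necessarily what the next revision will do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the three inputs discussed above. */
struct cet_config {
	bool host_user_cet;	/* kvm_cet_user_supported() */
	bool host_ibt;		/* kvm_cpu_cap_has(X86_FEATURE_IBT) */
	bool guest_ibt;		/* guest_cpuid_has(vcpu, X86_FEATURE_IBT) */
};

/* Models the posted patch: returns true if S_CET is passed through. */
static bool s_cet_passthrough_as_posted(const struct cet_config *c)
{
	assert(c != NULL);

	/* Precondition in vmx_vcpu_after_set_cpuid(). */
	if (!(c->host_user_cet || c->host_ibt))
		return false;	/* assumed default: MSR stays intercepted */

	/* incpt = !guest_cpuid_has(vcpu, X86_FEATURE_IBT); */
	return c->guest_ibt;
}

/* One possible refinement (assumption): also require host IBT support. */
static bool s_cet_passthrough_refined(const struct cet_config *c)
{
	assert(c != NULL);

	return c->host_ibt && c->guest_ibt;
}
```

With host_user_cet = true, host_ibt = false and guest_ibt = true, the first
function reports passthrough while the second does not, which is exactly the
case called out above.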