On Fri, Nov 12, 2021, Xiaoyao Li wrote:
> From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>
> MCE is not supported for TDX VM and KVM cannot inject #MC to TDX VM.
>
> Introduce kvm_guest_mce_disallowed() which actually reports the MCE
> availability based on vm_type. And use it to guard all the MCE related
> CAPs and IOCTLs.
>
> Note: KVM_X86_GET_MCE_CAP_SUPPORTED is KVM scope so that what it reports
> may not match the behavior of specific VM (e.g., here for TDX VM). The
> same for KVM_CAP_MCE when queried from /dev/kvm. To query the precise
> KVM_CAP_MCE of the VM, it should use VM's fd.
>
> [ Xiaoyao: Guard MCE related CAPs ]
>
> Co-developed-by: Kai Huang <kai.huang@xxxxxxxxxxxxxxx>
> Signed-off-by: Kai Huang <kai.huang@xxxxxxxxxxxxxxx>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
> ---
>  arch/x86/kvm/x86.c | 10 ++++++++++
>  arch/x86/kvm/x86.h |  5 +++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b02088343d80..2b21c5169f32 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4150,6 +4150,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  		break;
>  	case KVM_CAP_MCE:
>  		r = KVM_MAX_MCE_BANKS;
> +		if (kvm)
> +			r = kvm_guest_mce_disallowed(kvm) ? 0 : r;

		r = KVM_MAX_MCE_BANKS;
		if (kvm && kvm_guest_mce_disallowed(kvm))
			r = 0;

or

		r = (kvm && kvm_guest_mce_disallowed(kvm)) ? 0 : KVM_MAX_MCE_BANKS;

>  		break;
>  	case KVM_CAP_XCRS:
>  		r = boot_cpu_has(X86_FEATURE_XSAVE);
> @@ -5155,6 +5157,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  	case KVM_X86_SETUP_MCE: {
>  		u64 mcg_cap;
>
> +		r = EINVAL;
> +		if (kvm_guest_mce_disallowed(vcpu->kvm))
> +			goto out;
> +
>  		r = -EFAULT;
>  		if (copy_from_user(&mcg_cap, argp, sizeof(mcg_cap)))
>  			goto out;
> @@ -5164,6 +5170,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  	case KVM_X86_SET_MCE: {
>  		struct kvm_x86_mce mce;
>
> +		r = EINVAL;
> +		if (kvm_guest_mce_disallowed(vcpu->kvm))
> +			goto out;
> +
>  		r = -EFAULT;
>  		if (copy_from_user(&mce, argp, sizeof(mce)))
>  			goto out;
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index a2813892740d..69c60297bef2 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -441,6 +441,11 @@ static __always_inline bool kvm_irq_injection_disallowed(struct kvm_vcpu *vcpu)
>  	return vcpu->kvm->arch.vm_type == KVM_X86_TDX_VM;
>  }
>
> +static __always_inline bool kvm_guest_mce_disallowed(struct kvm *kvm)

The "guest" part is potentially confusing and inconsistent with e.g.
kvm_irq_injection_disallowed.  And given the current ridiculous spec, CR4.MCE=1
is _required_, so saying "mce disallowed" is arguably wrong from that
perspective.

kvm_mce_injection_disallowed() would be more appropriate.

> +{
> +	return kvm->arch.vm_type == KVM_X86_TDX_VM;
> +}
> +
>  void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu);
>  void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu);
>  int kvm_spec_ctrl_test_value(u64 value);
> --
> 2.27.0
>
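
For reference, a standalone userspace sketch of the two KVM_CAP_MCE forms suggested above, using the kvm_mce_injection_disallowed() name proposed in the review. This is not kernel code: struct kvm, struct kvm_arch, the vm_type constants, and the mce_cap_*() wrappers are simplified stand-ins assumed purely for illustration, and the KVM_MAX_MCE_BANKS value is likewise a placeholder. It only aims to show that both forms give the same result for a NULL kvm (the /dev/kvm query), a non-TDX VM, and a TDX VM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in definitions, assumed for illustration; not the kernel's headers. */
#define KVM_MAX_MCE_BANKS 32
enum { KVM_X86_DEFAULT_VM = 0, KVM_X86_TDX_VM = 1 };

struct kvm_arch { unsigned long vm_type; };
struct kvm { struct kvm_arch arch; };

/* Helper under the name suggested in the review. */
static inline bool kvm_mce_injection_disallowed(const struct kvm *kvm)
{
	return kvm->arch.vm_type == KVM_X86_TDX_VM;
}

/* First suggested form: default value, then a single combined check. */
static int mce_cap_if(const struct kvm *kvm)
{
	int r = KVM_MAX_MCE_BANKS;

	if (kvm && kvm_mce_injection_disallowed(kvm))
		r = 0;
	return r;
}

/* Second suggested form: one ternary expression. */
static int mce_cap_ternary(const struct kvm *kvm)
{
	return (kvm && kvm_mce_injection_disallowed(kvm)) ? 0 : KVM_MAX_MCE_BANKS;
}

/* Both forms should agree for every case the ioctl can see. */
static void check_forms_agree(void)
{
	struct kvm tdx = { .arch = { .vm_type = KVM_X86_TDX_VM } };
	struct kvm other = { .arch = { .vm_type = KVM_X86_DEFAULT_VM } };

	assert(mce_cap_if(NULL) == mce_cap_ternary(NULL));
	assert(mce_cap_if(&other) == mce_cap_ternary(&other));
	assert(mce_cap_if(&tdx) == mce_cap_ternary(&tdx));
}
```

In both forms the NULL check short-circuits before the helper dereferences kvm, so the system-scope query still reports KVM_MAX_MCE_BANKS and only a TDX VM's fd reports 0.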