On Tue, 2022-06-28 at 14:52 +1200, Kai Huang wrote:
> On Mon, 2022-06-27 at 14:53 -0700, isaku.yamahata@xxxxxxxxx wrote:
> > From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> >
> > Unlike default VMs, confidential VMs (Intel TDX and AMD SEV-ES) don't allow
> > some operations (e.g., memory read/write, register state access, etc).
> >
> > Introduce vm_type to track the type of the VM to x86 KVM. Other arch KVMs
> > already use vm_type, KVM_INIT_VM accepts vm_type, and x86 KVM callback
> > vm_init accepts vm_type. So follow them. Further, a different policy can
> > be made based on vm_type. Define KVM_X86_DEFAULT_VM for default VM as
> > default and define KVM_X86_TDX_VM for Intel TDX VM. The wrapper function
> > will be defined as "bool is_td(kvm) { return vm_type == VM_TYPE_TDX; }"
> >
> > Add a capability KVM_CAP_VM_TYPES to effectively allow device model,
> > e.g. qemu, to query what VM types are supported by KVM. This (introduce a
> > new capability and add vm_type) is chosen to align with other arch KVMs
> > that have VM types already. Other arch KVMs uses different name to query
> > supported vm types and there is no common name for it, so new name was
> > chosen.
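
For readers following the thread, here is a minimal userspace sketch of the
bitmap semantics the commit message describes for KVM_CAP_VM_TYPES (bit n set
means the VM type with value n is supported). It is illustrative only and not
part of the patch: the helper names vm_type_in_bitmap() and
check_vm_type_bitmap() are hypothetical, and the sketch assumes the bitmap
would be obtained by the device model through a KVM_CHECK_EXTENSION ioctl,
which it does not actually issue. The two type values are copied from the
patch's uapi header.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* VM type values as defined by the patch in arch/x86/include/uapi/asm/kvm.h. */
#define KVM_X86_DEFAULT_VM 0
#define KVM_X86_TDX_VM     1

/*
 * Hypothetical helper (not in the patch): test whether a VM type is present
 * in a "supported VM types" bitmap, where bit n set means type n is
 * supported. The bitmap is assumed to come from a KVM_CHECK_EXTENSION query
 * for KVM_CAP_VM_TYPES; that ioctl call is deliberately left out here.
 */
static bool vm_type_in_bitmap(unsigned long bitmap, unsigned int vm_type)
{
    /* Shifting by the full width of the type or more is undefined in C. */
    if (vm_type >= sizeof(bitmap) * CHAR_BIT)
        return false;

    return (bitmap >> vm_type) & 1UL;
}

/* Small sanity checks showing how the helper reads two example bitmaps. */
static void check_vm_type_bitmap(void)
{
    /* 0x1: only the default VM type is reported. */
    assert(vm_type_in_bitmap(0x1UL, KVM_X86_DEFAULT_VM));
    assert(!vm_type_in_bitmap(0x1UL, KVM_X86_TDX_VM));

    /* 0x3: both the default and the TDX VM types are reported. */
    assert(vm_type_in_bitmap(0x3UL, KVM_X86_DEFAULT_VM));
    assert(vm_type_in_bitmap(0x3UL, KVM_X86_TDX_VM));
}
```

Under these assumptions, a device model would check the bit for the type it
wants before passing that type to VM creation, and fall back or fail cleanly
when the bit is clear.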
> >
> > Co-developed-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
> > Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > Reviewed-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > ---
> >  Documentation/virt/kvm/api.rst        | 21 +++++++++++++++++++++
> >  arch/x86/include/asm/kvm-x86-ops.h    |  1 +
> >  arch/x86/include/asm/kvm_host.h       |  2 ++
> >  arch/x86/include/uapi/asm/kvm.h       |  3 +++
> >  arch/x86/kvm/svm/svm.c                |  6 ++++++
> >  arch/x86/kvm/vmx/main.c               |  1 +
> >  arch/x86/kvm/vmx/tdx.h                |  6 +-----
> >  arch/x86/kvm/vmx/vmx.c                |  5 +++++
> >  arch/x86/kvm/vmx/x86_ops.h            |  1 +
> >  arch/x86/kvm/x86.c                    |  9 ++++++++-
> >  include/uapi/linux/kvm.h              |  1 +
> >  tools/arch/x86/include/uapi/asm/kvm.h |  3 +++
> >  tools/include/uapi/linux/kvm.h        |  1 +
> >  13 files changed, 54 insertions(+), 6 deletions(-)
> >
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 9cbbfdb663b6..b9ab598883b2 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -147,10 +147,31 @@ described as 'basic' will be available.
> >  The new VM has no virtual cpus and no memory.
> >  You probably want to use 0 as machine type.
> >
> > +X86:
> > +^^^^
> > +
> > +Supported vm type can be queried from KVM_CAP_VM_TYPES, which returns the
> > +bitmap of supported vm types. The 1-setting of bit @n means vm type with
> > +value @n is supported.
>
>
> Perhaps I am missing something, but I don't understand how the below changes
> (except the x86 part above) in Documentation are related to this patch.
>
> > +
> > +S390:
> > +^^^^^
> > +
> >  In order to create user controlled virtual machines on S390, check
> >  KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as
> >  privileged user (CAP_SYS_ADMIN).
> >
> > +MIPS:
> > +^^^^^
> > +
> > +To use hardware assisted virtualization on MIPS (VZ ASE) rather than
> > +the default trap & emulate implementation (which changes the virtual
> > +memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
> > +flag KVM_VM_MIPS_VZ.
> > +
> > +ARM64:
> > +^^^^^^
> > +
> >  On arm64, the physical address size for a VM (IPA Size limit) is limited
> >  to 40bits by default. The limit can be configured if the host supports the
> >  extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
> > diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> > index 75bc44aa8d51..a97cdb203a16 100644
> > --- a/arch/x86/include/asm/kvm-x86-ops.h
> > +++ b/arch/x86/include/asm/kvm-x86-ops.h
> > @@ -19,6 +19,7 @@ KVM_X86_OP(hardware_disable)
> >  KVM_X86_OP(hardware_unsetup)
> >  KVM_X86_OP(has_emulated_msr)
> >  KVM_X86_OP(vcpu_after_set_cpuid)
> > +KVM_X86_OP(is_vm_type_supported)
> >  KVM_X86_OP(vm_init)
> >  KVM_X86_OP_OPTIONAL(vm_destroy)
> >  KVM_X86_OP_OPTIONAL_RET0(vcpu_precreate)
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index aa11525500d3..089e0a4de926 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1141,6 +1141,7 @@ enum kvm_apicv_inhibit {
> >  };
> >
> >  struct kvm_arch {
> > +	unsigned long vm_type;
> >  	unsigned long n_used_mmu_pages;
> >  	unsigned long n_requested_mmu_pages;
> >  	unsigned long n_max_mmu_pages;
> > @@ -1434,6 +1435,7 @@ struct kvm_x86_ops {
> >  	bool (*has_emulated_msr)(struct kvm *kvm, u32 index);
> >  	void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu);
> >
> > +	bool (*is_vm_type_supported)(unsigned long vm_type);
> >  	unsigned int vm_size;
> >  	int (*vm_init)(struct kvm *kvm);
> >  	void (*vm_destroy)(struct kvm *kvm);
> > diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> > index 50a4e787d5e6..9792ec1cc317 100644
> > --- a/arch/x86/include/uapi/asm/kvm.h
> > +++ b/arch/x86/include/uapi/asm/kvm.h
> > @@ -531,4 +531,7 @@ struct kvm_pmu_event_filter {
> >  #define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */
> >  #define KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */
> >
> > +#define KVM_X86_DEFAULT_VM	0
> > +#define KVM_X86_TDX_VM		1
> > +
> >  #endif /* _ASM_X86_KVM_H */
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 247c0ad458a0..815a07c594f1 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -4685,6 +4685,11 @@ static void svm_vm_destroy(struct kvm *kvm)
> >  	sev_vm_destroy(kvm);
> >  }
> >
> > +static bool svm_is_vm_type_supported(unsigned long type)
> > +{
> > +	return type == KVM_X86_DEFAULT_VM;
> > +}
> > +
> >  static int svm_vm_init(struct kvm *kvm)
> >  {
> >  	if (!pause_filter_count || !pause_filter_thresh)
> > @@ -4712,6 +4717,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
> >  	.vcpu_free = svm_vcpu_free,
> >  	.vcpu_reset = svm_vcpu_reset,
> >
> > +	.is_vm_type_supported = svm_is_vm_type_supported,
> >  	.vm_size = sizeof(struct kvm_svm),
> >  	.vm_init = svm_vm_init,
> >  	.vm_destroy = svm_vm_destroy,
> > diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
> > index ac788af17d92..7be4941e4c4d 100644
> > --- a/arch/x86/kvm/vmx/main.c
> > +++ b/arch/x86/kvm/vmx/main.c
> > @@ -43,6 +43,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
> >  	.hardware_disable = vmx_hardware_disable,
> >  	.has_emulated_msr = vmx_has_emulated_msr,
> >
> > +	.is_vm_type_supported = vmx_is_vm_type_supported,
> >  	.vm_size = sizeof(struct kvm_vmx),
> >  	.vm_init = vmx_vm_init,
> >  	.vm_destroy = vmx_vm_destroy,
> > diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
> > index 54d7a26ed9ee..2f43db5bbefb 100644
> > --- a/arch/x86/kvm/vmx/tdx.h
> > +++ b/arch/x86/kvm/vmx/tdx.h
> > @@ -17,11 +17,7 @@ struct vcpu_tdx {
> >
> >  static inline bool is_td(struct kvm *kvm)
> >  {
> > -	/*
> > -	 * TDX VM type isn't defined yet.
> > -	 * return kvm->arch.vm_type == KVM_X86_TDX_VM;
> > -	 */
> > -	return false;
> > +	return kvm->arch.vm_type == KVM_X86_TDX_VM;
> >  }
>
> If you put this patch before the patch:
>
> 	[PATCH v7 009/102] KVM: TDX: Add placeholders for TDX VM/vcpu structure
>
> then you don't need to introduce this chunk in the above patch and then
> remove it here, which is unnecessary and ugly.
>
> You could even introduce only KVM_X86_DEFAULT_VM, and not KVM_X86_TDX_VM, in
> this patch, which would make it an infrastructure patch that reports the VM
> type. KVM_X86_TDX_VM can then come with the patch where is_td() is introduced
> (your patch 9 above).
>
> To me, that is a cleaner way to write the patch. For instance, this
> infrastructure patch could in theory be used by other series that need to
> support something similar, without carrying the is_td() and KVM_X86_TDX_VM
> burden that this version adds.

Sorry, I missed that this patch already has Paolo's Reviewed-by. Please feel
free to ignore my comments.

--
Thanks,
-Kai