This patch set is part 2 of this RFC series. It introduces
CONFIG_PKVM_INTEL and deprivileges the natively running Linux into a
host VM.

The host Linux must be trusted until pKVM has booted, so pKVM should
boot as early as possible. In addition, pKVM cannot support deinit
(e.g., returning the host to root mode after it has been deprivileged
to a VM). Both constraints rule out building the KVM module (with the
pKVM binary embedded) as a dynamically loaded module. Thus, once
CONFIG_PKVM_INTEL is enabled, the KVM module is built in, and the Linux
system running on bare metal prepares the environment to put each pCPU
into VMX non-root mode. This means that all pCPUs of the native Linux
are deprivileged to vCPUs, and the native Linux system is deprivileged
to a host VM. Meanwhile, pKVM keeps running in VMX root mode as an
independent binary.

The host VM owns almost all hardware resources except those owned by
pKVM, such as VMX, EPT and IOMMU; pKVM shall fully own VMX, EPT and
IOMMU to ensure the isolation of protected VMs. This patch set does
some initial work toward that:

 - pKVM manages VMX to deprivilege the host. There is no VMX emulation
   for the host VM yet, so the host VM cannot run its own guests on top
   of pKVM.

 - EPT is disabled for the host VM, because no host EPT page table is
   created in pKVM yet. There is no MMU page table set up for pKVM
   either, so for now it reuses the native Linux kernel's CR3 page
   table.

 - The IOMMU is not touched by pKVM yet and is passed through directly
   to the host VM.

This patch set also builds pKVM as an independent binary: pKVM is
compiled into separate sections, and the prefix __pkvm is added to all
of its symbols so that pKVM and the host Linux do not touch each
other's symbols. This helps with the address space isolation between
pKVM and the host VM in the next patch series.

Future patch sets shall give pKVM its own CR3 page table, enable memory
isolation based on the EPT page table, provide emulation of VMCS and
EPT for the host VM, and enable DMA protection based on the IOMMU and
its emulation for the host VM (not in this RFC).

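As a rough illustration of the symbol prefixing mentioned above, the
user-space sketch below models what the renaming achieves: every pKVM
symbol lands in its own namespace, so a name can be classified as pKVM
or host by its prefix alone. This is not code from the series. The
helper names pkvm_prefixed_name() and is_pkvm_symbol() are hypothetical,
and the sketch assumes the prefix is exactly "__pkvm" with no separator;
the real build may apply the prefix with a different tool and spelling.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Assumption: the cover letter only states that the prefix is "__pkvm".
 * The exact separator and the tool that applies it are not stated there.
 */
#define PKVM_SYM_PREFIX "__pkvm"

/*
 * Hypothetical helper: write PKVM_SYM_PREFIX followed by name into out.
 * Returns 0 on success, -1 if the result does not fit in out_len bytes.
 */
static int pkvm_prefixed_name(const char *name, char *out, size_t out_len)
{
	int n;

	assert(name != NULL && out != NULL);
	n = snprintf(out, out_len, "%s%s", PKVM_SYM_PREFIX, name);
	return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/*
 * Hypothetical helper: returns 1 if sym carries the pKVM prefix, i.e. it
 * would belong to the pKVM binary rather than to the host kernel.
 */
static int is_pkvm_symbol(const char *sym)
{
	return strncmp(sym, PKVM_SYM_PREFIX, strlen(PKVM_SYM_PREFIX)) == 0;
}
```

With these helpers, a function compiled into both the host kernel and
the pKVM binary ends up under two distinct names, so a reference from
one side cannot silently resolve to the other side's copy.
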
Jason Chen CJ (16):
  pkvm: x86: Introduce CONFIG_PKVM_INTEL
  KVM: VMX: Refactor for setup_vmcs_config
  pkvm: x86: Add vmx capability check and vmx config setup
  pkvm: x86: Add pCPU env setup
  pkvm: x86: Add basic setup for host vcpu
  pkvm: x86: Introduce pkvm_host_deprivilege_cpus
  pkvm: x86: Allocate vmcs and msr bitmap pages for host vcpu
  pkvm: x86: Initailize vmcs guest state area for host vcpu
  pkvm: x86: Initialize vmcs host state area for host vcpu
  pkvm: x86: Initialize vmcs control fields for host vcpu
  pkvm: x86: Define empty debug functions for hypervisor
  pkvm: x86: Add vmexit handler for host vcpu
  pkvm: x86: Add private vmx_ops.h for pKVM
  pkvm: x86: Add pKVM retpoline.S
  pkvm: x86: Build pKVM runtime as an independent binary
  pkvm: x86: Deprivilege host OS

Zide Chen (1):
  pkvm: x86: Stub CONFIG_DEBUG_LIST in pKVM

 arch/x86/include/asm/kvm_host.h            |   1 +
 arch/x86/include/asm/pkvm_image.h          |  42 ++
 arch/x86/kernel/vmlinux.lds.S              |  32 +
 arch/x86/kvm/Kconfig                       |  13 +
 arch/x86/kvm/Makefile                      |   1 +
 arch/x86/kvm/vmx/pkvm/.gitignore           |   1 +
 arch/x86/kvm/vmx/pkvm/Makefile             |   9 +
 arch/x86/kvm/vmx/pkvm/hyp/Makefile         |  40 ++
 arch/x86/kvm/vmx/pkvm/hyp/debug.h          |  13 +
 arch/x86/kvm/vmx/pkvm/hyp/lib/list_debug.c |  17 +
 arch/x86/kvm/vmx/pkvm/hyp/lib/retpoline.S  | 113 ++++
 arch/x86/kvm/vmx/pkvm/hyp/pkvm.lds.S       |  10 +
 arch/x86/kvm/vmx/pkvm/hyp/vmexit.c         | 154 +++++
 arch/x86/kvm/vmx/pkvm/hyp/vmexit.h         |  11 +
 arch/x86/kvm/vmx/pkvm/hyp/vmx_asm.S        | 186 ++++++
 arch/x86/kvm/vmx/pkvm/hyp/vmx_ops.h        | 185 ++++++
 arch/x86/kvm/vmx/pkvm/include/pkvm.h       |  54 ++
 arch/x86/kvm/vmx/pkvm/pkvm_host.c          | 728 +++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.c                     | 122 ++--
 arch/x86/kvm/vmx/vmx.h                     |  22 +
 arch/x86/kvm/vmx/vmx_ops.h                 |   7 +
 arch/x86/kvm/x86.c                         |   5 +
 include/asm-generic/vmlinux.lds.h          |  17 +
 23 files changed, 1736 insertions(+), 47 deletions(-)
 create mode 100644 arch/x86/include/asm/pkvm_image.h
 create mode 100644 arch/x86/kvm/vmx/pkvm/.gitignore
 create mode 100644 arch/x86/kvm/vmx/pkvm/Makefile
 create mode 100644 arch/x86/kvm/vmx/pkvm/hyp/Makefile
 create mode 100644 arch/x86/kvm/vmx/pkvm/hyp/debug.h
 create mode 100644 arch/x86/kvm/vmx/pkvm/hyp/lib/list_debug.c
 create mode 100644 arch/x86/kvm/vmx/pkvm/hyp/lib/retpoline.S
 create mode 100644 arch/x86/kvm/vmx/pkvm/hyp/pkvm.lds.S
 create mode 100644 arch/x86/kvm/vmx/pkvm/hyp/vmexit.c
 create mode 100644 arch/x86/kvm/vmx/pkvm/hyp/vmexit.h
 create mode 100644 arch/x86/kvm/vmx/pkvm/hyp/vmx_asm.S
 create mode 100644 arch/x86/kvm/vmx/pkvm/hyp/vmx_ops.h
 create mode 100644 arch/x86/kvm/vmx/pkvm/include/pkvm.h
 create mode 100644 arch/x86/kvm/vmx/pkvm/pkvm_host.c

--
2.25.1