This patch series is for the kvm-coco-queue branch.  The change for TDX KVM is
included at the end.  The test creates and runs a TDX vCPU, gets the TSC offset
via the vCPU device attributes, and compares it with the TDX TSC OFFSET
metadata.  Because the test requires TDX KVM and the TDX KVM kselftests, it is
not included in this patch series.

Background
----------
X86 confidential computing technology defines a protected guest TSC so that the
VMM can't change the TSC offset/multiplier once the vCPU is initialized and the
guest can trust the TSC.  SEV-SNP defines Secure TSC as optional.  TDX mandates
it.  The TDX module determines the TSC offset/multiplier.  The VMM has to
retrieve them.

On the other hand, the x86 KVM common logic tries to guess or adjust the TSC
offset/multiplier for better guest TSC and TSC interrupt latency at KVM vCPU
creation (kvm_arch_vcpu_postcreate()), vCPU migration over pCPU
(kvm_arch_vcpu_load()), vCPU TSC device attributes (kvm_arch_tsc_set_attr()),
and guest/host writes to the TSC or TSC adjust MSR (kvm_set_msr_common()).

Problem
-------
The current x86 KVM implementation conflicts with protected TSC because the
VMM can't change the TSC offset/multiplier.  The KVM logic that changes or
adjusts the TSC offset/multiplier has to be disabled or ignored for such
guests.

Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
offset/multiplier, the TSC timer interrupts are injected into the guest at the
wrong time if the KVM TSC offset differs from what the TDX module determined.

Originally the issue was found with cyclictest of rt-tests [1]: the latency in
the TDX case was worse than the VMX value plus the TDX SEAMCALL overhead.  It
turned out that the KVM TSC offset was different from what the TDX module
determines.

Solution
--------
The solution is to keep the KVM TSC offset/multiplier the same as the values
of the TDX module.  Possible solutions are as follows.

- Skip the logic
  Ignore (or don't call related functions) the request to change the TSC
  offset/multiplier.
  Pros
  - Logically clean.  This is similar to the guest_protected case.
  Cons
  - Needs to identify the call sites.

- Revert the change at the hooks after TSC adjustment
  x86 KVM defines the vendor hooks when the TSC offset/multiplier are
  changed.  The callback can revert the change.
  Pros
  - We don't need to care about the logic to change the TSC offset/multiplier.
  Cons
  - Hacky to revert the KVM x86 common code logic.

Choose the first one.  With this patch series, SEV-SNP secure TSC can be
supported.

Patches:
1: Preparation for the next patch
2: Skip the logic to adjust the TSC offset/multiplier in the common x86 KVM logic

[1] https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git

Changes for TDX KVM

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 8785309ccb46..969da729d89f 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -694,8 +712,6 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
 	vcpu->arch.cr0_guest_owned_bits = -1ul;
 	vcpu->arch.cr4_guest_owned_bits = -1ul;

-	vcpu->arch.tsc_offset = kvm_tdx->tsc_offset;
-	vcpu->arch.l1_tsc_offset = vcpu->arch.tsc_offset;
 	/*
 	 * TODO: support off-TD debug.  If TD DEBUG is enabled, guest state
 	 * can be accessed. guest_state_protected = false. and kvm ioctl to
@@ -706,6 +722,13 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
 	 */
 	vcpu->arch.guest_state_protected = true;

+	/* VMM can't change TSC offset/multiplier as TDX module manages them. */
+	vcpu->arch.guest_tsc_protected = true;
+	vcpu->arch.tsc_offset = kvm_tdx->tsc_offset;
+	vcpu->arch.l1_tsc_offset = vcpu->arch.tsc_offset;
+	vcpu->arch.tsc_scaling_ratio = kvm_tdx->tsc_multiplier;
+	vcpu->arch.l1_tsc_scaling_ratio = kvm_tdx->tsc_multiplier;
+
 	if ((kvm_tdx->xfam & XFEATURE_MASK_XTILE) == XFEATURE_MASK_XTILE)
 		vcpu->arch.xfd_no_write_intercept = true;

@@ -2674,6 +2697,7 @@ static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 		goto out;

 	kvm_tdx->tsc_offset = td_tdcs_exec_read64(kvm_tdx, TD_TDCS_EXEC_TSC_OFFSET);
+	kvm_tdx->tsc_multiplier = td_tdcs_exec_read64(kvm_tdx, TD_TDCS_EXEC_TSC_MULTIPLIER);
 	kvm_tdx->attributes = td_params->attributes;
 	kvm_tdx->xfam = td_params->xfam;

diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 614b1c3b8483..c0e4fa61cab1 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -42,6 +42,7 @@ struct kvm_tdx {
 	bool tsx_supported;

 	u64 tsc_offset;
+	u64 tsc_multiplier;

 	enum kvm_tdx_state state;

diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h
index 861c0f649b69..be4cf65c90a8 100644
--- a/arch/x86/kvm/vmx/tdx_arch.h
+++ b/arch/x86/kvm/vmx/tdx_arch.h
@@ -69,6 +69,7 @@ enum tdx_tdcs_execution_control {
 	TD_TDCS_EXEC_TSC_OFFSET = 10,
+	TD_TDCS_EXEC_TSC_MULTIPLIER = 11,
 };

 enum tdx_vcpu_guest_other_state {
---
Isaku Yamahata (2):
  KVM: x86: Push down setting vcpu.arch.user_set_tsc
  KVM: x86: Don't allow tsc_offset, tsc_scaling_ratio to change

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c              | 21 ++++++++++++++-------
 2 files changed, 15 insertions(+), 7 deletions(-)

base-commit: 909f9d422f59f863d7b6e4e2c6e57abb97a27d4d
-- 
2.45.2
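
Illustration (not part of the series): the "skip the logic" option from the
Solution section can be pictured with a small user-space model.  This is only
a sketch under stated assumptions.  The guest_tsc_protected flag name comes
from the TDX diff in this letter; struct vcpu_model and the model_set_*()
helpers are hypothetical and do not correspond to actual KVM functions.  The
real change in arch/x86/kvm/x86.c may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the relevant vCPU fields. */
struct vcpu_model {
	bool guest_tsc_protected;	/* flag name taken from the TDX diff */
	uint64_t tsc_offset;
	uint64_t tsc_scaling_ratio;
};

/*
 * Hypothetical helper: apply a requested TSC offset unless the TSC is
 * protected.  Returns true if the value was applied, false if the request
 * was ignored.
 */
static bool model_set_tsc_offset(struct vcpu_model *v, uint64_t offset)
{
	if (v->guest_tsc_protected)
		return false;	/* skip: keep the value fixed at creation */

	v->tsc_offset = offset;
	return true;
}

/* Hypothetical helper: same guard for the scaling ratio (multiplier). */
static bool model_set_tsc_ratio(struct vcpu_model *v, uint64_t ratio)
{
	if (v->guest_tsc_protected)
		return false;

	v->tsc_scaling_ratio = ratio;
	return true;
}
```

With guest_tsc_protected set, both helpers leave the stored values untouched,
so the model keeps whatever was written at vCPU creation.  With the flag
clear, requests are applied as before, which matches the intent that
non-protected guests see no behaviour change.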