On 10/26/2024 12:24 AM, Marcelo Tosatti wrote:
On Mon, Oct 14, 2024 at 08:17:19PM +0530, Nikunj A. Dadhania wrote:
Hi Isaku,
On 10/12/2024 1:25 PM, Isaku Yamahata wrote:
This patch series is for the kvm-coco-queue branch. The change for TDX KVM is
included last. The test creates and runs a TDX vCPU, gets the TSC offset via
the vCPU device attributes, and compares it with the TDX TSC OFFSET metadata.
Because the test requires TDX KVM and the TDX KVM kselftests, it is not
included in this patch series.
Background
----------
X86 confidential computing technology defines a protected guest TSC so that the
VMM can't change the TSC offset/multiplier once the vCPU is initialized and the
guest can trust the TSC. SEV-SNP defines Secure TSC as optional. TDX mandates
it. The TDX module determines the TSC offset/multiplier. The VMM has to
retrieve them.
On the other hand, the x86 KVM common logic tries to guess or adjust the TSC
offset/multiplier for better guest TSC and TSC interrupt latency at KVM vCPU
creation (kvm_arch_vcpu_postcreate()), vCPU migration over pCPU
(kvm_arch_vcpu_load()), vCPU TSC device attributes (kvm_arch_tsc_set_attr()) and
guest/host writing to TSC or TSC adjust MSR (kvm_set_msr_common()).
Problem
-------
The current x86 KVM implementation conflicts with protected TSC because the
VMM can't change the TSC offset/multiplier. The KVM logic that changes or
adjusts the TSC offset/multiplier has to be disabled or ignored somehow.
Because KVM emulates the TSC timer or the TSC deadline timer with the TSC
offset/multiplier, the TSC timer interrupts are injected to the guest at the
wrong time if the KVM TSC offset is different from what the TDX module
determined.
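
To make the effect concrete, here is a minimal sketch (not KVM code) of how a
mismatched offset shifts the host time at which an emulated deadline fires. It
assumes the usual relation guest_tsc = ((host_tsc * multiplier) >> frac_bits) +
offset. The struct, the function names, and the 48 fractional bits are
assumptions made for illustration only.

```c
#include <stdint.h>

/* Assumed number of fractional bits in the fixed-point multiplier. */
#define TSC_RATIO_FRAC_BITS 48

/* Hypothetical container for one view of the guest TSC parameters. */
struct tsc_params {
	uint64_t multiplier;	/* fixed-point ratio guest_hz / host_hz */
	int64_t offset;		/* added after scaling */
};

/* guest_tsc = ((host_tsc * multiplier) >> frac_bits) + offset */
static uint64_t guest_tsc(const struct tsc_params *p, uint64_t host_tsc)
{
	unsigned __int128 scaled = (unsigned __int128)host_tsc * p->multiplier;

	return (uint64_t)(scaled >> TSC_RATIO_FRAC_BITS) + (uint64_t)p->offset;
}

/* Inverse mapping: the host TSC value at which the guest reaches a deadline. */
static uint64_t host_deadline(const struct tsc_params *p, uint64_t guest_deadline)
{
	unsigned __int128 unscaled =
		(unsigned __int128)(guest_deadline - (uint64_t)p->offset)
		<< TSC_RATIO_FRAC_BITS;

	return (uint64_t)(unscaled / p->multiplier);
}

/*
 * Host cycles by which a timer armed with KVM's parameters misses the
 * deadline the guest actually observes (positive means late).
 */
static int64_t deadline_error(const struct tsc_params *kvm,
			      const struct tsc_params *module,
			      uint64_t guest_deadline)
{
	return (int64_t)(host_deadline(kvm, guest_deadline) -
			 host_deadline(module, guest_deadline));
}
```

For example, with a multiplier of 1.0 and offsets of -1000 (module) and -1500
(KVM), a guest deadline of 4000 maps to host TSC 5000 versus 5500, so the
emulated timer would fire 500 host cycles late.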
The issue was originally found with cyclictest from rt-tests [1]: the latency
in the TDX case is worse than the VMX value plus the TDX SEAMCALL overhead. It
turned out that the KVM TSC offset differs from what the TDX module determines.
Can you provide the exact command line to reproduce this problem?
Nikunj,
Run cyclictest, on an isolated CPU, in a VM. For the maximum latency
metric, rather than 50us, one gets 500us at times.
Any links to this reported issue?
This was not posted publicly. But it's not hard to reproduce.
Solution
--------
The solution is to keep the KVM TSC offset/multiplier the same as the values
determined by the TDX module. Possible solutions are as follows.
- Skip the logic
Ignore (or don't call related functions) the request to change the TSC
offset/multiplier.
Pros
- Logically clean. This is similar to the guest_protected case.
Cons
- Needs to identify the call sites.
- Revert the change at the hooks after TSC adjustment
x86 KVM defines the vendor hooks when the TSC offset/multiplier are
changed. The callback can revert the change.
Pros
- We don't need to care about the logic to change the TSC offset/multiplier.
Cons
- Hacky to revert the KVM x86 common code logic.
Choose the first one. With this patch series, SEV-SNP secure TSC can be
supported.
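
As a rough illustration of the two options above (a sketch with hypothetical
names, not the code in this series): the first option refuses the change at
the call site, the second lets the common code change the value and restores
it in a vendor hook. The struct, the tsc_protected flag, and both functions
are assumptions and do not correspond to real KVM symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-vCPU TSC state; not the real struct kvm_vcpu. */
struct vcpu_tsc {
	bool tsc_protected;	/* assumed flag: TSC is fixed by TDX or Secure TSC */
	int64_t tsc_offset;
	uint64_t tsc_multiplier;
};

/*
 * Option 1, skip the logic: ignore any request to change the offset of a
 * protected vCPU. Returns true if the new offset was applied.
 */
static bool vcpu_write_tsc_offset(struct vcpu_tsc *v, int64_t new_offset)
{
	if (v->tsc_protected)
		return false;	/* keep the value the module determined */

	v->tsc_offset = new_offset;
	return true;
}

/*
 * Option 2, revert at the hook: the common code has already changed the
 * offset, and the vendor callback puts the trusted value back.
 */
static void vendor_hook_after_tsc_change(struct vcpu_tsc *v, int64_t trusted_offset)
{
	if (v->tsc_protected)
		v->tsc_offset = trusted_offset;
}
```

With the first option every call site that can change the TSC needs such a
check. With the second option the common code briefly holds a wrong value
before the hook restores it.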
I am not sure how this will help SNP Secure TSC, as the GUEST_TSC_OFFSET and
GUEST_TSC_SCALE are only available to the guest.
Nikunj,
FYI:
SEV-SNP processors (at least the one below) do not seem affected by this problem.
Did you apply the Secure TSC patches (guest kernel, KVM and QEMU)
manually? None of them are merged. Otherwise, I think the SNP guest
is still using the KVM emulated TSC.
At least this one:
vendor_id : AuthenticAMD
cpu family : 25
model : 17
model name : AMD EPYC 9124 16-Core Processor