Please don't top post.

https://people.kernel.org/tglx/notes-about-netiquette

On Wed, Oct 25, 2023, Yifei Ma wrote:
> > On Oct 24, 2023, at 10:33 AM, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > 
> > On Tue, Oct 24, 2023, Yifei Ma wrote:
> >> Hi KVM community,
> >> 
> >> I am trying to figure out how TSC is virtualized in KVM-VMX world.
> >> According to the kernel documentation, reading TSC register through MSR
> >> can be trapped into KVM and VMX. I am trying to figure out the KVM code
> >> handling this trap.
> > 
> > Key word "can".  KVM chooses not to intercept RDMSR to MSR_IA32_TSC because
> > hardware handles the necessary offset and scaling.  KVM does still emulate reads
> > in kvm_get_msr_common(), e.g. if KVM is forced to emulate a RDMSR, but that's a
> > very, very uncommon path.
> > 
> > Ditto for the RDTSC instruction, which isn't subject to MSR interception bitmaps
> > and has a dedicated control.  KVM will emulate RDTSC if KVM is already emulating,
> > but otherwise the guest can execute RDTSC without triggering a VM-Exit.
> > 
> > Modern CPUs provide both an offset and a scaling factor for VMX guests, i.e. the
> > CPU itself virtualizes guest TSC.  See the RDMSR and RDTSC bullet points in the
> > "CHANGES TO INSTRUCTION BEHAVIOR IN VMX NON-ROOT OPERATION" section of the SDM
> > for details.
> 
> I went through the SDM virtual machine extensions chapter, and some KVM
> patches and it helped me a lot. My understanding is:
> 
> If the RDTSC exiting field in the VMCS is not set, then the rdtsc from
> non-root mode won’t cause VM-exit. In this case, the TSC returned to
> non-root is the value of the physical TSC * scaling + offset, if scaling and
> offset are set by KVM.

Yes.  Note, if hardware supports TSC offsetting and/or TSC scaling, they are
enabled by KVM.  KVM simply uses an initial offset of '0' and a multiplier that
makes the guest TSC "run" at the same frequency as the host.

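As a rough illustration of the "physical TSC * scaling + offset" arithmetic,
here is a minimal sketch.  It is not KVM's code: guest_tsc() and
tsc_multiplier() are made-up names, and the 48 fractional bits are my reading
of the VMX TSC multiplier format in the SDM.  It also relies on the GCC/Clang
unsigned __int128 extension for the 128-bit product.

```c
#include <stdint.h>

/* Assumption: the VMX TSC multiplier is fixed-point with 48 fractional bits. */
#define TSC_FRAC_BITS 48

/*
 * Hypothetical helper: the value RDTSC would return in the guest, i.e.
 * ((host_tsc * multiplier) >> 48) + offset, wrapping modulo 2^64.
 */
static uint64_t guest_tsc(uint64_t host_tsc, uint64_t multiplier, uint64_t offset)
{
	unsigned __int128 product = (unsigned __int128)host_tsc * multiplier;

	return (uint64_t)(product >> TSC_FRAC_BITS) + offset;
}

/*
 * Hypothetical helper: multiplier that makes the guest TSC tick at
 * guest_khz when the host TSC ticks at host_khz (host_khz must be non-zero).
 */
static uint64_t tsc_multiplier(uint64_t guest_khz, uint64_t host_khz)
{
	return (uint64_t)(((unsigned __int128)guest_khz << TSC_FRAC_BITS) / host_khz);
}
```

With equal guest and host frequencies the multiplier comes out as 1 << 48, so
with an offset of '0' the guest simply sees the host TSC.
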
> The TSC offset and scaling of a vCPU can be set from root-mode through KVM
> APIs using command KVM_VCPU_TSC_CTRL & KVM_SET_TSC_KHZ , and they are written
> to the vCPU’s VMCS fields. Next time, non-root mode calls rdtsc, the VMX
> hardware will add the offset & scaling to the physical TSC.

Yes, with caveats.  The guest can write MSR_IA32_TSC and/or MSR_IA32_TSC_ADJUST,
which KVM emulates by modifying TSC_OFFSET.

If the CPU doesn't have a constant TSC, KVM will adjust TSC_OFFSET before the
next VM-Enter to try and keep guest TSC consistent and monotonic.

If the CPU doesn't support TSC scaling, KVM will manually scale the guest TSC
prior to every VM-Enter by again adjusting TSC_OFFSET to "catch up" to what the
guest TSC _should_ be given the guest TSC frequency.
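
To make the offset games above a bit more concrete, here is a simplified
sketch of the two ideas: picking an offset so the guest reads a given value
(e.g. after a guest write to MSR_IA32_TSC), and bumping the offset forward so
an unscaled guest TSC "catches up" to where it should be.  All three function
names are invented for illustration, the monotonic-only adjustment is my
simplification, and KVM's real logic handles many more cases.

```c
#include <stdint.h>

/*
 * Hypothetical: offset that makes the guest read 'wanted' when the host
 * TSC currently reads 'host_tsc'.  Unsigned wraparound is intentional.
 */
static uint64_t offset_for_guest_tsc(uint64_t wanted, uint64_t host_tsc)
{
	return wanted - host_tsc;
}

/*
 * Hypothetical: where the guest TSC should be after 'elapsed_ns' at
 * 'guest_khz', starting from 'base_tsc'.  cycles = ns * kHz / 10^6.
 */
static uint64_t expected_guest_tsc(uint64_t base_tsc, uint64_t elapsed_ns,
				   uint64_t guest_khz)
{
	unsigned __int128 cycles = (unsigned __int128)elapsed_ns * guest_khz / 1000000;

	return base_tsc + (uint64_t)cycles;
}

/*
 * Hypothetical: without hardware scaling the guest sees host_tsc + offset.
 * If that lags the expected value, grow the offset to catch up; never
 * shrink it, so the guest TSC does not go backwards.
 */
static uint64_t catch_up_offset(uint64_t cur_offset, uint64_t host_tsc,
				uint64_t expected)
{
	uint64_t current = host_tsc + cur_offset;

	if (expected > current)
		cur_offset += expected - current;
	return cur_offset;
}
```

E.g. a guest that should run at 2 GHz on a slower host would have its offset
nudged forward before each VM-Enter so that reads land on
expected_guest_tsc().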