On Mon, 2023-11-13 at 20:35 -0800, isaku.yamahata@xxxxxxxxx wrote: > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx> > > Add KVM_CAP_X86_BUS_FREQUENCY_CONTROL capability to configure the core > crystal clock (or processor's bus clock) for APIC timer emulation. Allow > KVM_ENABLE_CAPABILITY(KVM_CAP_X86_BUS_FREQUENCY_CONTROL) to set the > frequency. > > TDX virtualizes CPUID[0x15] for the core crystal clock to be 25MHz. The > x86 KVM hardcodes its frequency for APIC timer to be 1GHz. This mismatch > causes the vAPIC timer to fire earlier than the guest expects. [1] The KVM > APIC timer emulation uses hrtimer, whose unit is nanosecond. Make the > parameter configurable for conversion from the TMICT value to nanosecond. > > This patch doesn't affect the TSC deadline timer emulation. The TSC > deadline emulation path records its expiring TSC value and calculates the > expiring time in nanoseconds. The APIC timer emulation path calculates the > TSC value from the TMICT register value and uses the TSC deadline timer > path. This patch touches the APIC timer-specific code but doesn't touch > common logic. Nitpick: To be honest IMHO the paragraph about tsc deadline is redundant, because by definition (x86 spec) the tsc deadline timer doesn't use APIC bus frequency. Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx> Best regards, Maxim Levitsky > > [1] https://lore.kernel.org/lkml/20231006011255.4163884-1-vannapurve@xxxxxxxxxx/ > Reported-by: Vishal Annapurve <vannapurve@xxxxxxxxxx> > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx> > --- > Changes v2: > - Add check if vcpu isn't created. > - Add check if lapic chip is in-kernel emulation. > - Fix build error for i386 > - Add document to api.rst > - typo in the commit message > --- > Documentation/virt/kvm/api.rst | 14 ++++++++++++++ > arch/x86/kvm/x86.c | 35 ++++++++++++++++++++++++++++++++++ > include/uapi/linux/kvm.h | 1 + > 3 files changed, 50 insertions(+) > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst > index 7025b3751027..cc976df2651e 100644 > --- a/Documentation/virt/kvm/api.rst > +++ b/Documentation/virt/kvm/api.rst > @@ -7858,6 +7858,20 @@ This capability is aimed to mitigate the threat that malicious VMs can > cause CPU stuck (due to event windows don't open up) and make the CPU > unavailable to host or other VMs. > > +7.34 KVM_CAP_X86_BUS_FREQUENCY_CONTROL > +-------------------------------------- > + > +:Architectures: x86 > +:Target: VM > +:Parameters: args[0] is the value of apic bus clock frequency > +:Returns: 0 on success, -EINVAL if args[0] contains invalid value for the > + frequency, or -ENXIO if virtual local APIC isn't enabled by > + KVM_CREATE_IRQCHIP, or -EBUSY if any vcpu is created. > + > +This capability sets the APIC bus clock frequency (or core crystal clock > +frequency) for kvm to emulate APIC in the kernel. The default value is 1000000 > +(1GHz). > + > 8. Other capabilities. > ====================== > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index a9f4991b3e2e..a8fb862c4f8e 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -4625,6 +4625,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) > case KVM_CAP_ENABLE_CAP: > case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES: > case KVM_CAP_IRQFD_RESAMPLE: > + case KVM_CAP_X86_BUS_FREQUENCY_CONTROL: > r = 1; > break; > case KVM_CAP_EXIT_HYPERCALL: > @@ -6616,6 +6617,40 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, > } > mutex_unlock(&kvm->lock); > break; > + case KVM_CAP_X86_BUS_FREQUENCY_CONTROL: { > + u64 bus_frequency = cap->args[0]; > + u64 bus_cycle_ns; > + > + if (!bus_frequency) > + return -EINVAL; > + /* CPUID[0x15] only support 32bits. */ > + if (bus_frequency != (u32)bus_frequency) > + return -EINVAL; > + > + /* Cast to avoid 64bit division on 32bit platform. */ > + bus_cycle_ns = 1000000000UL / (u32)bus_frequency; > + if (!bus_cycle_ns) > + return -EINVAL; > + > + r = 0; > + mutex_lock(&kvm->lock); > + /* > + * Don't allow to change the frequency dynamically during vcpu > + * running to avoid potentially bizarre behavior. > + */ > + if (kvm->created_vcpus) > + r = -EBUSY; > + /* This is for in-kernel vAPIC emulation. */ > + else if (!irqchip_in_kernel(kvm)) > + r = -ENXIO; > + > + if (!r) { > + kvm->arch.apic_bus_cycle_ns = bus_cycle_ns; > + kvm->arch.apic_bus_frequency = bus_frequency; > + } > + mutex_unlock(&kvm->lock); > + return r; > + } > default: > r = -EINVAL; > break; > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > index 211b86de35ac..d74a057df173 100644 > --- a/include/uapi/linux/kvm.h > +++ b/include/uapi/linux/kvm.h > @@ -1201,6 +1201,7 @@ struct kvm_ppc_resize_hpt { > #define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 228 > #define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229 > #define KVM_CAP_ARM_SUPPORTED_REG_MASK_RANGES 230 > +#define KVM_CAP_X86_BUS_FREQUENCY_CONTROL 231 > > #ifdef KVM_CAP_IRQ_ROUTING >