[PATCH v2 00/38] x86: Try to wrangle PV clocks vs. TSC

This... snowballed a bit.

The bulk of the changes are in kvmclock and TSC, but pretty much every
hypervisor's guest-side code gets touched at some point.  I am reasonably
confident in the correctness of the KVM changes.  For all other hypervisors,
assume it's completely broken until proven otherwise.

Note, I deliberately omitted:

  Alexey Makhalov <alexey.amakhalov@xxxxxxxxxxxx>
  jailhouse-dev@xxxxxxxxxxxxxxxx

from the To/Cc, as those emails bounced on the last version, and I have zero
desire to get 38*2 emails telling me an email couldn't be delivered.

The primary goal of this series is (or at least was, when I started) to
fix flaws with SNP and TDX guests where a PV clock provided by the untrusted
hypervisor is used instead of the secure/trusted TSC that is controlled by
trusted firmware.

The secondary goal is to draft off of the SNP and TDX changes to slightly
modernize running under KVM.  Currently, KVM guests will use TSC for
clocksource, but not sched_clock.  And they ignore Intel's CPUID-based TSC
and CPU frequency enumeration, even when using the TSC instead of kvmclock.
And if the host provides the core crystal frequency in CPUID.0x15, then KVM
guests can use that for the APIC timer period instead of manually calibrating
the frequency.
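
For readers unfamiliar with the CPUID.0x15 enumeration, here is a minimal
sketch of the arithmetic, not the code from this series.  The struct and
function names are hypothetical, and the APIC helper assumes the APIC timer
ticks at the core crystal frequency.  Per the SDM, EAX is the denominator and
EBX the numerator of the TSC/crystal ratio, and ECX is the crystal frequency
in Hz (0 if not enumerated).

```c
#include <stdint.h>

/* Raw output of CPUID leaf 0x15 (hypothetical struct, for illustration). */
struct cpuid_tsc_info {
	uint32_t denominator;	/* EAX: denominator of TSC/crystal ratio */
	uint32_t numerator;	/* EBX: numerator of TSC/crystal ratio */
	uint32_t crystal_hz;	/* ECX: core crystal frequency in Hz, or 0 */
};

/* TSC frequency in Hz, or 0 if CPUID doesn't fully enumerate it. */
static uint64_t tsc_hz_from_cpuid(const struct cpuid_tsc_info *info)
{
	if (!info->denominator || !info->numerator || !info->crystal_hz)
		return 0;

	return (uint64_t)info->crystal_hz * info->numerator / info->denominator;
}

/*
 * APIC timer ticks per scheduler tick, assuming the APIC timer runs at the
 * core crystal frequency.  'hz' is the scheduler tick rate (assumption).
 * Returns 0 if the crystal frequency isn't enumerated.
 */
static uint64_t apic_timer_period(const struct cpuid_tsc_info *info,
				  unsigned int hz)
{
	if (!info->crystal_hz || !hz)
		return 0;

	return info->crystal_hz / hz;
}
```

E.g. a 24 MHz crystal with a 200/2 ratio yields a 2.4 GHz TSC, and with a
1000 Hz tick the APIC timer period would be 24000 crystal cycles.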

Lots more background on the SNP/TDX motivation:
https://lore.kernel.org/all/20250106124633.1418972-13-nikunj@xxxxxxx

v2:
 - Add struct to hold the TSC CPUID output. [Boris]
 - Don't pointlessly inline the TSC CPUID helpers. [Boris]
 - Fix a variable goof in a helper, hopefully for real this time. [Dan]
 - Collect reviews. [Nikunj]
 - Override the sched_clock save/restore hooks if and only if a PV clock
   is successfully registered.
 - During resume, restore clocksources before reading persistent time.
 - Clean up more warts created by kvmclock.
 - Fix more bugs in kvmclock's suspend/resume handling.
 - Try to harden kvmclock against future bugs.

v1: https://lore.kernel.org/all/20250201021718.699411-1-seanjc@xxxxxxxxxx

Sean Christopherson (38):
  x86/tsc: Add a standalone helpers for getting TSC info from CPUID.0x15
  x86/tsc: Add standalone helper for getting CPU frequency from CPUID
  x86/tsc: Add helper to register CPU and TSC freq calibration routines
  x86/sev: Mark TSC as reliable when configuring Secure TSC
  x86/sev: Move check for SNP Secure TSC support to tsc_early_init()
  x86/tdx: Override PV calibration routines with CPUID-based calibration
  x86/acrn: Mark TSC frequency as known when using ACRN for calibration
  clocksource: hyper-v: Register sched_clock save/restore iff it's
    necessary
  clocksource: hyper-v: Drop wrappers to sched_clock save/restore
    helpers
  clocksource: hyper-v: Don't save/restore TSC offset when using HV
    sched_clock
  x86/kvmclock: Setup kvmclock for secondary CPUs iff CONFIG_SMP=y
  x86/kvm: Don't disable kvmclock on BSP in syscore_suspend()
  x86/paravirt: Move handling of unstable PV clocks into
    paravirt_set_sched_clock()
  x86/kvmclock: Move sched_clock save/restore helpers up in kvmclock.c
  x86/xen/time: Nullify x86_platform's sched_clock save/restore hooks
  x86/vmware: Nullify save/restore hooks when using VMware's sched_clock
  x86/tsc: WARN if TSC sched_clock save/restore used with PV sched_clock
  x86/paravirt: Pass sched_clock save/restore helpers during
    registration
  x86/kvmclock: Move kvm_sched_clock_init() down in kvmclock.c
  x86/xen/time: Mark xen_setup_vsyscall_time_info() as __init
  x86/pvclock: Mark setup helpers and related various as
    __init/__ro_after_init
  x86/pvclock: WARN if pvclock's valid_flags are overwritten
  x86/kvmclock: Refactor handling of PVCLOCK_TSC_STABLE_BIT during
    kvmclock_init()
  timekeeping: Resume clocksources before reading persistent clock
  x86/kvmclock: Hook clocksource.suspend/resume when kvmclock isn't
    sched_clock
  x86/kvmclock: WARN if wall clock is read while kvmclock is suspended
  x86/kvmclock: Enable kvmclock on APs during onlining if kvmclock isn't
    sched_clock
  x86/paravirt: Mark __paravirt_set_sched_clock() as __init
  x86/paravirt: Plumb a return code into __paravirt_set_sched_clock()
  x86/paravirt: Don't use a PV sched_clock in CoCo guests with trusted
    TSC
  x86/tsc: Pass KNOWN_FREQ and RELIABLE as params to registration
  x86/tsc: Rejects attempts to override TSC calibration with lesser
    routine
  x86/kvmclock: Mark TSC as reliable when it's constant and nonstop
  x86/kvmclock: Get CPU base frequency from CPUID when it's available
  x86/kvmclock: Get TSC frequency from CPUID when its available
  x86/kvmclock: Stuff local APIC bus period when core crystal freq comes
    from CPUID
  x86/kvmclock: Use TSC for sched_clock if it's constant and non-stop
  x86/paravirt: kvmclock: Setup kvmclock early iff it's sched_clock

 arch/x86/coco/sev/core.c           |   9 +-
 arch/x86/coco/tdx/tdx.c            |  27 ++-
 arch/x86/include/asm/kvm_para.h    |  10 +-
 arch/x86/include/asm/paravirt.h    |  16 +-
 arch/x86/include/asm/tdx.h         |   2 +
 arch/x86/include/asm/tsc.h         |  20 +++
 arch/x86/include/asm/x86_init.h    |   2 -
 arch/x86/kernel/cpu/acrn.c         |   5 +-
 arch/x86/kernel/cpu/mshyperv.c     |  69 +-------
 arch/x86/kernel/cpu/vmware.c       |  11 +-
 arch/x86/kernel/jailhouse.c        |   6 +-
 arch/x86/kernel/kvm.c              |  39 +++--
 arch/x86/kernel/kvmclock.c         | 260 +++++++++++++++++++++--------
 arch/x86/kernel/paravirt.c         |  35 +++-
 arch/x86/kernel/pvclock.c          |   9 +-
 arch/x86/kernel/smpboot.c          |   2 +-
 arch/x86/kernel/tsc.c              | 141 ++++++++++++----
 arch/x86/kernel/x86_init.c         |   1 -
 arch/x86/mm/mem_encrypt_amd.c      |   3 -
 arch/x86/xen/time.c                |  13 +-
 drivers/clocksource/hyperv_timer.c |  38 +++--
 include/clocksource/hyperv_timer.h |   2 -
 kernel/time/timekeeping.c          |   9 +-
 23 files changed, 487 insertions(+), 242 deletions(-)


base-commit: a64dcfb451e254085a7daee5fe51bf22959d52d3
-- 
2.48.1.711.g2feabab25a-goog




