----- sean.j.christopherson@xxxxxxxxx wrote:

> A VMX preemption timer value of '0' at the time of VMEnter is
> architecturally guaranteed to cause a VMExit prior to the CPU
> executing any instructions in the guest.  This architectural
> definition is in place to ensure that a previously expired timer
> is correctly recognized by the CPU as it is possible for the timer
> to reach zero and not trigger a VMExit due to a higher priority
> VMExit being signalled instead, e.g. a pending #DB that morphs into
> a VMExit.
>
> Whether by design or coincidence, commit f4124500c2c1 ("KVM: nVMX:
> Fully emulate preemption timer") special cased timer values of '0'
> and '1' to ensure prompt delivery of the VMExit.  Unlike '0', a
> timer value of '1' has no architectural guarantees regarding
> when it is delivered.
>
> Modify the timer emulation to trigger immediate VMExit if and only
> if the timer value is '0', and document precisely why '0' is special.
> Do this even if calibration of the virtual TSC failed, i.e. VMExit
> will occur immediately regardless of the frequency of the timer.
> Making only '0' a special case gives KVM leeway to be more aggressive
> in ensuring the VMExit is injected prior to executing instructions in
> the nested guest, and also eliminates any ambiguity as to why '1' is
> a special case, e.g. why wasn't the threshold for a "short timeout"
> set to 10, 100, 1000, etc...
>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> ---
>  arch/x86/kvm/vmx.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 8dae47e7267a..04afaaeb27a7 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -11430,16 +11430,18 @@ static void vmx_start_preemption_timer(struct kvm_vcpu *vcpu)
>  	u64 preemption_timeout = get_vmcs12(vcpu)->vmx_preemption_timer_value;
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>
> -	if (vcpu->arch.virtual_tsc_khz == 0)
> -		return;
> -
> -	/* Make sure short timeouts reliably trigger an immediate vmexit.
> -	 * hrtimer_start does not guarantee this. */
> -	if (preemption_timeout <= 1) {
> +	/*
> +	 * A timer value of zero is architecturally guaranteed to cause
> +	 * a VMExit prior to executing any instructions in the guest.
> +	 */
> +	if (preemption_timeout == 0) {
>  		vmx_preemption_timer_fn(&vmx->nested.preemption_timer);
>  		return;
>  	}
>
> +	if (vcpu->arch.virtual_tsc_khz == 0)
> +		return;
> +
>  	preemption_timeout <<= VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE;
>  	preemption_timeout *= 1000000;
>  	do_div(preemption_timeout, vcpu->arch.virtual_tsc_khz);
> --
> 2.18.0

Reviewed-by: Liran Alon <liran.alon@xxxxxxxxxx>
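
For readers following the reordering, the control flow after this patch can be sketched in user space roughly as below. This is an illustration only, not the KVM code: the function name, the enum, and the 32-bit parameter type are hypothetical, and the rate shift of 5 is an assumed stand-in for VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE. The point it shows is that the zero check now comes before the virtual_tsc_khz check, so a zero timer value expires immediately even when TSC calibration failed, while a value of '1' goes through the normal conversion to nanoseconds.

```c
#include <stdint.h>

/* Assumed stand-in for VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE. */
#define EMULATED_TIMER_RATE_SHIFT 5

/* Hypothetical outcome of starting the emulated preemption timer. */
enum timer_action {
	TIMER_EXPIRE_NOW,	/* value 0: exit before any guest instruction */
	TIMER_NOT_ARMED,	/* virtual TSC frequency unknown, nothing scheduled */
	TIMER_ARM,		/* schedule expiry *delay_ns nanoseconds from now */
};

/*
 * Mirror the ordering of checks after the patch:
 *   1. a timer value of zero always expires immediately;
 *   2. otherwise a zero virtual_tsc_khz means no timer can be armed;
 *   3. otherwise convert timer ticks to nanoseconds.
 */
static enum timer_action classify_preemption_timer(uint32_t timer_value,
						   uint32_t virtual_tsc_khz,
						   uint64_t *delay_ns)
{
	uint64_t tsc_ticks;

	*delay_ns = 0;

	if (timer_value == 0)
		return TIMER_EXPIRE_NOW;

	if (virtual_tsc_khz == 0)
		return TIMER_NOT_ARMED;

	/* Each timer tick spans 2^shift TSC ticks. */
	tsc_ticks = (uint64_t)timer_value << EMULATED_TIMER_RATE_SHIFT;

	/* ticks / (kHz * 1000) seconds == ticks * 1000000 / kHz nanoseconds */
	*delay_ns = tsc_ticks * 1000000ULL / virtual_tsc_khz;
	return TIMER_ARM;
}
```

With these assumptions, a timer value of '1' on a 1 GHz virtual TSC (virtual_tsc_khz of 1000000) yields a 32 ns delay rather than an immediate exit, which is the behavioural change relative to the old '<= 1' check.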