2017-10-05 18:54-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>
> The description in the Intel SDM of how the divide configuration
> register is used: "The APIC timer frequency will be the processor's bus
> clock or core crystal clock frequency divided by the value specified in
> the divide configuration register."
>
> Observation of baremetal shown that when the TDCR is change, the TMCCT
> does not change or make a big jump in value, but the rate at which it
> count down change.
>
> The patch update the emulation to APIC timer to so that a change to the
> divide configuration would be reflected in the value of the counter and
> when the next interrupt is triggered.
>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
> ---
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> @@ -1458,6 +1458,36 @@ static void start_sw_period(struct kvm_lapic *apic)
>  		HRTIMER_MODE_ABS_PINNED);
>  }
>
> +static bool update_target_expiration(struct kvm_lapic *apic, uint32_t old_divisor)
> +{
> +	ktime_t now, remaining;
> +	u64 tscl = rdtsc(), delta;
> +
> +	now = ktime_get();
> +	remaining = ktime_sub(apic->lapic_timer.target_expiration, now);
> +	if (ktime_to_ns(remaining) < 0)
> +		remaining = 0;
> +	delta = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period);

Hm, can this happen?

> +	if (!delta)
> +		return false;
> +
> +	apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
> +		* APIC_BUS_CYCLE_NS * apic->divide_count;

I think that it would be safer to always modify the period.

> +	delta = delta * apic->divide_count / old_divisor;
> +
> +	if (!apic->lapic_timer.period)
> +		return false;
> +
> +	limit_periodic_timer_frequency(apic);
> +
> +	apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
> +		nsec_to_cycles(apic->vcpu, delta);

We could do that without rdtsc() for added precision and maybe
performance:

  apic->lapic_timer.tscdeadline +=
  	nsec_to_cycles(apic->vcpu, delta) -
  	nsec_to_cycles(apic->vcpu, remaining);
  // not sure how a negative operand would behave:
  // nsec_to_cycles(apic->vcpu, delta - remaining)

> +	apic->lapic_timer.target_expiration = ktime_add_ns(now, delta);
> +
> +	return true;
> +}
> +
>  static bool set_target_expiration(struct kvm_lapic *apic)
>  {
>  	ktime_t now;
> @@ -1750,13 +1780,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>  		start_apic_timer(apic);
>  		break;
>
> -	case APIC_TDCR:
> +	case APIC_TDCR: {
> +		uint32_t old_divisor = apic->divide_count;
> +
>  		if (val & 4)
>  			apic_debug("KVM_WRITE:TDCR %x\n", val);
>  		kvm_lapic_set_reg(apic, APIC_TDCR, val);
>  		update_divide_count(apic);
> +		if (apic->divide_count != old_divisor) {
> +			hrtimer_cancel(&apic->lapic_timer.timer);
> +			if (update_target_expiration(apic, old_divisor))
> +				restart_apic_timer(apic);

I think we can lose a timer here when we cancel a hrtimer whose
expiration time passes before update_target_expiration(), so it never
gets restarted.
Doing restart_apic_timer() unconditionally seems better. It behaves
well if we try to restart a timer that has already fired.

Thanks.

> +		}
>  		break;
> -
> +	}
>  	case APIC_ESR:
>  		if (apic_x2apic_mode(apic) && val != 0) {
>  			apic_debug("KVM_WRITE:ESR not zero %x\n", val);
> --
> 2.7.4
>
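
To make the arithmetic discussed above concrete, here is a minimal
userspace sketch of the divisor rescaling and of the rdtsc()-free
deadline adjustment. It is not the kernel code: BUS_CYCLE_NS is an
assumed placeholder for APIC_BUS_CYCLE_NS, ns_to_cycles() is a
hypothetical simplified model of nsec_to_cycles() (plain ns * kHz /
10^6, no TSC scaling), and all function names are made up for
illustration. The two conversions are done on unsigned values and only
their difference is applied, which sidesteps the question of passing a
negative operand to the conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed placeholder for the kernel's APIC_BUS_CYCLE_NS. */
#define BUS_CYCLE_NS 1ULL

/* Timer period in ns for a given initial count (TMICT) and divisor. */
static uint64_t timer_period_ns(uint32_t initial_count, uint32_t divisor)
{
	return (uint64_t)initial_count * BUS_CYCLE_NS * divisor;
}

/*
 * The number of ticks left in the counter stays the same across a TDCR
 * write; only the length of each tick changes. So the time left scales
 * by new_divisor / old_divisor.
 */
static uint64_t rescale_remaining_ns(uint64_t remaining_ns,
				     uint32_t old_divisor,
				     uint32_t new_divisor)
{
	assert(old_divisor != 0);
	return remaining_ns * new_divisor / old_divisor;
}

/* Hypothetical model of nsec_to_cycles(): ns * tsc_khz / 10^6. */
static uint64_t ns_to_cycles(uint64_t ns, uint64_t tsc_khz)
{
	return ns * tsc_khz / 1000000ULL;
}

/*
 * Shift an existing cycle deadline by (new remaining - old remaining)
 * without reading the TSC. Each operand is converted on its own, so the
 * conversion never sees a negative value; unsigned wraparound makes the
 * subtraction correct when the deadline moves earlier.
 */
static uint64_t shift_deadline_cycles(uint64_t deadline,
				      uint64_t old_remaining_ns,
				      uint64_t new_remaining_ns,
				      uint64_t tsc_khz)
{
	return deadline + ns_to_cycles(new_remaining_ns, tsc_khz)
			- ns_to_cycles(old_remaining_ns, tsc_khz);
}
```

For example, with 500 ns left under divisor 2, switching to divisor 8
leaves 2000 ns; on an assumed 2 GHz TSC that moves a deadline of 10000
cycles to 13000, and switching back moves it to 10000 again.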