On Tue, 24 May 2016 16:11:29 -0700, David Matlack <dmatlack@xxxxxxxxxx> wrote:

> On Tue, May 24, 2016 at 3:27 PM, Yunhong Jiang
> <yunhong.jiang@xxxxxxxxxxxxxxx> wrote:
> > From: Yunhong Jiang <yunhong.jiang@xxxxxxxxx>
> >
> > Utilize the VMX preemption timer for TSC deadline timer
> > virtualization. The VMX preemption timer is armed while the vCPU is
> > running, and a VM exit happens when the virtual TSC deadline timer
> > expires.
> >
> > When the vCPU thread is scheduled out, TSC deadline timer
> > virtualization switches back to the current solution, i.e. an
> > hrtimer, and switches to the VMX preemption timer again when the
> > vCPU thread is scheduled in.
> >
> > This solution avoids the complexity of the OS's hrtimer system and
> > the cost of host timer interrupt handling, at the price of a
> > preemption-timer VM exit. It fits well for NFV-style scenarios
> > where the vCPU is bound to an isolated pCPU, or similar setups.
> >
> > However, it may have an impact if the vCPU thread is scheduled
> > in/out very frequently, because it then switches to/from the
> > hrtimer emulation a lot.
>
> What is the cost of the extra sched-in/out hooks?

The cost comes from the fact that on each sched-in/out,
kvm_sched_in()/kvm_sched_out() have to switch between the software and
hypervisor timers: calling hrtimer_start()/hrtimer_cancel() to
reprogram the hrtimer, and updating the VMCS to arm/disarm the VMX
preemption timer. I expect the hrtimer_start()/hrtimer_cancel() calls
to be the costly part, although I have no data at hand. (A standalone
model of this hand-off is sketched at the end of this mail.) Or do you
mean I should provide measurements? Do you have a suggestion for a
workload to compare the difference? Note that the hooks should not
impact the guest itself; rather, they lengthen the scheduling path and
thus affect host system performance.

> > +void kvm_lapic_arm_hv_timer(struct kvm_vcpu *vcpu)
> > +{
> > +        struct kvm_lapic *apic = vcpu->arch.apic;
> > +        u64 tscdeadline, guest_tsc;
> > +
> > +        if (apic->lapic_timer.hv_timer_state == HV_TIMER_NOT_USED)
> > +                return;
> > +
> > +        tscdeadline = apic->lapic_timer.tscdeadline;
> > +        guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
> > +
> > +        if (tscdeadline >= guest_tsc)
> > +                kvm_x86_ops->set_hv_timer(vcpu, tscdeadline - guest_tsc);
>
> Does this interact correctly with TSC scaling? IIUC this programs the
> VMX preemption timer with a delay in guest cycles, rather than host
> cycles.

Aha, good point, thanks! But I re-checked lapic.c, and it seems this
issue has never been considered there either; see
http://lxr.free-electrons.com/source/arch/x86/kvm/lapic.c#L1335 for an
example. Also, I tried to find a function that translates a guest TSC
value into a host TSC value and could not find one at all. Did I miss
anything? (A sketch of the conversion I would expect is also appended
at the end of this mail.)

Thanks
--jyh
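
Below are the two standalone sketches referenced above.

The first models the sched-in/out hand-off between the hrtimer
emulation and the VMX preemption timer. It is only a userspace model:
vcpu_model, sched_in() and sched_out() are illustrative stand-ins for
kvm_sched_in()/kvm_sched_out() and the hrtimer/VMCS operations, not
code from the patch.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-in for the timer state hanging off a vCPU. */
struct vcpu_model {
        uint64_t tscdeadline;   /* guest TSC deadline to honor */
        bool hv_timer_armed;    /* VMX preemption timer armed in the VMCS */
        bool sw_timer_armed;    /* host hrtimer armed */
};

/* sched-in: stop the hrtimer emulation, arm the preemption timer. */
static void sched_in(struct vcpu_model *v)
{
        if (v->sw_timer_armed) {
                v->sw_timer_armed = false;      /* stands in for hrtimer_cancel() */
                printf("sched-in:  hrtimer cancelled\n");
        }
        v->hv_timer_armed = true;               /* stands in for the VMCS write */
        printf("sched-in:  preemption timer armed, deadline %llu\n",
               (unsigned long long)v->tscdeadline);
}

/* sched-out: disarm the preemption timer, fall back to the hrtimer. */
static void sched_out(struct vcpu_model *v)
{
        if (v->hv_timer_armed) {
                v->hv_timer_armed = false;      /* stands in for clearing the VMCS field */
                printf("sched-out: preemption timer disarmed\n");
        }
        v->sw_timer_armed = true;               /* stands in for hrtimer_start() */
        printf("sched-out: hrtimer re-armed\n");
}

int main(void)
{
        struct vcpu_model v = { .tscdeadline = 123456789ULL };

        /* A vCPU preempted N times pays N of these switch pairs; this
         * is the overhead in question when scheduling is frequent. */
        sched_out(&v);
        sched_in(&v);
        sched_out(&v);
        sched_in(&v);
        return 0;
}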
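
The second sketches the guest-cycle to host-cycle conversion that TSC
scaling seems to require before set_hv_timer() programs the preemption
timer. The fixed-point layout follows my reading of
vcpu->arch.tsc_scaling_ratio (48 fractional bits on Intel), but
guest_to_host_cycles() is a made-up helper, not an existing kernel
function.

#include <stdint.h>
#include <stdio.h>

/* Same fixed-point layout as the Intel TSC multiplier: 48 fractional
 * bits, so ratio == 1ULL << 48 means "no scaling". */
#define TSC_RATIO_FRAC_BITS 48

/*
 * The guest TSC advances at (ratio / 2^48) times the host rate, so a
 * delay measured in guest cycles must be converted before it is
 * programmed into the preemption timer:
 *
 *   host_cycles = guest_cycles * 2^48 / ratio
 */
static uint64_t guest_to_host_cycles(uint64_t guest_cycles, uint64_t ratio)
{
        /* 128-bit intermediate so the shift cannot overflow. */
        unsigned __int128 tmp = (unsigned __int128)guest_cycles << TSC_RATIO_FRAC_BITS;

        return (uint64_t)(tmp / ratio);
}

int main(void)
{
        uint64_t delta_guest = 1000000;                 /* guest cycles until the deadline */
        uint64_t ratio = 2ULL << TSC_RATIO_FRAC_BITS;   /* guest TSC runs at 2x the host TSC */

        /* 1,000,000 guest cycles at a 2x ratio is 500,000 host cycles. */
        printf("host cycles to program: %llu\n",
               (unsigned long long)guest_to_host_cycles(delta_guest, ratio));
        return 0;
}

If a helper like this already exists in the tree, I would be happy to
use it instead.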