On Thu, Dec 11, 2014 at 07:27:17PM -0200, Marcelo Tosatti wrote:
> On Thu, Dec 11, 2014 at 01:16:52PM -0800, Andy Lutomirski wrote:
> > On Thu, Dec 11, 2014 at 1:10 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> > >
> > >
> > > On 11/12/2014 21:48, Andy Lutomirski wrote:
> > >> On 12/10/2014 07:07 PM, Marcelo Tosatti wrote:
> > >>> On Thu, Dec 11, 2014 at 12:37:57AM +0100, Paolo Bonzini wrote:
> > >>>>
> > >>>>
> > >>>> On 10/12/2014 21:57, Marcelo Tosatti wrote:
> > >>>>> For the hrtimer which emulates the tscdeadline timer in the guest,
> > >>>>> add an option to advance expiration, and busy spin on VM-entry waiting
> > >>>>> for the actual expiration time to elapse.
> > >>>>>
> > >>>>> This allows achieving low latencies in cyclictest (or any scenario
> > >>>>> which requires strict timing regarding timer expiration).
> > >>>>>
> > >>>>> Reduces cyclictest avg latency by 50%.
> > >>>>>
> > >>>>> Note: this option requires tuning to find the appropriate value
> > >>>>> for a particular hardware/guest combination. One method is to measure the
> > >>>>> average delay between apic_timer_fn and VM-entry.
> > >>>>> Another method is to start with 1000ns, and increase the value
> > >>>>> in say 500ns increments until avg cyclictest numbers stop decreasing.
> > >>>>
> > >>>> What values are you using in practice for the parameter?
> > >>>
> > >>> 7us.
> > >>
> > >> It takes 7us to get from TSC deadline expiration to the *start* of
> > >> vmresume? That seems rather extreme.
> > >
> > > No, to the end. 7us is 21000 clock cycles, and the vmexit+vmentry alone
> > > costs about 1300.
> > >
> >
> > I suspect that something's massively wrong with context switching,
> > then -- it deserves to be considerably faster than that. The
> > architecturally expensive bits are vmresume, interrupt delivery, and
> > iret, but iret is only ~300 cycles and interrupt delivery should be
> > under 1k cycles.
> >
> > Throw in a few hundred more cycles for whatever wrmsr idiocy is going
> > on somewhere in the process, and we're still nowhere near 21k cycles.
>
>
> <idle>-0 [003] d..h2.. 1991756745496752: apic_timer_fn <-__run_hrtimer
> <idle>-0 [003] dN.h2.. 1991756745498732: tick_program_event <-hrtimer_interrupt
> <idle>-0 [003] d...3.. 1991756745502112: sched_switch: prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=qemu-system-x86 next_pid=20114 next_prio=98
> <idle>-0 [003] d...2.. 1991756745502592: __context_tracking_task_switch <-__schedule
> qemu-system-x86-20114 [003] ....1.. 1991756745503916: kvm_arch_vcpu_load <-kvm_sched_in
> qemu-system-x86-20114 [003] ....... 1991756745505320: kvm_cpu_has_pending_timer <-kvm_vcpu_block
> qemu-system-x86-20114 [003] ....... 1991756745506260: kvm_cpu_has_pending_timer <-kvm_arch_vcpu_ioctl_run
> qemu-system-x86-20114 [003] ....... 1991756745507812: kvm_apic_accept_events <-kvm_arch_vcpu_ioctl_run
> qemu-system-x86-20114 [003] ....... 1991756745508100: kvm_cpu_has_pending_timer <-kvm_arch_vcpu_ioctl_run
> qemu-system-x86-20114 [003] ....... 1991756745508872: kvm_apic_accept_events <-vcpu_enter_guest
> qemu-system-x86-20114 [003] ....1.. 1991756745510040: vmx_save_host_state <-vcpu_enter_guest
> qemu-system-x86-20114 [003] d...2.. 1991756745511876: kvm_entry: vcpu 1
>
>
> 1991756745511876 - 1991756745496752 = 15124
>
> The timestamps are TSC reads.
>
> This is patched to run without ksoftirqd. Consider:
>
> The LAPIC is programmed to the next earliest event by hrtimer_interrupt.
> VM-entry is processing KVM_REQ_DEACTIVATE_FPU, KVM_REQ_EVENT.
>

model           : 58
model name      : Intel(R) Core(TM) i5-3470S CPU @ 2.90GHz
stepping        : 9
microcode       : 0x1b
cpu MHz         : 2873.492
cache size      : 6144 KB
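
For anyone who wants to redo the arithmetic on the trace above, here is a minimal Python sketch that recomputes the total and the per-step TSC deltas and converts them to microseconds. The timestamps are copied from the trace. Treating the TSC as ticking at the reported "cpu MHz" value (2873.492) is an assumption, since the actual TSC rate may differ slightly. The names `TRACE`, `TSC_MHZ`, `cycles_to_us` and `step_deltas` are made up for illustration.

```python
# (event, TSC timestamp) pairs copied from the trace above.
TRACE = [
    ("apic_timer_fn", 1991756745496752),
    ("tick_program_event", 1991756745498732),
    ("sched_switch", 1991756745502112),
    ("__context_tracking_task_switch", 1991756745502592),
    ("kvm_arch_vcpu_load", 1991756745503916),
    ("kvm_cpu_has_pending_timer", 1991756745505320),
    ("kvm_cpu_has_pending_timer", 1991756745506260),
    ("kvm_apic_accept_events", 1991756745507812),
    ("kvm_cpu_has_pending_timer", 1991756745508100),
    ("kvm_apic_accept_events", 1991756745508872),
    ("vmx_save_host_state", 1991756745510040),
    ("kvm_entry", 1991756745511876),
]

# Assumption: the TSC runs at the "cpu MHz" value from /proc/cpuinfo.
TSC_MHZ = 2873.492


def cycles_to_us(cycles, mhz=TSC_MHZ):
    """Convert a TSC cycle count to microseconds (MHz == cycles per us)."""
    return cycles / mhz


def step_deltas(trace):
    """Return (event, cycles elapsed since the previous event) pairs."""
    return [(cur[0], cur[1] - prev[1]) for prev, cur in zip(trace, trace[1:])]


# Cycles from apic_timer_fn to kvm_entry.
total = TRACE[-1][1] - TRACE[0][1]

for event, cycles in step_deltas(TRACE):
    print(f"{cycles:6d} cycles  {cycles_to_us(cycles):6.3f} us  -> {event}")
print(f"{total:6d} cycles  {cycles_to_us(total):6.3f} us  total")
```

Under that clock-rate assumption the 15124 cycles come to a bit over 5 us for this single sample, and the largest individual gap is the 3380 cycles between tick_program_event and sched_switch.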