Re: Timing issue with QEMU/KVM and OpenBSD

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello again,

Just a follow-up with some more info. One of my vendors suggested a different workaround for the clock issue.

"-machine accel=kvm,kernel_irqchip=off"

Setting "kernel_irqchip=off" seems to resolve the timing issue. I ran this for roughly 2 hrs on my test hardware and didn't experience the timing issue (normally it becomes present after 10 mins). However, as far as I understand, this flag moves some interrupt handling to a userspace emulator which results in lower overall performance.

This is what I noticed with the different irqchip modes:

Without kernel_irqchip specified:    timing issue became present after 10 mins.
With kernel_irqchip=off:             no timing issue present, ran for 2 hrs.
With kernel_irqchip=split:           timing issue became present after 9 mins.

Best regards,
Andrew


On 1/3/2018 7:19 PM, Paolo Bonzini wrote:
On 04/01/2018 01:01, Andrew Davis wrote:
Hello again,

I'm glad to see that you were able to make sense of those test samples :).

Is that behavior with the preemption timer normal at the KVM level? (I'm
guessing not!).
Indeed, not at all. :)  It's probably some kind of rounding error that
gets bigger and bigger, one bit at a time, or an overflow.

Paolo

OpenBSD has timer tick sysctl variables, but they can't be changed
without recompiling the kernel. Here is what the default layout looks
like on OpenBSD 6.1:

     # sysctl -a | grep -i clockrate
     kern.clockrate=tick = 10000, tickadj = 40, hz = 100, profhz = 100,
stathz = 100

Best regards,
Andrew


On 1/2/2018 7:02 PM, Paolo Bonzini wrote:
On 02/01/2018 20:02, Andrew Davis wrote:
Here's the post-boot/2sec/4sec experiment run again, with "trace-cmd
record -e kvm".

http://45.63.6.248/kvm3/trace-01-boot.tar.gz
http://45.63.6.248/kvm3/trace-02-2sec.tar.gz
http://45.63.6.248/kvm3/trace-03-4sec.tar.gz
http://45.63.6.248/kvm3/env.txt
That's pretty interesting.  The normal trace looks like this (one CPU
only and with some traces omitted):

               CPU-1135  [014] 440078.978674: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [014] 440078.988645: kvm_vcpu_wakeup:
wait time 9967011 ns, polling valid
               CPU-1135  [014] 440078.988660: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [014] 440078.998645: kvm_vcpu_wakeup:
wait time 9980730 ns, polling valid
               CPU-1135  [014] 440078.998660: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [014] 440079.008646: kvm_vcpu_wakeup:
wait time 9982416 ns, polling valid
               CPU-1135  [014] 440079.008664: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [014] 440079.018644: kvm_vcpu_wakeup:
wait time 9976479 ns, polling valid
               CPU-1135  [014] 440079.018659: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [014] 440079.028644: kvm_vcpu_wakeup:
wait time 9981113 ns, polling valid
               CPU-1135  [014] 440079.028659: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [014] 440079.038648: kvm_vcpu_wakeup:
wait time 9985185 ns, polling valid
               CPU-1135  [014] 440079.038663: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [014] 440079.048646: kvm_vcpu_wakeup:
wait time 9979267 ns, polling valid

Then, it looks like this:

               CPU-1135  [027] 440530.308689: kvm_vcpu_wakeup:
wait time 19968272 ns, polling valid
               CPU-1135  [027] 440530.308696: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [027] 440530.308704: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [027] 440530.328690: kvm_vcpu_wakeup:
wait time 19984070 ns, polling valid
               CPU-1135  [027] 440530.328698: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [027] 440530.328705: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [027] 440530.348691: kvm_vcpu_wakeup:
wait time 19982890 ns, polling valid
               CPU-1135  [027] 440530.348699: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [027] 440530.348707: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [027] 440530.368691: kvm_vcpu_wakeup:
wait time 19981827 ns, polling valid
               CPU-1135  [027] 440530.368699: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [027] 440530.368707: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [027] 440530.388692: kvm_vcpu_wakeup:
wait time 19981959 ns, polling valid

And then it looks like this:

               CPU-1135  [014] 441652.963757: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [014] 441652.963764: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [014] 441652.963768: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [014] 441652.963775: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [014] 441653.003745: kvm_vcpu_wakeup:
wait time 39964879 ns, polling valid
               CPU-1135  [014] 441653.003758: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [014] 441653.003765: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [014] 441653.003772: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [014] 441653.003783: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [014] 441653.043745: kvm_vcpu_wakeup:
wait time 39957085 ns, polling valid
               CPU-1135  [014] 441653.043765: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [014] 441653.043770: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [014] 441653.043774: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
               CPU-1135  [014] 441653.043783: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
               CPU-1135  [014] 441653.083734: kvm_vcpu_wakeup:
wait time 39945744 ns, polling valid

So the preemption timer is set to a very small value after some time.
This causes
the deadline to be advanced, and the next HLT to wake up after twice
the correct
deadline.

Then the premption timer is set to a very small value for three times
in a row,
each of which advances the deadline; the next HLT then wakes up after
four times
the regular time.

This validates your observation that disabling the preemption timer
works around
the bug.  I'll take a look and try to create a test case.  Out of
curiosity, can
the OpenBSD timer tick can be adjusted?  If so, does the bug trigger
faster if
you increase the frequency (or slower if you decrease it)?


Thanks,

Paolo




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux