Hello again,
Just a follow-up with some more info. One of my vendors suggested a
different workaround for the clock issue.
"-machine accel=kvm,kernel_irqchip=off"
Setting "kernel_irqchip=off" seems to resolve the timing issue. I ran
this for roughly 2 hrs on my test hardware and didn't experience the
timing issue (normally it becomes present after 10 mins). However, as
far as I understand, this flag moves some interrupt handling to a
userspace emulator which results in lower overall performance.
This is what I noticed with the different irqchip modes:
Without kernel_irqchip specified: timing issue became present after
10 mins.
With kernel_irqchip=off: no timing issue present, ran for 2 hrs.
With kernel_irqchip=split: timing issue became present after 9
mins.
Best regards,
Andrew
On 1/3/2018 7:19 PM, Paolo Bonzini wrote:
On 04/01/2018 01:01, Andrew Davis wrote:
Hello again,
I'm glad to see that you were able to make sense of those test samples :).
Is that behavior with the preemption timer normal at the KVM level? (I'm
guessing not!).
Indeed, not at all. :) It's probably some kind of rounding error that
gets bigger and bigger, one bit at a time, or an overflow.
Paolo
OpenBSD has timer tick sysctl variables, but they can't be changed
without recompiling the kernel. Here is what the default layout looks
like on OpenBSD 6.1:
# sysctl -a | grep -i clockrate
kern.clockrate=tick = 10000, tickadj = 40, hz = 100, profhz = 100,
stathz = 100
Best regards,
Andrew
On 1/2/2018 7:02 PM, Paolo Bonzini wrote:
On 02/01/2018 20:02, Andrew Davis wrote:
Here's the post-boot/2sec/4sec experiment run again, with "trace-cmd
record -e kvm".
http://45.63.6.248/kvm3/trace-01-boot.tar.gz
http://45.63.6.248/kvm3/trace-02-2sec.tar.gz
http://45.63.6.248/kvm3/trace-03-4sec.tar.gz
http://45.63.6.248/kvm3/env.txt
That's pretty interesting. The normal trace looks like this (one CPU
only and with some traces omitted):
CPU-1135 [014] 440078.978674: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [014] 440078.988645: kvm_vcpu_wakeup:
wait time 9967011 ns, polling valid
CPU-1135 [014] 440078.988660: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [014] 440078.998645: kvm_vcpu_wakeup:
wait time 9980730 ns, polling valid
CPU-1135 [014] 440078.998660: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [014] 440079.008646: kvm_vcpu_wakeup:
wait time 9982416 ns, polling valid
CPU-1135 [014] 440079.008664: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [014] 440079.018644: kvm_vcpu_wakeup:
wait time 9976479 ns, polling valid
CPU-1135 [014] 440079.018659: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [014] 440079.028644: kvm_vcpu_wakeup:
wait time 9981113 ns, polling valid
CPU-1135 [014] 440079.028659: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [014] 440079.038648: kvm_vcpu_wakeup:
wait time 9985185 ns, polling valid
CPU-1135 [014] 440079.038663: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [014] 440079.048646: kvm_vcpu_wakeup:
wait time 9979267 ns, polling valid
Then, it looks like this:
CPU-1135 [027] 440530.308689: kvm_vcpu_wakeup:
wait time 19968272 ns, polling valid
CPU-1135 [027] 440530.308696: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [027] 440530.308704: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [027] 440530.328690: kvm_vcpu_wakeup:
wait time 19984070 ns, polling valid
CPU-1135 [027] 440530.328698: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [027] 440530.328705: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [027] 440530.348691: kvm_vcpu_wakeup:
wait time 19982890 ns, polling valid
CPU-1135 [027] 440530.348699: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [027] 440530.348707: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [027] 440530.368691: kvm_vcpu_wakeup:
wait time 19981827 ns, polling valid
CPU-1135 [027] 440530.368699: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [027] 440530.368707: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [027] 440530.388692: kvm_vcpu_wakeup:
wait time 19981959 ns, polling valid
And then it looks like this:
CPU-1135 [014] 441652.963757: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [014] 441652.963764: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [014] 441652.963768: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [014] 441652.963775: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [014] 441653.003745: kvm_vcpu_wakeup:
wait time 39964879 ns, polling valid
CPU-1135 [014] 441653.003758: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [014] 441653.003765: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [014] 441653.003772: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [014] 441653.003783: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [014] 441653.043745: kvm_vcpu_wakeup:
wait time 39957085 ns, polling valid
CPU-1135 [014] 441653.043765: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [014] 441653.043770: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [014] 441653.043774: kvm_exit:
reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
CPU-1135 [014] 441653.043783: kvm_exit:
reason HLT rip 0xffffffff816a0b6b info 0 0
CPU-1135 [014] 441653.083734: kvm_vcpu_wakeup:
wait time 39945744 ns, polling valid
So the preemption timer is set to a very small value after some time.
This causes
the deadline to be advanced, and the next HLT to wake up after twice
the correct
deadline.
Then the premption timer is set to a very small value for three times
in a row,
each of which advances the deadline; the next HLT then wakes up after
four times
the regular time.
This validates your observation that disabling the preemption timer
works around
the bug. I'll take a look and try to create a test case. Out of
curiosity, can
the OpenBSD timer tick can be adjusted? If so, does the bug trigger
faster if
you increase the frequency (or slower if you decrease it)?
Thanks,
Paolo