On 02/01/2018 20:02, Andrew Davis wrote: > > Here's the post-boot/2sec/4sec experiment run again, with "trace-cmd > record -e kvm". > > http://45.63.6.248/kvm3/trace-01-boot.tar.gz > http://45.63.6.248/kvm3/trace-02-2sec.tar.gz > http://45.63.6.248/kvm3/trace-03-4sec.tar.gz > http://45.63.6.248/kvm3/env.txt That's pretty interesting. The normal trace looks like this (one CPU only and with some traces omitted): CPU-1135 [014] 440078.978674: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [014] 440078.988645: kvm_vcpu_wakeup: wait time 9967011 ns, polling valid CPU-1135 [014] 440078.988660: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [014] 440078.998645: kvm_vcpu_wakeup: wait time 9980730 ns, polling valid CPU-1135 [014] 440078.998660: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [014] 440079.008646: kvm_vcpu_wakeup: wait time 9982416 ns, polling valid CPU-1135 [014] 440079.008664: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [014] 440079.018644: kvm_vcpu_wakeup: wait time 9976479 ns, polling valid CPU-1135 [014] 440079.018659: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [014] 440079.028644: kvm_vcpu_wakeup: wait time 9981113 ns, polling valid CPU-1135 [014] 440079.028659: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [014] 440079.038648: kvm_vcpu_wakeup: wait time 9985185 ns, polling valid CPU-1135 [014] 440079.038663: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [014] 440079.048646: kvm_vcpu_wakeup: wait time 9979267 ns, polling valid Then, it looks like this: CPU-1135 [027] 440530.308689: kvm_vcpu_wakeup: wait time 19968272 ns, polling valid CPU-1135 [027] 440530.308696: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [027] 440530.308704: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [027] 440530.328690: kvm_vcpu_wakeup: wait time 19984070 ns, polling valid CPU-1135 [027] 440530.328698: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [027] 440530.328705: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [027] 440530.348691: kvm_vcpu_wakeup: wait time 19982890 ns, polling valid CPU-1135 [027] 440530.348699: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [027] 440530.348707: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [027] 440530.368691: kvm_vcpu_wakeup: wait time 19981827 ns, polling valid CPU-1135 [027] 440530.368699: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [027] 440530.368707: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [027] 440530.388692: kvm_vcpu_wakeup: wait time 19981959 ns, polling valid And then it looks like this: CPU-1135 [014] 441652.963757: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [014] 441652.963764: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [014] 441652.963768: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [014] 441652.963775: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [014] 441653.003745: kvm_vcpu_wakeup: wait time 39964879 ns, polling valid CPU-1135 [014] 441653.003758: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [014] 441653.003765: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [014] 441653.003772: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [014] 441653.003783: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [014] 441653.043745: kvm_vcpu_wakeup: wait time 39957085 ns, polling valid CPU-1135 [014] 441653.043765: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [014] 441653.043770: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [014] 441653.043774: kvm_exit: reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 CPU-1135 [014] 441653.043783: kvm_exit: reason HLT rip 0xffffffff816a0b6b info 0 0 CPU-1135 [014] 441653.083734: kvm_vcpu_wakeup: wait time 39945744 ns, polling valid So the preemption timer is set to a very small value after some time. This causes the deadline to be advanced, and the next HLT to wake up after twice the correct deadline. Then the premption timer is set to a very small value for three times in a row, each of which advances the deadline; the next HLT then wakes up after four times the regular time. This validates your observation that disabling the preemption timer works around the bug. I'll take a look and try to create a test case. Out of curiosity, can the OpenBSD timer tick can be adjusted? If so, does the bug trigger faster if you increase the frequency (or slower if you decrease it)? Thanks, Paolo