On 04/01/2018 01:01, Andrew Davis wrote: > Hello again, > > I'm glad to see that you were able to make sense of those test samples :). > > Is that behavior with the preemption timer normal at the KVM level? (I'm > guessing not!). Indeed, not at all. :) It's probably some kind of rounding error that gets bigger and bigger, one bit at a time, or an overflow. Paolo > OpenBSD has timer tick sysctl variables, but they can't be changed > without recompiling the kernel. Here is what the default layout looks > like on OpenBSD 6.1: > > # sysctl -a | grep -i clockrate > kern.clockrate=tick = 10000, tickadj = 40, hz = 100, profhz = 100, > stathz = 100 > > Best regards, > Andrew > > > On 1/2/2018 7:02 PM, Paolo Bonzini wrote: >> On 02/01/2018 20:02, Andrew Davis wrote: >>> Here's the post-boot/2sec/4sec experiment run again, with "trace-cmd >>> record -e kvm". >>> >>> http://45.63.6.248/kvm3/trace-01-boot.tar.gz >>> http://45.63.6.248/kvm3/trace-02-2sec.tar.gz >>> http://45.63.6.248/kvm3/trace-03-4sec.tar.gz >>> http://45.63.6.248/kvm3/env.txt >> That's pretty interesting. The normal trace looks like this (one CPU >> only and with some traces omitted): >> >> CPU-1135 [014] 440078.978674: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [014] 440078.988645: kvm_vcpu_wakeup: >> wait time 9967011 ns, polling valid >> CPU-1135 [014] 440078.988660: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [014] 440078.998645: kvm_vcpu_wakeup: >> wait time 9980730 ns, polling valid >> CPU-1135 [014] 440078.998660: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [014] 440079.008646: kvm_vcpu_wakeup: >> wait time 9982416 ns, polling valid >> CPU-1135 [014] 440079.008664: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [014] 440079.018644: kvm_vcpu_wakeup: >> wait time 9976479 ns, polling valid >> CPU-1135 [014] 440079.018659: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [014] 440079.028644: kvm_vcpu_wakeup: >> wait time 9981113 ns, polling valid >> CPU-1135 [014] 440079.028659: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [014] 440079.038648: kvm_vcpu_wakeup: >> wait time 9985185 ns, polling valid >> CPU-1135 [014] 440079.038663: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [014] 440079.048646: kvm_vcpu_wakeup: >> wait time 9979267 ns, polling valid >> >> Then, it looks like this: >> >> CPU-1135 [027] 440530.308689: kvm_vcpu_wakeup: >> wait time 19968272 ns, polling valid >> CPU-1135 [027] 440530.308696: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [027] 440530.308704: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [027] 440530.328690: kvm_vcpu_wakeup: >> wait time 19984070 ns, polling valid >> CPU-1135 [027] 440530.328698: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [027] 440530.328705: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [027] 440530.348691: kvm_vcpu_wakeup: >> wait time 19982890 ns, polling valid >> CPU-1135 [027] 440530.348699: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [027] 440530.348707: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [027] 440530.368691: kvm_vcpu_wakeup: >> wait time 19981827 ns, polling valid >> CPU-1135 [027] 440530.368699: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [027] 440530.368707: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [027] 440530.388692: kvm_vcpu_wakeup: >> wait time 19981959 ns, polling valid >> >> And then it looks like this: >> >> CPU-1135 [014] 441652.963757: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [014] 441652.963764: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [014] 441652.963768: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [014] 441652.963775: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [014] 441653.003745: kvm_vcpu_wakeup: >> wait time 39964879 ns, polling valid >> CPU-1135 [014] 441653.003758: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [014] 441653.003765: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [014] 441653.003772: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [014] 441653.003783: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [014] 441653.043745: kvm_vcpu_wakeup: >> wait time 39957085 ns, polling valid >> CPU-1135 [014] 441653.043765: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [014] 441653.043770: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [014] 441653.043774: kvm_exit: >> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0 >> CPU-1135 [014] 441653.043783: kvm_exit: >> reason HLT rip 0xffffffff816a0b6b info 0 0 >> CPU-1135 [014] 441653.083734: kvm_vcpu_wakeup: >> wait time 39945744 ns, polling valid >> >> So the preemption timer is set to a very small value after some time. >> This causes >> the deadline to be advanced, and the next HLT to wake up after twice >> the correct >> deadline. >> >> Then the premption timer is set to a very small value for three times >> in a row, >> each of which advances the deadline; the next HLT then wakes up after >> four times >> the regular time. >> >> This validates your observation that disabling the preemption timer >> works around >> the bug. I'll take a look and try to create a test case. Out of >> curiosity, can >> the OpenBSD timer tick can be adjusted? If so, does the bug trigger >> faster if >> you increase the frequency (or slower if you decrease it)? >> >> >> Thanks, >> >> Paolo >