Re: Timing issue with QEMU/KVM and OpenBSD

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



That mode moves the x86 PIC, PIT, and IOAPIC up to userspace. I'm not
particularly familiar with OpenBSD so I can't claim to know how these
interact. If it's heavily reliant on the PIT for time, I could imagine
things going wrong (though that is just speculation on my part).

On Wed, Jan 10, 2018 at 3:30 PM, Andrew Davis <adavis@xxxxxxxxxxxxxxx> wrote:
> Hello again,
>
> Just a follow-up with some more info. One of my vendors suggested a
> different workaround for the clock issue.
>
> "-machine accel=kvm,kernel_irqchip=off"
>
> Setting "kernel_irqchip=off" seems to resolve the timing issue. I ran this
> for roughly 2 hrs on my test hardware and didn't experience the timing issue
> (normally it becomes present after 10 mins). However, as far as I
> understand, this flag moves some interrupt handling to a userspace emulator
> which results in lower overall performance.
>
> This is what I noticed with the different irqchip modes:
>
> Without kernel_irqchip specified:    timing issue became present after 10
> mins.
> With kernel_irqchip=off:             no timing issue present, ran for 2 hrs.
> With kernel_irqchip=split:           timing issue became present after 9
> mins.
>
> Best regards,
> Andrew
>
>
>
> On 1/3/2018 7:19 PM, Paolo Bonzini wrote:
>>
>> On 04/01/2018 01:01, Andrew Davis wrote:
>>>
>>> Hello again,
>>>
>>> I'm glad to see that you were able to make sense of those test samples
>>> :).
>>>
>>> Is that behavior with the preemption timer normal at the KVM level? (I'm
>>> guessing not!).
>>
>> Indeed, not at all. :)  It's probably some kind of rounding error that
>> gets bigger and bigger, one bit at a time, or an overflow.
>>
>> Paolo
>>
>>> OpenBSD has timer tick sysctl variables, but they can't be changed
>>> without recompiling the kernel. Here is what the default layout looks
>>> like on OpenBSD 6.1:
>>>
>>>      # sysctl -a | grep -i clockrate
>>>      kern.clockrate=tick = 10000, tickadj = 40, hz = 100, profhz = 100,
>>> stathz = 100
>>>
>>> Best regards,
>>> Andrew
>>>
>>>
>>> On 1/2/2018 7:02 PM, Paolo Bonzini wrote:
>>>>
>>>> On 02/01/2018 20:02, Andrew Davis wrote:
>>>>>
>>>>> Here's the post-boot/2sec/4sec experiment run again, with "trace-cmd
>>>>> record -e kvm".
>>>>>
>>>>> http://45.63.6.248/kvm3/trace-01-boot.tar.gz
>>>>> http://45.63.6.248/kvm3/trace-02-2sec.tar.gz
>>>>> http://45.63.6.248/kvm3/trace-03-4sec.tar.gz
>>>>> http://45.63.6.248/kvm3/env.txt
>>>>
>>>> That's pretty interesting.  The normal trace looks like this (one CPU
>>>> only and with some traces omitted):
>>>>
>>>>                CPU-1135  [014] 440078.978674: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [014] 440078.988645: kvm_vcpu_wakeup:
>>>> wait time 9967011 ns, polling valid
>>>>                CPU-1135  [014] 440078.988660: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [014] 440078.998645: kvm_vcpu_wakeup:
>>>> wait time 9980730 ns, polling valid
>>>>                CPU-1135  [014] 440078.998660: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [014] 440079.008646: kvm_vcpu_wakeup:
>>>> wait time 9982416 ns, polling valid
>>>>                CPU-1135  [014] 440079.008664: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [014] 440079.018644: kvm_vcpu_wakeup:
>>>> wait time 9976479 ns, polling valid
>>>>                CPU-1135  [014] 440079.018659: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [014] 440079.028644: kvm_vcpu_wakeup:
>>>> wait time 9981113 ns, polling valid
>>>>                CPU-1135  [014] 440079.028659: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [014] 440079.038648: kvm_vcpu_wakeup:
>>>> wait time 9985185 ns, polling valid
>>>>                CPU-1135  [014] 440079.038663: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [014] 440079.048646: kvm_vcpu_wakeup:
>>>> wait time 9979267 ns, polling valid
>>>>
>>>> Then, it looks like this:
>>>>
>>>>                CPU-1135  [027] 440530.308689: kvm_vcpu_wakeup:
>>>> wait time 19968272 ns, polling valid
>>>>                CPU-1135  [027] 440530.308696: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [027] 440530.308704: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [027] 440530.328690: kvm_vcpu_wakeup:
>>>> wait time 19984070 ns, polling valid
>>>>                CPU-1135  [027] 440530.328698: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [027] 440530.328705: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [027] 440530.348691: kvm_vcpu_wakeup:
>>>> wait time 19982890 ns, polling valid
>>>>                CPU-1135  [027] 440530.348699: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [027] 440530.348707: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [027] 440530.368691: kvm_vcpu_wakeup:
>>>> wait time 19981827 ns, polling valid
>>>>                CPU-1135  [027] 440530.368699: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [027] 440530.368707: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [027] 440530.388692: kvm_vcpu_wakeup:
>>>> wait time 19981959 ns, polling valid
>>>>
>>>> And then it looks like this:
>>>>
>>>>                CPU-1135  [014] 441652.963757: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [014] 441652.963764: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [014] 441652.963768: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [014] 441652.963775: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [014] 441653.003745: kvm_vcpu_wakeup:
>>>> wait time 39964879 ns, polling valid
>>>>                CPU-1135  [014] 441653.003758: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [014] 441653.003765: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [014] 441653.003772: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [014] 441653.003783: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [014] 441653.043745: kvm_vcpu_wakeup:
>>>> wait time 39957085 ns, polling valid
>>>>                CPU-1135  [014] 441653.043765: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [014] 441653.043770: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [014] 441653.043774: kvm_exit:
>>>> reason PREEMPTION_TIMER rip 0xffffffff816a0b6c info 0 0
>>>>                CPU-1135  [014] 441653.043783: kvm_exit:
>>>> reason HLT rip 0xffffffff816a0b6b info 0 0
>>>>                CPU-1135  [014] 441653.083734: kvm_vcpu_wakeup:
>>>> wait time 39945744 ns, polling valid
>>>>
>>>> So the preemption timer is set to a very small value after some time.
>>>> This causes
>>>> the deadline to be advanced, and the next HLT to wake up after twice
>>>> the correct
>>>> deadline.
>>>>
>>>> Then the premption timer is set to a very small value for three times
>>>> in a row,
>>>> each of which advances the deadline; the next HLT then wakes up after
>>>> four times
>>>> the regular time.
>>>>
>>>> This validates your observation that disabling the preemption timer
>>>> works around
>>>> the bug.  I'll take a look and try to create a test case.  Out of
>>>> curiosity, can
>>>> the OpenBSD timer tick can be adjusted?  If so, does the bug trigger
>>>> faster if
>>>> you increase the frequency (or slower if you decrease it)?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Paolo
>
>



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux