Thanks Marc for the reply. Please see inline below marked with [EH].

> -----Original Message-----
> From: Marc Zyngier <maz@xxxxxxxxxx>
> Sent: Monday, February 28, 2022 1:03 PM
> To: Eugene Huang <eugeneh@xxxxxxxxxx>
> Cc: kvmarm@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: Timer delays in VM
>
> [Please don't send HTML email and stick to plain text]
>
> On 2022-02-28 18:02, Eugene Huang wrote:
> > Hi,
> >
> > I am running qemu on an arm64 CentOS host. Inside a ubuntu VM, a
>
> I assume that by this you mean QEMU as the VMM for a KVM guest, right?

[EH] Yes.

> > process runs a timer created using timer_t:
> >
> > ev.sigev_notify_function = m_callback;
> >
> > …
> >
> > timer_create(CLOCK_MONOTONIC, &ev, &m_timer_t);
> >
> > This timer sometimes has significant delays. For example, the 50 ms
> > timer can have a callback delay of 100ms.
> >
> > I did a host kernel trace and see a lot of WFx kvm_exits, and the
> > following events between kvm_exit and kvm_entry:
> >
> > kvm_exit
> > kvm_wfx_arm64
> > kvm_get_timer_map
> > sched_switch
> > kvm_timer_save_state
> > kvm_timer_update_irq
> > vgic_update_irq_pending
> > kvm_timer_restore_state
> > kvm_vcpu_wakeup
> > kvm_arm_setup_debug
> > kvm_arm_set_dreg32
> > kvm_entry
>
> All of this is perfectly normal (guest hits WFI from its idle loop, no interrupt is
> pending, trap to EL2, schedule out, schedule back in, reenter the guest).
>
> > I have the following questions:
> >
> > * Why there are a lot WFx exits? Is the timer dependent on it?
>
> That's most probably because your vcpu goes idle and execute WFI to Wait
> For an Interrupt. As no interrupt is pending, the vcpu exits so that the host
> can do something useful until it gets an interrupt that is targeted at the vcpu.
> On an idle VM, this probably happens 100s of times a second.
>
> > * Does this timer rely on kvm timer irq injection?
>
> Yes. A timer interrupt is always injected in SW. But the timer interrupt can
> either come from the HW timer itself (the VM was running while the timer
> expired), or from a SW timer that KVM as setup if the guest was blocked on
> WFI.

[EH] On arm64, the EL1 Virtual Timer is used. The EL1 Virtual Timer is a HW timer, correct? There is an armvtimer implementation in QEMU 6.1+. Does this armvtimer make a difference?

> > * What can be any possible causes for the timer delay? Are there
> > some locking mechanisms which can cause the delay?
>
> This completely depend on how loaded your host is, the respective priorities
> of the various processes, and a million of other things.
> This is no different from the same userspace running on the host.
> It also depends on the *guest* kernel, by the way.

[EH] Our guest kernel is 5.4. How is the *guest* kernel involved? Can you give an example? Do you have suggestions on the guest kernel version as well?

> There are of course locks all over the place, but that's the very nature of the
> beast.
>
> > * What parameters can tune this timer?
>
> None. You may want to check whether the delay is observed when the VM
> has hit WFI or not.

[EH] Yes, the delay is observed after a kvm_exit caused by WFx (not sure whether WFI or WFE), but only when some workload is started on a different vCPU in the same VM. Since we pin that workload to its own vCPU, in theory it should not affect the timing of another vCPU.

> You also don't mention what host kernel version you are running.
> In general, please try and reproduce the issue using the latest kernel version
> (5.16 at the moment). Please also indicate what HW you are using.

[EH] Tried 5.15 and 5.4 kernels. Both have the issue. Do you think 5.16 can make a difference? The HW is an Ampere Altra system.

> Thanks,
>
> M.
> --
> Jazz is not dead. It just smells funny...
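
For anyone trying to reproduce the delay described in this thread, the following is a minimal sketch of a lateness probe built on the same timer_create(CLOCK_MONOTONIC, ...) call with a thread callback. It is not the original application's code. The names probe, probe_callback and measure_max_lateness_ns are hypothetical, and the use of SIGEV_THREAD and of an absolute first expiry are assumptions made so that each callback can be compared against a known deadline.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical probe state (not from the original application). It is static
 * so a callback thread still in flight after timer_delete() never touches
 * dead stack memory; intended for one measurement per process run. */
static struct {
    pthread_mutex_t lock;
    sem_t done;
    int64_t start_ns;      /* CLOCK_MONOTONIC time when the timer was armed */
    int64_t interval_ns;   /* programmed period */
    int64_t max_late_ns;   /* worst lateness observed so far */
    int ticks_seen;
    int ticks_wanted;
} probe = { .lock = PTHREAD_MUTEX_INITIALIZER };

static int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

static struct timespec ns_to_ts(int64_t ns)
{
    struct timespec ts = { .tv_sec = ns / NSEC_PER_SEC,
                           .tv_nsec = ns % NSEC_PER_SEC };
    return ts;
}

/* Runs in a thread started by the C library on each expiry (SIGEV_THREAD),
 * so the measured lateness also includes that thread start-up cost. */
static void probe_callback(union sigval sv)
{
    struct timespec now;

    (void)sv;
    clock_gettime(CLOCK_MONOTONIC, &now);

    pthread_mutex_lock(&probe.lock);
    if (probe.ticks_seen < probe.ticks_wanted) {
        probe.ticks_seen++;
        /* The k-th callback is compared against start + k * interval. */
        int64_t expected = probe.start_ns +
                           probe.interval_ns * probe.ticks_seen;
        int64_t late = ts_to_ns(&now) - expected;
        if (late > probe.max_late_ns)
            probe.max_late_ns = late;
        if (probe.ticks_seen == probe.ticks_wanted)
            sem_post(&probe.done);
    }
    pthread_mutex_unlock(&probe.lock);
}

/* Returns the worst callback lateness in nanoseconds, or -1 on error. */
int64_t measure_max_lateness_ns(long interval_ms, int ticks)
{
    struct sigevent ev;
    struct itimerspec its;
    struct timespec start;
    timer_t timer;
    int64_t worst;

    assert(interval_ms > 0 && ticks > 0);

    probe.interval_ns = (int64_t)interval_ms * 1000000LL;
    probe.max_late_ns = 0;
    probe.ticks_seen = 0;
    probe.ticks_wanted = ticks;

    memset(&ev, 0, sizeof(ev));
    ev.sigev_notify = SIGEV_THREAD;
    ev.sigev_notify_function = probe_callback;
    if (sem_init(&probe.done, 0, 0) != 0 ||
        timer_create(CLOCK_MONOTONIC, &ev, &timer) != 0)
        return -1;

    /* Arm with an absolute first expiry so every deadline is known. */
    clock_gettime(CLOCK_MONOTONIC, &start);
    probe.start_ns = ts_to_ns(&start);
    its.it_value = ns_to_ts(probe.start_ns + probe.interval_ns);
    its.it_interval = ns_to_ts(probe.interval_ns);
    if (timer_settime(timer, TIMER_ABSTIME, &its, NULL) != 0) {
        timer_delete(timer);
        return -1;
    }

    while (sem_wait(&probe.done) != 0 && errno == EINTR)
        ;                              /* retry if interrupted by a signal */
    timer_delete(timer);

    pthread_mutex_lock(&probe.lock);
    worst = probe.max_late_ns;
    pthread_mutex_unlock(&probe.lock);
    return worst;
}
```

Calling measure_max_lateness_ns(50, 200) inside the guest, once with the other vCPU idle and once with the pinned workload running, would give two worst-case numbers to compare against the host trace. Build with -pthread; glibc versions older than 2.34 also need -lrt for timer_create.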