> -----Original Message-----
> From: Marc Zyngier <maz@xxxxxxxxxx>
> Sent: Tuesday, March 1, 2022 11:29 PM
> To: Eugene Huang <eugeneh@xxxxxxxxxx>
> Cc: kvmarm@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: Timer delays in VM
>
> On Tue, 01 Mar 2022 19:03:33 +0000,
> Eugene Huang <eugeneh@xxxxxxxxxx> wrote:
> >
> > > > * Does this timer rely on kvm timer irq injection?
> > >
> > > Yes. A timer interrupt is always injected in SW. But the timer
> > > interrupt can either come from the HW timer itself (the VM was
> > > running while the timer expired), or from a SW timer that KVM has
> > > set up if the guest was blocked on WFI.
> >
> > <EH> Here for arm64, the EL1 Virtual Timer is used. The EL1 Virtual Timer
> > is a HW timer, correct? There is an armvtimer implementation in QEMU 6.1+.
> > Does this armvtimer make a difference?
>
> KVM only deals with the EL1 timers (both physical and virtual). I guess that
> by 'armvtimer' you mean libvirt's front-end for the stolen time feature,
> which exposes to the guest how wall clock and CPU time diverge (i.e. it isn't
> a timer at all, but a dynamic correction for it).

<EH> Yes, I mean the libvirt front-end setting. Okay, got it. Thanks.

> > > > * What can be the possible causes for the timer delay? Are
> > > > there some locking mechanisms which can cause the delay?
> > >
> > > This completely depends on how loaded your host is, the respective
> > > priorities of the various processes, and a million other things.
> > > This is no different from the same userspace running on the host.
> > > It also depends on the *guest* kernel, by the way.
> >
> > <EH> Our guest kernel is 5.4. How is the *guest* kernel involved?
> > Can you give an example? Do you have suggestions on the guest kernel
> > version as well?
>
> It is the guest kernel that programs the timer, and KVM isn't involved at
> all, especially on your HW (direct access to both timers on VHE-capable
> systems).
>
> > > > * What parameters can tune this timer?
> > >
> > > None. You may want to check whether the delay is observed when the
> > > VM has hit WFI or not.
> >
> > <EH> Yes, the delay is observed after a vm_exit caused by WFx (not sure
> > whether WFI or WFE), but only when some workload is started on a
> > different vCPU in the same VM.
>
> Let me see if I understand what you mean:
>
> - vcpu-0 is running your timer test, everything is fine
> - vcpu-1 starts some other workload, and this affects the timer test
>   on the other vcpu
>
> Is that correct? If so, this would tend to indicate that both vcpus share
> some physical resource such as a physical CPU. How do you run your VM?

<EH> We have the following 1-to-1 mappings:

pcpu-20 - vcpu-0 is running your timer test, everything is fine
pcpu-21 - vcpu-1 starts some other workload, and this affects the timer test on the other vcpu

- Each vCPU thread is pinned to its individual pCPU on the host (vcpupin in libvirt).
- Each pCPU on which a vCPU thread runs is isolated on the host (isolcpus).
- Each vCPU that runs the workload is isolated in the guest VM (isolcpus).

So we are pretty sure the workloads are separated.
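For reference, the pinning side of this is configured along the lines of the sketch below. This is a minimal illustration rather than our exact domain XML: the cpuset values follow the pcpu-20/pcpu-21 mapping above, and the isolcpus values are likewise illustrative.

  <!-- libvirt domain XML (illustrative): two vCPUs, each pinned to a dedicated pCPU -->
  <vcpu placement='static'>2</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='20'/>  <!-- vcpu-0 (timer test) -> pcpu-20 -->
    <vcpupin vcpu='1' cpuset='21'/>  <!-- vcpu-1 (workload)   -> pcpu-21 -->
  </cputune>

  Host kernel command line:  isolcpus=20,21  (illustrative)
  Guest kernel command line: isolcpus=1      (the vCPU that runs the workload; illustrative)

'virsh vcpupin <domain>' shows the pinning that is actually in effect at runtime.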
> Also, please work out whether you exit because of a blocking WFI or WFE, as
> they are indicative of different guest behaviour.

<EH> Will do. Somehow our current trace does not show this information.

> > Since we pin that workload to its own vCPU, in theory, it should not
> > affect the timing of another vCPU.
>
> Why not? A vcpu is just a host thread, and if they share a physical CPU at
> some point, there is a knock-on effect.

<EH> Again, because of vcpupin in libvirt, there is no sharing of a pCPU among vCPUs. At least, that is our configuration intention.

> > > You also don't mention what host kernel version you are running.
> > > In general, please try and reproduce the issue using the latest
> > > kernel version (5.16 at the moment). Please also indicate what HW
> > > you are using.
> >
> > <EH> Tried 5.15 and 5.4 kernels. Both have the issue. Do you think
> > 5.16 can make a difference? The HW is an Ampere Altra system.
>
> Unlikely. The Altra is a mostly sane system, as long as you make sure that
> VMs don't migrate across sockets (at which point it becomes laughably bad).
> Nothing to do with KVM though.

<EH> Right, there is no migration of VMs. I see that the KVM arm timer-related code is very different between 5.4 and 5.15/5.16. Can we still use 5.4 for both the host and the guest?

> Are these kernels compiled from scratch? Or are they whatever the distro
> ships? Same question for the guest.

<EH> Yes. Both host and guest kernels are compiled from scratch.

Thanks,
Eugene

> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm