RE: Timer delays in VM

> -----Original Message-----
> From: Marc Zyngier <maz@xxxxxxxxxx>
> Sent: Tuesday, March 1, 2022 11:29 PM
> To: Eugene Huang <eugeneh@xxxxxxxxxx>
> Cc: kvmarm@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: Timer delays in VM
> 
> 
> On Tue, 01 Mar 2022 19:03:33 +0000,
> Eugene Huang <eugeneh@xxxxxxxxxx> wrote:
> >
> > > >       * Does this timer rely on kvm timer irq injection?
> > >
> > > Yes. A timer interrupt is always injected in SW. But the timer
> > > interrupt can either come from the HW timer itself (the VM was
> > > running while the timer expired), or from a SW timer that KVM has
> > > set up if the guest was blocked on WFI.
> >
> > <EH> Here for arm64, the EL1 virtual timer is used. The EL1 virtual timer
> > is a HW timer, correct? There is an armvtimer implementation in QEMU 6.1+.
> > Does this armvtimer make a difference?
> 
> KVM only deals with the EL1 timers (both physical and virtual). I guess that by
> 'armvtimer', you mean libvirt's front-end for the stolen time feature to
> expose to the guest how wall clock and CPU time diverge (i.e. it isn't a timer
> at all, but a dynamic correction for it).

<EH> Yes, I mean the libvirt front-end setting.  Okay, got it. Thanks.
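
As a sanity check on our side, one quick way to confirm from inside the
guest that the EL1 virtual timer is the one actually firing (a sketch,
assuming a standard arm64 guest kernel):

    # per-vCPU interrupt counts for the architected timer PPI
    grep arch_timer /proc/interrupts

The count should increase on the vCPU that runs the timer test.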

> 
> > > >       * What can be any possible causes for the timer delay? Are
> > > > there some locking mechanisms which can cause the delay?
> > >
> > > This completely depends on how loaded your host is, the respective
> > > priorities of the various processes, and a million other things.
> > > This is no different from the same userspace running on the host.
> > > It also depends on the *guest* kernel, by the way.
> >
> > <EH> Our guest kernel is 5.4. How is the *guest* kernel involved?
> > Can you give an example? Do you have suggestions on the guest kernel
> > version as well?
> 
> It is the guest kernel that programs the timer, and KVM isn't involved at all,
> especially on your HW (direct access to both timers on VHE-capable systems).
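
<EH> Understood. Since the guest kernel does the programming, we will
double-check what our 5.4 guest actually uses (a sketch using standard
sysfs paths):

    cat /sys/devices/system/clocksource/clocksource0/current_clocksource
    cat /sys/devices/system/clockevents/clockevent0/current_device

On arm64 we would expect arch_sys_counter and arch_sys_timer respectively.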
> 
> > > >       * What parameters can tune this timer?
> > >
> > > None. You may want to check whether the delay is observed when the
> > > VM has hit WFI or not.
> >
> > <EH> Yes, the delay is observed after a vm_exit caused by WFx (not sure
> > whether WFI or WFE), but only when some workload is started on a
> > different vCPU in the same VM.
> 
> Let me see if I understand what you mean:
> 
> - vcpu-0 is running your timer test, everything is fine
> - vcpu-1 starts some other workload, and this affects the timer test
>   on the other vcpu
> 
> Is that correct? If so, this would tend to indicate that both vcpus share some
> physical resources such as a physical CPU. How do you run your VM?

<EH> Yes, that is correct. We have the following 1-to-1 mappings:
- pcpu-20 -> vcpu-0: runs the timer test, everything is fine
- pcpu-21 -> vcpu-1: starts some other workload, and this affects the timer
  test on the other vCPU

- Each vCPU thread is pinned to its individual pCPU on the host (vcpupin in libvirt).
- Each pCPU on which a vCPU thread runs is isolated on the host (isolcpus).
- Each vCPU that runs the workload is isolated in the guest VM (isolcpus).

So we are pretty sure the workloads are separated.
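
To double-check this from the host side, something like the following
(a sketch; "guest1" is a placeholder for our actual domain name):

    # show the vCPU-to-pCPU pinning libvirt has applied
    virsh vcpupin guest1

    # confirm the host really isolated those pCPUs
    cat /sys/devices/system/cpu/isolated
    cat /proc/cmdline

confirms that the affinity and the isolcpus setup match what we intended.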

> 
> Also, please work out whether you exit because of a blocking WFI or WFE, as
> they are indicative of different guest behaviour.

<EH> Will do. Somehow our current trace does not show this information.
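
One thing we plan to try, assuming the kvm_wfx_arm64 tracepoint is available
on our host kernel:

    cd /sys/kernel/debug/tracing
    echo 1 > events/kvm/kvm_wfx_arm64/enable
    cat trace_pipe

Each event should indicate whether the guest executed wfi or wfe.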

> 
> > Since we pin that workload to its own vCPU, in theory, it should not
> > affect the timing of another vCPU.
> 
> Why not? A vcpu is just a host thread, and if they share a physical CPU at
> some point, there is a knock-on effect.

<EH> Again, because of vcpupin in libvirt, no pCPU is shared among vCPUs. At least, that is the intent of our configuration.
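
One way to verify there is no sharing at runtime (a sketch; the thread names
assume QEMU, which names its vCPU threads "CPU 0/KVM" and so on, and <tid>
is a placeholder):

    # find the vCPU thread IDs and the pCPU each last ran on
    ps -eLo tid,psr,comm | grep '/KVM'

    # check the affinity mask of a given vCPU thread
    grep Cpus_allowed_list /proc/<tid>/status

If the psr column ever shows two vCPU threads on the same pCPU, our pinning
is not taking effect.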

> 
> > > You also don't mention what host kernel version you are running.
> > > In general, please try and reproduce the issue using the latest
> > > kernel version
> > > (5.16 at the moment). Please also indicate what HW you are using.
> >
> > <EH> Tried 5.15 and 5.4 kernels. Both have the issue. Do you think
> > 5.16 can make a difference? The HW is an Ampere Altra system.
> 
> Unlikely. The Altra is a mostly sane system, as long as you make sure that
> VMs don't migrate across sockets (at which point it becomes laughably bad).
> Nothing to do with KVM though.

<EH> Right, there is no migration of VMs.
I see that the KVM arm timer-related code is very different between 5.4 and 5.15/5.16. Can we still use 5.4 for both the host and the guest?
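
To be certain about socket placement, we can check it like this (a sketch;
"guest1" is again a placeholder domain name):

    # map pCPUs to sockets and NUMA nodes on the host
    lscpu -e

    # show any NUMA placement policy libvirt applies to the guest
    virsh numatune guest1

In our setup all the pinned pCPUs sit on the same socket.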

> 
> Are these kernels compiled from scratch? Or are they whatever the distro
> ships? Same question for the guest.

<EH> Yes. Both host and guest kernels are compiled from scratch. 

Thanks,
Eugene

> 
> Thanks,
> 
>         M.
> 
> --
> Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


