On 25/04/17 13:51, Daniel Lezcano wrote:
> On Tue, Apr 25, 2017 at 11:21:21AM +0100, Marc Zyngier wrote:
>> On 25/04/17 10:49, Daniel Lezcano wrote:
>>> On Tue, Apr 25, 2017 at 10:10:12AM +0100, Marc Zyngier wrote:
>>
>> [...]
>>
>>>>> +static inline void setup_timings(struct irq_desc *desc, struct irqaction *act)
>>>>> +{
>>>>> +	/*
>>>>> +	 * We don't need the measurement because the idle code already
>>>>> +	 * knows the next expiry event.
>>>>> +	 */
>>>>> +	if (act->flags & __IRQF_TIMER)
>>>>> +		return;
>>>>
>>>> And that's where this is really wrong for the KVM guest timer. As I
>>>> said, this timer is under complete control of the guest, and the rest of
>>>> the system doesn't know about it. KVM itself will only find out when the
>>>> vcpu does a VM exit for a reason or another, and will just save/restore
>>>> the state in order to be able to give the timer to another guest.
>>>>
>>>> The idle code is very much *not* aware of anything concerning that guest
>>>> timer.
>>>
>>> Just for my own curiosity, if there are two VM (VM1 and VM2). VM1 sets a timer1
>>> at <time> and exits, VM2 runs and sets a timer2 at <time+delta>.
>>>
>>> The timer1 for VM1 is supposed to expire while VM2 is running. IIUC the virtual
>>> timer is under control of VM2 and will expire at <time+delta>.
>>>
>>> Is the host wake up with the SW timer and switch in VM1 which in turn restores
>>> the timer and jump in the virtual timer irq handler?
>>
>> Indeed. The SW timer causes VM1 to wake-up, either on the same CPU
>> (preempting VM2) or on another. The timer is then restored with the
>> pending virtual interrupt injected, and the guest does what it has to
>> with it.
>
> Thanks for clarification.
>
> So there is a virtual timer with real registers / interruption (waking up the
> host) for the running VMs and SW timers for non-running VMs.
>
> What is the benefit of having such mechanism instead of real timers injecting
> interrupts in the VM without the virtual timer + save/restore? Efficiency in
> the running VMs when setting up timers (saving privileges change overhead)?

You can't dedicate HW resources to virtual CPUs. It just doesn't scale.
Also, injecting HW interrupts in a guest is pretty hard work, and for
multiple reasons:
- the host needs to be in control of interrupt delivery (don't hog the
  CPU with guest interrupts),
- you want to be able to remap interrupts (id X on the host becomes id Y
  on the guest),
- you want to deal with migrating vcpus,
- you want to deliver an interrupt to a vcpu that is *not* running.

It *is* doable, but it is not cheap at all from a HW point of view.

	M.
-- 
Jazz is not dead. It just smells funny...
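
For illustration only, the VM1/VM2 scenario discussed above can be modelled
with a few lines of plain C. This is a toy sketch, not KVM code: every name
in it (toy_hw_timer, toy_vcpu, toy_vcpu_put, toy_sw_timer_tick,
toy_vcpu_load) is made up for the example, and it assumes a single virtual
timer per CPU plus one host software timer per non-running vcpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in KVM or the kernel. */

struct toy_hw_timer {			/* the single per-CPU virtual timer */
	bool     enabled;
	uint64_t expiry;
};

struct toy_vcpu {
	struct toy_hw_timer saved;	/* guest timer state saved when put */
	bool     sw_timer_armed;	/* host SW timer standing in for it */
	uint64_t sw_timer_expiry;
	bool     virq_pending;		/* virtual interrupt to inject on load */
};

/* vcpu stops running: save the guest's timer and arm a host SW timer. */
static void toy_vcpu_put(struct toy_vcpu *v, struct toy_hw_timer *hw)
{
	assert(v && hw);
	v->saved = *hw;
	hw->enabled = false;		/* HW timer is free for the next guest */
	if (v->saved.enabled) {
		v->sw_timer_armed = true;
		v->sw_timer_expiry = v->saved.expiry;
	}
}

/* Host SW timer check: returns true if the vcpu has to be woken up. */
static bool toy_sw_timer_tick(struct toy_vcpu *v, uint64_t now)
{
	assert(v);
	if (!v->sw_timer_armed || now < v->sw_timer_expiry)
		return false;
	v->sw_timer_armed = false;
	v->virq_pending = true;		/* injected when the vcpu runs again */
	return true;
}

/* vcpu runs again: cancel the SW timer and restore the guest's timer. */
static void toy_vcpu_load(struct toy_vcpu *v, struct toy_hw_timer *hw)
{
	assert(v && hw);
	v->sw_timer_armed = false;
	*hw = v->saved;
}
```

In this model, VM1 programming an expiry of 100 and being put leaves the HW
timer free for VM2. Once toy_sw_timer_tick() for VM1 returns true, the host
puts VM2 (which gets its own SW timer) and loads VM1 with virq_pending set.
The host stays in control of delivery throughout, which is the point of the
list above.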