[Android-virt] [PATCH 08/10] ARM: KVM: arch_timers: Add guest timer core support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jun 26, 2012 at 9:21 AM, Marc Zyngier <marc.zyngier at arm.com> wrote:
> On 25/06/12 22:15, Christoffer Dall wrote:
>> On Mon, May 14, 2012 at 9:07 AM, Marc Zyngier <marc.zyngier at arm.com> wrote:
>>> We can inject a timer interrupt into the guest as a result of
>>> three possible events:
>>> - The virtual timer interrupt has fired while we were still
>>> ?executing the guest
>>
>> couldn't/shouldn't we wire this to the vGIC support through the HW bit
>> in the LRs so it gets delivered directly to the guest without our
>> intervention?
>
> You really don't want to go down that road.
>
> For that to work properly, you'd need to start adding priorities to the
> host GIC code, so we can still preempt a host running in its interrupt
> handler (and prevent it from DoS-ing the host).
>
> And introducing priorities in the interrupt handling is a sure way to
> start a nice flame war on LKML... ;-)
>

ok, I obviously didn't know what I was asking here ;)

>> that would take care of all the arch_timer_virt_external_handler stuff
>> as well...
>
> No. Because you have no HW distributor at the guest level, you'd still
> have to take the interrupt at the host level, notice this is a guest
> related interrupt, escape the normal IRQ flow, queue it into the guest
> LRs. The only thing you gain is the automatic EOI from the guest to the
> HW. All the rest is just pain.
>
>>> - The timer interrupt hasn't fired, but it expired while we
>>> ?were doing the world switch
>>> - A hrtimer we programmed earlier has fired
>>>
>>> Signed-off-by: Marc Zyngier <marc.zyngier at arm.com>
>>> ---
>>> ?arch/arm/include/asm/kvm_arch_timer.h | ? 34 +++++++-
>>> ?arch/arm/kvm/timer.c ? ? ? ? ? ? ? ? ?| ?158 +++++++++++++++++++++++++++++++++
>>> ?2 files changed, 191 insertions(+), 1 deletions(-)
>>> ?create mode 100644 arch/arm/kvm/timer.c
>>>
>>> diff --git a/arch/arm/include/asm/kvm_arch_timer.h b/arch/arm/include/asm/kvm_arch_timer.h
>>> index 491e336..191c541 100644
>>> --- a/arch/arm/include/asm/kvm_arch_timer.h
>>> +++ b/arch/arm/include/asm/kvm_arch_timer.h
>>> @@ -1,13 +1,45 @@
>>> ?#ifndef __ASM_ARM_KVM_ARCH_TIMER_H
>>> ?#define __ASM_ARM_KVM_ARCH_TIMER_H
>>>
>>> +#include <linux/clocksource.h>
>>> +#include <linux/hrtimer.h>
>>> +#include <linux/workqueue.h>
>>> +
>>> ?struct arch_timer_kvm {
>>> +#ifdef CONFIG_KVM_ARM_TIMER
>>> + ? ? ? bool ? ? ? ? ? ? ? ? ? ?enabled;
>>> + ? ? ? union {
>>> + ? ? ? ? ? ? ? cycle_t ? ? ? ? cntvoff;
>>> + ? ? ? ? ? ? ? struct {
>>> + ? ? ? ? ? ? ? ? ? ? ? u32 ? ? cntvoff_high; /* Restored only */
>>> + ? ? ? ? ? ? ? ? ? ? ? u32 ? ? cntvoff_low; ?/* Restored only */
>>> + ? ? ? ? ? ? ? } cntvoff32;
>>> + ? ? ? };
>>> +#endif
>>> ?};
>>>
>>> ?struct arch_timer_cpu {
>>> +#ifdef CONFIG_KVM_ARM_TIMER
>>> + ? ? ? struct hrtimer ? ? ? ? ?timer;
>>> + ? ? ? struct work_struct ? ? ?expired;
>>> + ? ? ? cycle_t ? ? ? ? ? ? ? ? cval;
>>> + ? ? ? bool ? ? ? ? ? ? ? ? ? ?armed;
>>> +
>>> + ? ? ? /* Registers */
>>> + ? ? ? u32 ? ? ? ? ? ? ? ? ? ? cntv_ctl; ? ? ? /* Saved/restored */
>>> + ? ? ? u32 ? ? ? ? ? ? ? ? ? ? cntv_cval_high; /* Saved/restored */
>>> + ? ? ? u32 ? ? ? ? ? ? ? ? ? ? cntv_cval_low; ?/* Saved/restored */
>>> +#endif
>>> ?};
>>
>> Could you document these structures a little?
>>
>> It's hard to tell why something goes on the per-vm struct and why
>> something goes on the per-vcpu struct if you're not super familier
>> with the generic timer architecture. For example, it's not clear if
>> the enabled field in the arch_timer_kvm struct corresponds to some bit
>> in hardware set by the guest or if it says if we use arch timers in
>> the guest. I'm pretty sure it means that we initialized everything
>> correctly, we use the vgic, and we can therefore use the features for
>> guests...
>
> Fair enough. I'll add some documentation to that effect.
>
>> also took me annoyingly long time to figure out that 'armed' means
>> hrtimer armed outside of guest, not related to guest setting a
>> timer...
>
> Well, it is related to the guest setting a timer. It acts as a proxy
> timer for the guest.
>
>>> diff --git a/arch/arm/kvm/timer.c b/arch/arm/kvm/timer.c
>>> new file mode 100644
>>> index 0000000..a342f37
>>> --- /dev/null
>>> +++ b/arch/arm/kvm/timer.c
>>> @@ -0,0 +1,158 @@
>>> +/*
>>> + * Copyright (C) 2012 ARM Ltd.
>>> + * Author: Marc Zyngier <marc.zyngier at arm.com>
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License version 2 as
>>> + * published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. ?See the
>>> + * GNU General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License
>>> + * along with this program; if not, write to the Free Software
>>> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
>>> + */
>>> +
>>> +#include <linux/kvm.h>
>>> +#include <linux/kvm_host.h>
>>> +#include <linux/interrupt.h>
>>> +
>>> +#include <asm/arch_timer.h>
>>> +
>>> +#include <asm/kvm_vgic.h>
>>> +#include <asm/kvm_arch_timer.h>
>>> +
>>> +static const struct kvm_irq_level virt_timer_ppi = {
>>> + ? ? ? .irq ? ?= 27, ? /* A7/A15 specific */
>>
>> so should this come from a DT somehow or should we check which CPU
>> we'er emulating (same as host) at runtime?
>
> This is completely CPU specific, and should be provided by looking at
> what we're actually emulating/running on.
>

ok, I'm working on patches for this, which determines this at runtime,
so we'll have to figure out a nice clean way to decide these things at
runtime.

>>> + ? ? ? .level ?= 1,
>>> +};
>>> +
>>> +static struct timecounter *timecounter;
>>> +static struct workqueue_struct *wqueue;
>>
>> why do we need our own workqueue here? (as opposed to just scheduling
>> work on 'events')
>
> It makes things slightly more predictable, and easier to instrument. x86
> does the same.
>

ok

>>> +
>>> +static cycle_t kvm_phys_timer_read(void)
>>> +{
>>> + ? ? ? return timecounter->cc->read(timecounter->cc);
>>> +}
>>> +
>>> +static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
>>> +{
>>> + ? ? ? struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>>> +
>>> + ? ? ? timer->cntv_ctl |= 1 << 1; /* Mask the interrupt in the guest */
>>> + ? ? ? kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id, &virt_timer_ppi);
>>> +}
>>> +
>>> +static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
>>> +{
>>> + ? ? ? struct kvm_vcpu *vcpu = kvm_vgic_get_current_vcpu();
>>> +
>>> + ? ? ? if (WARN_ON(!vcpu))
>>> + ? ? ? ? ? ? ? return IRQ_NONE;
>>> +
>>> + ? ? ? kvm_timer_inject_irq(vcpu);
>>> + ? ? ? return IRQ_HANDLED;
>>> +}
>>> +
>>> +static void kvm_timer_inject_irq_work(struct work_struct *work)
>>> +{
>>> + ? ? ? struct kvm_vcpu *vcpu;
>>> +
>>> + ? ? ? vcpu = container_of(work, struct kvm_vcpu, arch.timer_cpu.expired);
>>> + ? ? ? vcpu->arch.timer_cpu.armed = false;
>>> + ? ? ? kvm_timer_inject_irq(vcpu);
>>> +}
>>> +
>>> +static enum hrtimer_restart kvm_timer_expire(struct hrtimer *hrt)
>>> +{
>>> + ? ? ? struct arch_timer_cpu *timer;
>>> + ? ? ? timer = container_of(hrt, struct arch_timer_cpu, timer);
>>> + ? ? ? queue_work(wqueue, &timer->expired);
>>> + ? ? ? return HRTIMER_NORESTART;
>>> +}
>>> +
>>> +void kvm_timer_sync_to_cpu(struct kvm_vcpu *vcpu)
>>> +{
>>> +}
>>> +
>>> +void kvm_timer_sync_from_cpu(struct kvm_vcpu *vcpu)
>>> +{
>>> + ? ? ? struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>>> + ? ? ? cycle_t cval, now;
>>> + ? ? ? u64 ns;
>>> +
>>> + ? ? ? /* Check if the timer is enabled and unmasked first */
>>> + ? ? ? if ((timer->cntv_ctl & 3) != 1)
>>> + ? ? ? ? ? ? ? return;
>>> +
>>> + ? ? ? cval = ((cycle_t)timer->cntv_cval_high << 32) | timer->cntv_cval_low;
>>> + ? ? ? now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
>>
>> when is this adjusted and according to what?
>
> cntvoff is never adjusted. It is sampled exactly once when the VM is
> created, and that's about it. And actually, the feature is disabled for
> the time being.
>

ok, so the point is simply to say reasonably to the guest it's been
running for X time, not counting time when the guest was not
scheduled?

>>> +
>>> + ? ? ? if (timer->armed) {
>>> + ? ? ? ? ? ? ? if (cval == timer->cval)
>>> + ? ? ? ? ? ? ? ? ? ? ? return; ? ? ? ? /* Already programmed */
>>> + ? ? ? ? ? ? ? hrtimer_cancel(&timer->timer);
>>> + ? ? ? ? ? ? ? cancel_work_sync(&timer->expired);
>>> + ? ? ? ? ? ? ? timer->armed = false;
>>> + ? ? ? }
>>> +
>>> + ? ? ? if (cval <= now) {
>>> + ? ? ? ? ? ? ? /*
>>> + ? ? ? ? ? ? ? ?* Timer has already expired while we were not
>>> + ? ? ? ? ? ? ? ?* looking. Inject the interrupt and carry on.
>>> + ? ? ? ? ? ? ? ?*/
>>> + ? ? ? ? ? ? ? kvm_timer_inject_irq(vcpu);
>>> + ? ? ? ? ? ? ? return;
>>> + ? ? ? }
>>> +
>>> + ? ? ? timer->cval = cval;
>>> + ? ? ? timer->armed = true;
>>> + ? ? ? ns = cyclecounter_cyc2ns(timecounter->cc, cval - now);
>>> + ? ? ? hrtimer_start(&timer->timer, ktime_add_ns(ktime_get(), ns),
>>> + ? ? ? ? ? ? ? ? ? ? HRTIMER_MODE_ABS);
>>> +}
>>> +
>>> +void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
>>> +{
>>> + ? ? ? struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>>> +
>>> + ? ? ? INIT_WORK(&timer->expired, kvm_timer_inject_irq_work);
>>> + ? ? ? hrtimer_init(&timer->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
>>> + ? ? ? timer->timer.function = kvm_timer_expire;
>>> +}
>>> +
>>> +int kvm_timer_hyp_init(void)
>>> +{
>>> + ? ? ? timecounter = arch_timer_get_timecounter();
>>> + ? ? ? if (!timecounter)
>>> + ? ? ? ? ? ? ? return -ENODEV;
>>> +
>>> + ? ? ? wqueue = create_singlethread_workqueue("kvm_arch_timer");
>>> + ? ? ? if (!wqueue)
>>> + ? ? ? ? ? ? ? return -ENOMEM;
>>> +
>>> + ? ? ? arch_timer_switch_to_phys(kvm_arch_timer_handler);
>>> + ? ? ? return 0;
>>> +}
>>> +
>>> +void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
>>> +{
>>> + ? ? ? struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>>> +
>>> + ? ? ? hrtimer_cancel(&timer->timer);
>>> + ? ? ? cancel_work_sync(&timer->expired);
>>> +}
>>> +
>>> +int kvm_timer_init(struct kvm *kvm)
>>> +{
>>> +#if 0
>>> + ? ? ? kvm->arch.timer.cntvoff = kvm_phys_timer_read();
>>> +#endif
>>
>> what's this?
>
> Workaround for a bug in the model, and a new version should be released
> shortly. HW doesn't have this issue, fortunately. Once the new model is
> in the wild, we can enable this and get proper virtual offset.
>

ok

-Christoffer



[Index of Archives]     [Linux KVM]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux