[Android-virt] [PATCH 08/10] ARM: KVM: arch_timers: Add guest timer core support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 25/06/12 22:15, Christoffer Dall wrote:
> On Mon, May 14, 2012 at 9:07 AM, Marc Zyngier <marc.zyngier at arm.com> wrote:
>> We can inject a timer interrupt into the guest as a result of
>> three possible events:
>> - The virtual timer interrupt has fired while we were still
>>  executing the guest
> 
> couldn't/shouldn't we wire this to the vGIC support through the HW bit
> in the LRs so it gets delivered directly to the guest without our
> intervention?

You really don't want to go down that road.

For that to work properly, you'd need to start adding priorities to the
host GIC code, so we can still preempt a host running in its interrupt
handler (and prevent it from DoS-ing the host).

And introducing priorities in the interrupt handling is a sure way to
start a nice flame war on LKML... ;-)

> that would take care of all the arch_timer_virt_external_handler stuff
> as well...

No. Because you have no HW distributor at the guest level, you'd still
have to take the interrupt at the host level, notice this is a guest
related interrupt, escape the normal IRQ flow, queue it into the guest
LRs. The only thing you gain is the automatic EOI from the guest to the
HW. All the rest is just pain.

>> - The timer interrupt hasn't fired, but it expired while we
>>  were doing the world switch
>> - A hrtimer we programmed earlier has fired
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier at arm.com>
>> ---
>>  arch/arm/include/asm/kvm_arch_timer.h |   34 +++++++-
>>  arch/arm/kvm/timer.c                  |  158 +++++++++++++++++++++++++++++++++
>>  2 files changed, 191 insertions(+), 1 deletions(-)
>>  create mode 100644 arch/arm/kvm/timer.c
>>
>> diff --git a/arch/arm/include/asm/kvm_arch_timer.h b/arch/arm/include/asm/kvm_arch_timer.h
>> index 491e336..191c541 100644
>> --- a/arch/arm/include/asm/kvm_arch_timer.h
>> +++ b/arch/arm/include/asm/kvm_arch_timer.h
>> @@ -1,13 +1,45 @@
>>  #ifndef __ASM_ARM_KVM_ARCH_TIMER_H
>>  #define __ASM_ARM_KVM_ARCH_TIMER_H
>>
>> +#include <linux/clocksource.h>
>> +#include <linux/hrtimer.h>
>> +#include <linux/workqueue.h>
>> +
>>  struct arch_timer_kvm {
>> +#ifdef CONFIG_KVM_ARM_TIMER
>> +       bool                    enabled;
>> +       union {
>> +               cycle_t         cntvoff;
>> +               struct {
>> +                       u32     cntvoff_high; /* Restored only */
>> +                       u32     cntvoff_low;  /* Restored only */
>> +               } cntvoff32;
>> +       };
>> +#endif
>>  };
>>
>>  struct arch_timer_cpu {
>> +#ifdef CONFIG_KVM_ARM_TIMER
>> +       struct hrtimer          timer;
>> +       struct work_struct      expired;
>> +       cycle_t                 cval;
>> +       bool                    armed;
>> +
>> +       /* Registers */
>> +       u32                     cntv_ctl;       /* Saved/restored */
>> +       u32                     cntv_cval_high; /* Saved/restored */
>> +       u32                     cntv_cval_low;  /* Saved/restored */
>> +#endif
>>  };
> 
> Could you document these structures a little?
> 
> It's hard to tell why something goes on the per-vm struct and why
> something goes on the per-vcpu struct if you're not super familier
> with the generic timer architecture. For example, it's not clear if
> the enabled field in the arch_timer_kvm struct corresponds to some bit
> in hardware set by the guest or if it says if we use arch timers in
> the guest. I'm pretty sure it means that we initialized everything
> correctly, we use the vgic, and we can therefore use the features for
> guests...

Fair enough. I'll add some documentation to that effect.

> also took me annoyingly long time to figure out that 'armed' means
> hrtimer armed outside of guest, not related to guest setting a
> timer...

Well, it is related to the guest setting a timer. It acts as a proxy
timer for the guest.

>> diff --git a/arch/arm/kvm/timer.c b/arch/arm/kvm/timer.c
>> new file mode 100644
>> index 0000000..a342f37
>> --- /dev/null
>> +++ b/arch/arm/kvm/timer.c
>> @@ -0,0 +1,158 @@
>> +/*
>> + * Copyright (C) 2012 ARM Ltd.
>> + * Author: Marc Zyngier <marc.zyngier at arm.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
>> + */
>> +
>> +#include <linux/kvm.h>
>> +#include <linux/kvm_host.h>
>> +#include <linux/interrupt.h>
>> +
>> +#include <asm/arch_timer.h>
>> +
>> +#include <asm/kvm_vgic.h>
>> +#include <asm/kvm_arch_timer.h>
>> +
>> +static const struct kvm_irq_level virt_timer_ppi = {
>> +       .irq    = 27,   /* A7/A15 specific */
> 
> so should this come from a DT somehow or should we check which CPU
> we'er emulating (same as host) at runtime?

This is completely CPU specific, and should be provided by looking at
what we're actually emulating/running on.

>> +       .level  = 1,
>> +};
>> +
>> +static struct timecounter *timecounter;
>> +static struct workqueue_struct *wqueue;
> 
> why do we need our own workqueue here? (as opposed to just scheduling
> work on 'events')

It makes things slightly more predictable, and easier to instrument. x86
does the same.

>> +
>> +static cycle_t kvm_phys_timer_read(void)
>> +{
>> +       return timecounter->cc->read(timecounter->cc);
>> +}
>> +
>> +static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
>> +{
>> +       struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>> +
>> +       timer->cntv_ctl |= 1 << 1; /* Mask the interrupt in the guest */
>> +       kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id, &virt_timer_ppi);
>> +}
>> +
>> +static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
>> +{
>> +       struct kvm_vcpu *vcpu = kvm_vgic_get_current_vcpu();
>> +
>> +       if (WARN_ON(!vcpu))
>> +               return IRQ_NONE;
>> +
>> +       kvm_timer_inject_irq(vcpu);
>> +       return IRQ_HANDLED;
>> +}
>> +
>> +static void kvm_timer_inject_irq_work(struct work_struct *work)
>> +{
>> +       struct kvm_vcpu *vcpu;
>> +
>> +       vcpu = container_of(work, struct kvm_vcpu, arch.timer_cpu.expired);
>> +       vcpu->arch.timer_cpu.armed = false;
>> +       kvm_timer_inject_irq(vcpu);
>> +}
>> +
>> +static enum hrtimer_restart kvm_timer_expire(struct hrtimer *hrt)
>> +{
>> +       struct arch_timer_cpu *timer;
>> +       timer = container_of(hrt, struct arch_timer_cpu, timer);
>> +       queue_work(wqueue, &timer->expired);
>> +       return HRTIMER_NORESTART;
>> +}
>> +
>> +void kvm_timer_sync_to_cpu(struct kvm_vcpu *vcpu)
>> +{
>> +}
>> +
>> +void kvm_timer_sync_from_cpu(struct kvm_vcpu *vcpu)
>> +{
>> +       struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>> +       cycle_t cval, now;
>> +       u64 ns;
>> +
>> +       /* Check if the timer is enabled and unmasked first */
>> +       if ((timer->cntv_ctl & 3) != 1)
>> +               return;
>> +
>> +       cval = ((cycle_t)timer->cntv_cval_high << 32) | timer->cntv_cval_low;
>> +       now = kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
> 
> when is this adjusted and according to what?

cntvoff is never adjusted. It is sampled exactly once when the VM is
created, and that's about it. And actually, the feature is disabled for
the time being.

>> +
>> +       if (timer->armed) {
>> +               if (cval == timer->cval)
>> +                       return;         /* Already programmed */
>> +               hrtimer_cancel(&timer->timer);
>> +               cancel_work_sync(&timer->expired);
>> +               timer->armed = false;
>> +       }
>> +
>> +       if (cval <= now) {
>> +               /*
>> +                * Timer has already expired while we were not
>> +                * looking. Inject the interrupt and carry on.
>> +                */
>> +               kvm_timer_inject_irq(vcpu);
>> +               return;
>> +       }
>> +
>> +       timer->cval = cval;
>> +       timer->armed = true;
>> +       ns = cyclecounter_cyc2ns(timecounter->cc, cval - now);
>> +       hrtimer_start(&timer->timer, ktime_add_ns(ktime_get(), ns),
>> +                     HRTIMER_MODE_ABS);
>> +}
>> +
>> +void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
>> +{
>> +       struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>> +
>> +       INIT_WORK(&timer->expired, kvm_timer_inject_irq_work);
>> +       hrtimer_init(&timer->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
>> +       timer->timer.function = kvm_timer_expire;
>> +}
>> +
>> +int kvm_timer_hyp_init(void)
>> +{
>> +       timecounter = arch_timer_get_timecounter();
>> +       if (!timecounter)
>> +               return -ENODEV;
>> +
>> +       wqueue = create_singlethread_workqueue("kvm_arch_timer");
>> +       if (!wqueue)
>> +               return -ENOMEM;
>> +
>> +       arch_timer_switch_to_phys(kvm_arch_timer_handler);
>> +       return 0;
>> +}
>> +
>> +void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
>> +{
>> +       struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>> +
>> +       hrtimer_cancel(&timer->timer);
>> +       cancel_work_sync(&timer->expired);
>> +}
>> +
>> +int kvm_timer_init(struct kvm *kvm)
>> +{
>> +#if 0
>> +       kvm->arch.timer.cntvoff = kvm_phys_timer_read();
>> +#endif
> 
> what's this?

Workaround for a bug in the model, and a new version should be released
shortly. HW doesn't have this issue, fortunately. Once the new model is
in the wild, we can enable this and get proper virtual offset.

	M.
-- 
Jazz is not dead. It just smells funny...




[Index of Archives]     [Linux KVM]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux