Hi Marc, On Thu, May 04, 2017 at 01:22:25PM +0200, Christoffer Dall wrote: > On Thu, May 04, 2017 at 11:54:06AM +0100, Marc Zyngier wrote: > > On 04/05/17 10:59, Christoffer Dall wrote: > > > On Thu, May 04, 2017 at 10:34:32AM +0100, Marc Zyngier wrote: > > >> On 03/05/17 19:33, Christoffer Dall wrote: > > >>> First we define an ABI using the vcpu devices that lets userspace set > > >>> the interrupt numbers for the various timers on both the 32-bit and > > >>> 64-bit KVM/ARM implementations. > > >>> > > >>> Second, we add the definitions for the groups and attributes introduced > > >>> by the above ABI. (We add the PMU define on the 32-bit side as well for > > >>> symmetry and it may get used some day.) > > >>> > > >>> Third, we set up the arch-specific vcpu device operation handlers to > > >>> call into the timer code for anything related to the > > >>> KVM_ARM_VCPU_TIMER_CTRL group. > > >>> > > >>> Fourth, we implement support for getting and setting the timer interrupt > > >>> numbers using the above defined ABI in the arch timer code. > > >>> > > >>> Fifth, we introduce error checking upon enabling the arch timer (which > > >>> is called when first running a VCPU) to check that all VCPUs are > > >>> configured to use the same PPI for the timer (as mandated by the > > >>> architecture) and that the virtual and physical timers are not > > >>> configured to use the same IRQ number. > > >>> > > >>> Signed-off-by: Christoffer Dall <cdall@xxxxxxxxxx> > > >>> --- > > >>> Documentation/virtual/kvm/devices/vcpu.txt | 25 +++++++ > > >>> arch/arm/include/uapi/asm/kvm.h | 8 +++ > > >>> arch/arm/kvm/guest.c | 9 +++ > > >>> arch/arm64/include/uapi/asm/kvm.h | 3 + > > >>> arch/arm64/kvm/guest.c | 9 +++ > > >>> include/kvm/arm_arch_timer.h | 4 ++ > > >>> virt/kvm/arm/arch_timer.c | 104 +++++++++++++++++++++++++++++ > > >>> 7 files changed, 162 insertions(+) > > >>> > > >>> diff --git a/Documentation/virtual/kvm/devices/vcpu.txt b/Documentation/virtual/kvm/devices/vcpu.txt > > >>> index 352af6e..013e3f1 100644 > > >>> --- a/Documentation/virtual/kvm/devices/vcpu.txt > > >>> +++ b/Documentation/virtual/kvm/devices/vcpu.txt > > >>> @@ -35,3 +35,28 @@ Returns: -ENODEV: PMUv3 not supported or GIC not initialized > > >>> Request the initialization of the PMUv3. If using the PMUv3 with an in-kernel > > >>> virtual GIC implementation, this must be done after initializing the in-kernel > > >>> irqchip. > > >>> + > > >>> + > > >>> +2. GROUP: KVM_ARM_VCPU_TIMER_CTRL > > >>> +Architectures: ARM,ARM64 > > >>> + > > >>> +2.1. ATTRIBUTE: KVM_ARM_VCPU_TIMER_IRQ_VTIMER > > >>> +2.2. ATTRIBUTE: KVM_ARM_VCPU_TIMER_IRQ_PTIMER > > >>> +Parameters: in kvm_device_attr.addr the address for the timer interrupt is a > > >>> + pointer to an int > > >>> +Returns: -EINVAL: Invalid timer interrupt number > > >>> + -EBUSY: One or more VCPUs has already run > > >>> + > > >>> +A value describing the architected timer interrupt number when connected to an > > >>> +in-kernel virtual GIC. These must be a PPI (16 <= intid < 32). If an > > >>> +attribute is not set, a default value is applied (see below). > > >>> + > > >>> +KVM_ARM_VCPU_TIMER_IRQ_VTIMER: The EL1 virtual timer intid (default: 27) > > >>> +KVM_ARM_VCPU_TIMER_IRQ_PTIMER: The EL1 physical timer intid (default: 30) > > >> > > >> uber nit: my reading of the code is that the default is set at VCPU > > >> creation time. So setting the attribute overrides the default, not that > > >> the default is applied if no attribute is set (i.e. there is always a > > >> valid value). > > >> > > > > > > uh, right, I don't see the distinction though, so not sure how to > > > correct the text. > > > > > > Would "Setting the attribute overrides the default values (see below)." > > > work instead? > > > > Looks good to me. > > > > [...] > > > > >> > > >> Another issue that just popped in my head: how do we ensure that we > > >> don't clash between the PMU and the timers? Yes, that's a foolish thing > > >> to do, but that's no different from avoiding the same interrupt on the > > >> timers... > > >> > > > Argh, good point. > > > > > > So actually, nothing *really bad* happens from using the same IRQ number > > > except that the VM gets confused. However, it's living life dangerously > > > because we use the HW bit for the vtimer, so we if ever start using it > > > for other timers or the PMU, then you could end up in a situation where > > > you unmap the phys mapping on behalf of another device, and the physical > > > interrupt could be left in an active state. > > > > > > On the other hand, the vtimer currently belongs only to VMs and we set > > > the required physical active state before entering the VM, so maybe it > > > doesn't matter. > > > > So far, we always assume that there is never more than a single source > > per interrupt. We'll end-up with weird behaviours because our line_level > > field is not an OR of the various inputs, but a direct assignment > > (device A and B raise the line, then A drops it -> B's interrupt is gone). > > > > I think that only the guest will be confused by it, but accepting it may > > be interpreted as "this should work correctly". Would documenting that > > it is a bad idea be enough? > > > > > Should we just forego these checks and let the user shoot itself in the > > > foot if he/she wants to? > > > > If the documentation is enough, why not. Otherwise, we need some form of > > allocator. Boring. ;-) > > > > Well, it's not that bad really (untested): > > diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h > index 1541f5d..122f9d3 100644 > --- a/include/kvm/arm_vgic.h > +++ b/include/kvm/arm_vgic.h > @@ -121,6 +121,9 @@ struct vgic_irq { > u8 source; /* GICv2 SGIs only */ > u8 priority; > enum vgic_irq_config config; /* Level or edge */ > + > + void *owner; /* Opaque pointer to reserve an interrupt > + for in-kernel devices. */ > }; > > struct vgic_register_region; > diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c > index 3d0979c..7561d2d 100644 > --- a/virt/kvm/arm/vgic/vgic.c > +++ b/virt/kvm/arm/vgic/vgic.c > @@ -429,6 +429,42 @@ int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq) > } > > /** > + * kvm_vgic_set_owner - Set the owner of an interrupt for a VM > + * > + * @kvm: Pointer to the VM structure > + * @intid: The virtual INTID identifying the interrupt (PPI or SPI) > + * @owner: Opaque pointer to the owner > + * > + * Returns 0 if intid is not already used by another in-kernel device and the > + * owner is set, otherwise returns an error code. > + * > + * We only set the owner for VCPU 0 for PPIs. > + */ > +int kvm_vgic_set_owner(struct kvm *kvm, unsigned int intid, void *owner) > +{ > + struct vgic_irq *irq; > + struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0); > + int ret = 0; > + > + if (!vgic_initialized(kvm)) > + return -EAGAIN; > + > + /* SGIs and LPIs cannot be wired up to any device */ > + if (!irq_is_ppi(intid) && !vgic_valid_spi(kvm, intid)) > + return -EINVAL; > + > + irq = vgic_get_irq(kvm, vcpu, intid); > + spin_lock(&irq->irq_lock); > + if (irq->owner && irq->owner != owner) > + ret = -EEXIST; > + else > + irq->owner = owner; > + spin_unlock(&irq->irq_lock); > + > + return ret; > +} > + > +/** > * vgic_prune_ap_list - Remove non-relevant interrupts from the list > * > * @vcpu: The VCPU pointer > > The problem is that it doesn't help that much, because userspace can > still change the level of an IRQ which is connected to an in-kernel > device, unless we also introduce checking the owner field in the > injection path. > > I don't see it blowing up the host with/without an allocator, so I'm > fine with documentation, but I can also include the above. > > Thoughts? > I think we lost track of this series because it got stuck here. I'm tempted to just apply the series with the comment and without the owner stuff posted above, and then we can fix it up later if it turns out there's a problem. Thanks, -Christoffer