On Thu, May 04, 2017 at 01:47:41PM +0200, Paolo Bonzini wrote:
> 
> 
> On 03/05/2017 18:06, Andrew Jones wrote:
> > Don't use request-less VCPU kicks when injecting IRQs, as a VCPU
> > kick meant to trigger the interrupt injection could be sent while
> > the VCPU is outside guest mode, which means no IPI is sent, and
> > after it has called kvm_vgic_flush_hwstate(), meaning it won't see
> > the updated GIC state until its next exit some time later for some
> > other reason. The receiving VCPU only needs to check this request
> > in VCPU RUN to handle it. By checking it, if it's pending, a
> > memory barrier will be issued that ensures all state is visible.
> > We still create a vcpu_req_irq_pending() function (which is a nop),
> > though, in order to allow us to use the standard request checking
> > pattern.
> 
> I wonder if you aren't just papering over this race:
> 
> 	/*
> 	 * If there are no virtual interrupts active or pending for this
> 	 * VCPU, then there is no work to do and we can bail out without
> 	 * taking any lock.  There is a potential race with someone injecting
> 	 * interrupts to the VCPU, but it is a benign race as the VCPU will
> 	 * either observe the new interrupt before or after doing this check,
> 	 * and introducing additional synchronization mechanism doesn't change
> 	 * this.
> 	 */
> 	if (list_empty(&vcpu->arch.vgic_cpu.ap_list_head))
> 		return;
> 
> 	spin_lock(&vcpu->arch.vgic_cpu.ap_list_lock);
> 	vgic_flush_lr_state(vcpu);
> 	spin_unlock(&vcpu->arch.vgic_cpu.ap_list_lock);
> 
> not being so "benign" after all. :)  Maybe you can remove the
> if (list_empty()), and have kvm_arch_vcpu_ioctl_run do this instead:

I don't see how removing this shortcut improves anything. You'd still
have the same window where you could lose an interrupt right after the
spin_unlock.
I think the race that this comment discusses is indeed benign, but the
overall guarantee that our vgic injection relies on is flawed, and that
can be solved either by doing requests as Drew does here, or by moving
the vgic flush inside a region that has both mode == IN_GUEST_MODE and
interrupts disabled. Note that for other purposes I'm planning to move
the flush functions inside the interrupts-disabled region later anyhow.

I don't see a problem with Drew's patch, actually.

Thanks,
-Christoffer

> 	if (kvm_request_pending(vcpu)) {
> 		if (kvm_check_request(KVM_REQ_SLEEP, vcpu))
> 			vcpu_req_sleep(vcpu);
> 	}
> 
> 	preempt_disable();
> 
> 	kvm_pmu_flush_hwstate(vcpu);
> 	kvm_timer_flush_hwstate(vcpu);
> 
> 	if (kvm_check_request(KVM_REQ_IRQ_PENDING, vcpu))
> 		kvm_vgic_flush_hwstate(vcpu);
> 
> ?
> 
> Paolo
> 
> > Signed-off-by: Andrew Jones <drjones@xxxxxxxxxx>
> > ---
> >  arch/arm/include/asm/kvm_host.h   |  1 +
> >  arch/arm/kvm/arm.c                | 12 ++++++++++++
> >  arch/arm64/include/asm/kvm_host.h |  1 +
> >  virt/kvm/arm/arch_timer.c         |  1 +
> >  virt/kvm/arm/vgic/vgic.c          |  9 +++++++--
> >  5 files changed, 22 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> > index 41669578b3df..7bf90aaf2e87 100644
> > --- a/arch/arm/include/asm/kvm_host.h
> > +++ b/arch/arm/include/asm/kvm_host.h
> > @@ -47,6 +47,7 @@
> >  
> >  #define KVM_REQ_SLEEP \
> >  	KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_NO_WAKEUP | KVM_REQUEST_WAIT)
> > +#define KVM_REQ_IRQ_PENDING	KVM_ARCH_REQ(1)
> >  
> >  u32 *kvm_vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num, u32 mode);
> >  int __attribute_const__ kvm_target_cpu(void);
> > diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> > index d62e99885434..330064475914 100644
> > --- a/arch/arm/kvm/arm.c
> > +++ b/arch/arm/kvm/arm.c
> > @@ -581,6 +581,15 @@ static void vcpu_req_sleep(struct kvm_vcpu *vcpu)
> >  				       (!vcpu->arch.pause)));
> >  }
> >  
> > +static void vcpu_req_irq_pending(struct kvm_vcpu *vcpu)
> > +{
> > +	/*
> > +	 * Nothing to do here.
> > +	 * kvm_check_request() already issued a memory
> > +	 * barrier that pairs with kvm_make_request(), so all hardware state
> > +	 * we need to flush should now be visible.
> > +	 */
> > +}
> > +
> >  static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu)
> >  {
> >  	return vcpu->arch.target >= 0;
> > @@ -634,6 +643,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >  	if (kvm_request_pending(vcpu)) {
> >  		if (kvm_check_request(KVM_REQ_SLEEP, vcpu))
> >  			vcpu_req_sleep(vcpu);
> > +		if (kvm_check_request(KVM_REQ_IRQ_PENDING, vcpu))
> > +			vcpu_req_irq_pending(vcpu);
> >  	}
> >  
> >  	/*
> > @@ -777,6 +788,7 @@ static int vcpu_interrupt_line(struct kvm_vcpu *vcpu, int number, bool level)
> >  	 * trigger a world-switch round on the running physical CPU to set the
> >  	 * virtual IRQ/FIQ fields in the HCR appropriately.
> >  	 */
> > +	kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu);
> >  	kvm_vcpu_kick(vcpu);
> >  
> >  	return 0;
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index 04c0f9d37386..2c33fef945fe 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -44,6 +44,7 @@
> >  
> >  #define KVM_REQ_SLEEP \
> >  	KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_NO_WAKEUP | KVM_REQUEST_WAIT)
> > +#define KVM_REQ_IRQ_PENDING	KVM_ARCH_REQ(1)
> >  
> >  int __attribute_const__ kvm_target_cpu(void);
> >  int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
> > diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> > index 5976609ef27c..469b43315c0a 100644
> > --- a/virt/kvm/arm/arch_timer.c
> > +++ b/virt/kvm/arm/arch_timer.c
> > @@ -95,6 +95,7 @@ static void kvm_timer_inject_irq_work(struct work_struct *work)
> >  	 * If the vcpu is blocked we want to wake it up so that it will see
> >  	 * the timer has expired when entering the guest.
> >  	 */
> > +	kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu);
> >  	kvm_vcpu_kick(vcpu);
> >  }
> >  
> > diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> > index 3d0979c30721..bdd4b3a953b5 100644
> > --- a/virt/kvm/arm/vgic/vgic.c
> > +++ b/virt/kvm/arm/vgic/vgic.c
> > @@ -283,8 +283,10 @@ bool vgic_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq)
> >  		 * won't see this one until it exits for some other
> >  		 * reason.
> >  		 */
> > -		if (vcpu)
> > +		if (vcpu) {
> > +			kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu);
> >  			kvm_vcpu_kick(vcpu);
> > +		}
> >  		return false;
> >  	}
> >  
> > @@ -330,6 +332,7 @@ bool vgic_queue_irq_unlock(struct kvm *kvm, struct vgic_irq *irq)
> >  	spin_unlock(&irq->irq_lock);
> >  	spin_unlock(&vcpu->arch.vgic_cpu.ap_list_lock);
> >  
> > +	kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu);
> >  	kvm_vcpu_kick(vcpu);
> >  
> >  	return true;
> > @@ -719,8 +722,10 @@ void vgic_kick_vcpus(struct kvm *kvm)
> >  	 * a good kick...
> >  	 */
> >  	kvm_for_each_vcpu(c, vcpu, kvm) {
> > -		if (kvm_vgic_vcpu_pending_irq(vcpu))
> > +		if (kvm_vgic_vcpu_pending_irq(vcpu)) {
> > +			kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu);
> >  			kvm_vcpu_kick(vcpu);
> > +		}
> >  	}
> >  }