Re: [PATCH] KVM: LAPIC: Fix pending interrupt in IRR blocked by software disable LAPIC

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 02/07/19 11:25, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> 
> Thomas reported that:
> 
>  | Background:
>  | 
>  |    In preparation of supporting IPI shorthands I changed the CPU offline
>  |    code to software disable the local APIC instead of just masking it.
>  |    That's done by clearing the APIC_SPIV_APIC_ENABLED bit in the APIC_SPIV
>  |    register.
>  | 
>  | Failure:
>  | 
>  |    When the CPU comes back online the startup code triggers occasionally
>  |    the warning in apic_pending_intr_clear(). That complains that the IRRs
>  |    are not empty.
>  | 
>  |    The offending vector is the local APIC timer vector who's IRR bit is set
>  |    and stays set.
>  | 
>  | It took me quite some time to reproduce the issue locally, but now I can
>  | see what happens.
>  | 
>  | It requires apicv_enabled=0, i.e. full apic emulation. With apicv_enabled=1
>  | (and hardware support) it behaves correctly.
>  | 
>  | Here is the series of events:
>  | 
>  |     Guest CPU
>  | 
>  |     goes down
>  | 
>  |       native_cpu_disable()		
>  | 
>  | 			apic_soft_disable();
>  | 
>  |     play_dead()
>  | 
>  |     ....
>  | 
>  |     startup()
>  | 
>  |       if (apic_enabled())
>  |         apic_pending_intr_clear()	<- Not taken
>  | 
>  |      enable APIC
>  | 
>  |         apic_pending_intr_clear()	<- Triggers warning because IRR is stale
>  | 
>  | When this happens then the deadline timer or the regular APIC timer -
>  | happens with both, has fired shortly before the APIC is disabled, but the
>  | interrupt was not serviced because the guest CPU was in an interrupt
>  | disabled region at that point.
>  | 
>  | The state of the timer vector ISR/IRR bits:
>  | 
>  |     	     	       	        ISR     IRR
>  | before apic_soft_disable()    0	      1
>  | after apic_soft_disable()     0	      1
>  | 
>  | On startup		      		 0	      1
>  | 
>  | Now one would assume that the IRR is cleared after the INIT reset, but this
>  | happens only on CPU0.
>  | 
>  | Why?
>  | 
>  | Because our CPU0 hotplug is just for testing to make sure nothing breaks
>  | and goes through an NMI wakeup vehicle because INIT would send it through
>  | the boots-trap code which is not really working if that CPU was not
>  | physically unplugged.
>  | 
>  | Now looking at a real world APIC the situation in that case is:
>  | 
>  |     	     	       	      	ISR     IRR
>  | before apic_soft_disable()    0	      1
>  | after apic_soft_disable()     0	      1
>  | 
>  | On startup		      		 0	      0
>  | 
>  | Why?
>  | 
>  | Once the dying CPU reenables interrupts the pending interrupt gets
>  | delivered as a spurious interupt and then the state is clear.
>  | 
>  | While that CPU0 hotplug test case is surely an esoteric issue, the APIC
>  | emulation is still wrong, Even if the play_dead() code would not enable
>  | interrupts then the pending IRR bit would turn into an ISR .. interrupt
>  | when the APIC is reenabled on startup.
> 
> 
> From SDM 10.4.7.2 Local APIC State After It Has Been Software Disabled
> * Pending interrupts in the IRR and ISR registers are held and require
>   masking or handling by the CPU.
> 
> In Thomas's testing, hardware cpu will not respect soft disable LAPIC 
> when IRR has already been set or APICv posted-interrupt is in flight, 
> so we can skip soft disable APIC checking when clearing IRR and set ISR,
> continue to respect soft disable APIC when attempting to set IRR.
> 
> Reported-by: Rong Chen <rong.a.chen@xxxxxxxxx>
> Reported-by: Feng Tang <feng.tang@xxxxxxxxx>
> Reported-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Tested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Rong Chen <rong.a.chen@xxxxxxxxx>
> Cc: Feng Tang <feng.tang@xxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> ---
>  arch/x86/kvm/lapic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 05d8934..f857a12 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2376,7 +2376,7 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
>  	struct kvm_lapic *apic = vcpu->arch.apic;
>  	u32 ppr;
>  
> -	if (!apic_enabled(apic))
> +	if (!kvm_apic_hw_enabled(apic))
>  		return -1;
>  
>  	__apic_update_ppr(apic, &ppr);
> 

Queued, thanks.

Paolo



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux