booke64 was not maintaing consistent lazy ee state when exiting the guest, leading to warnings and worse. booke32 was less affected due to the absence of lazy ee, but it was still feeding bad information into trace_hardirqs_off/on -- we don't want guest execution to be seen as an "IRQs off" interval. book3s_pr also has this problem. book3s_pr and booke both used kvmppc_lazy_ee_enable() without hard-disabling EE first, which could lead to races when irq_happened is cleared, or if an interrupt happens after kvmppc_lazy_ee_enable(), and possibly other issues. Now, on book3s_pr and booke, always hard-disable interrupts before kvmppc_prepare_to_enter(), but leave them soft-enabled. On book3s, this should results in the right lazy EE state when the asm code hard-enables on an exit. On booke, we call hard_irq_disable() rather than hard-enable immediately. Signed-off-by: Scott Wood <scottwood@xxxxxxxxxxxxx> Cc: Mihai Caraman <mihai.caraman@xxxxxxxxxxxxx> Cc: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx> Cc: Tiejun Chen <tiejun.chen@xxxxxxxxxxxxx> --- Only tested on booke (32 and 64 bit). Testers of book3s_pr would be appreciated (particularly with lockdep enabled). --- arch/powerpc/include/asm/kvm_ppc.h | 7 +++++++ arch/powerpc/kvm/book3s_pr.c | 6 ++++-- arch/powerpc/kvm/booke.c | 12 ++++++++++-- 3 files changed, 21 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index a5287fe..e55d7e5 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -399,6 +399,13 @@ static inline void kvmppc_mmu_flush_icache(pfn_t pfn) static inline void kvmppc_lazy_ee_enable(void) { #ifdef CONFIG_PPC64 + /* + * To avoid races, the caller must have gone directly from having + * interrupts fully-enabled to hard-disabled. + */ + WARN_ON(local_paca->irq_happened != PACA_IRQ_HARD_DIS); + trace_hardirqs_on(); + /* Only need to enable IRQs by hard enabling them after this */ local_paca->irq_happened = 0; local_paca->soft_enabled = 1; diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c index d09baf1..a1e70113 100644 --- a/arch/powerpc/kvm/book3s_pr.c +++ b/arch/powerpc/kvm/book3s_pr.c @@ -884,7 +884,8 @@ program_interrupt: * and if we really did time things so badly, then we just exit * again due to a host external interrupt. */ - local_irq_disable(); + hard_irq_disable(); + trace_hardirqs_off(); s = kvmppc_prepare_to_enter(vcpu); if (s <= 0) { local_irq_enable(); @@ -1121,7 +1122,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) * really did time things so badly, then we just exit again due to * a host external interrupt. */ - local_irq_disable(); + hard_irq_disable(); + trace_hardirqs_off(); ret = kvmppc_prepare_to_enter(vcpu); if (ret <= 0) { local_irq_enable(); diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index ecbe908..5dc1f53 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -666,7 +666,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) return -EINVAL; } - local_irq_disable(); + hard_irq_disable(); + trace_hardirqs_off(); s = kvmppc_prepare_to_enter(vcpu); if (s <= 0) { local_irq_enable(); @@ -834,6 +835,12 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, int s; int idx; +#ifdef CONFIG_PPC64 + WARN_ON(local_paca->irq_happened != 0); +#endif + hard_irq_disable(); + trace_hardirqs_off(); + /* update before a new last_exit_type is rewritten */ kvmppc_update_timing_stats(vcpu); @@ -1150,7 +1157,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, * aren't already exiting to userspace for some other reason. */ if (!(r & RESUME_HOST)) { - local_irq_disable(); + hard_irq_disable(); + trace_hardirqs_off(); s = kvmppc_prepare_to_enter(vcpu); if (s <= 0) { local_irq_enable(); -- 1.7.10.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html