On 27/10/15 06:13, Paul Mackerras wrote: > When handling a hypervisor data or instruction storage interrupt (HDSI > or HISI), we look up the SLB entry for the address being accessed in > order to translate the effective address to a virtual address which can > be looked up in the guest HPT. This lookup can occasionally fail due > to the guest replacing an SLB entry without invalidating the evicted > SLB entry. In this situation an ERAT (effective to real address > translation cache) entry can persist and be used by the hardware even > though there is no longer a corresponding SLB entry. > > Previously we would just deliver a data or instruction storage interrupt > (DSI or ISI) to the guest in this case. However, this is not correct > and has been observed to cause guests to crash, typically with a > data storage protection interrupt on a store to the vmemmap area. > > Instead, what we do now is to synthesize a data or instruction segment > interrupt. That should cause the guest to reload an appropriate entry > into the SLB and retry the faulting instruction. If it still faults, > we should find an appropriate SLB entry next time and be able to handle > the fault. > > Signed-off-by: Paul Mackerras <paulus@xxxxxxxxx> > --- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 20 ++++++++++++-------- > 1 file changed, 12 insertions(+), 8 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index b1dab8d..3c6badc 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -1749,7 +1749,8 @@ kvmppc_hdsi: > beq 3f > clrrdi r0, r4, 28 > PPC_SLBFEE_DOT(R5, R0) /* if so, look up SLB */ > - bne 1f /* if no SLB entry found */ > + li r0, BOOK3S_INTERRUPT_DATA_SEGMENT > + bne 7f /* if no SLB entry found */ > 4: std r4, VCPU_FAULT_DAR(r9) > stw r6, VCPU_FAULT_DSISR(r9) > > @@ -1768,14 +1769,15 @@ kvmppc_hdsi: > cmpdi r3, -2 /* MMIO emulation; need instr word */ > beq 2f > > - /* Synthesize a DSI for the guest */ > + /* Synthesize a DSI (or DSegI) for the guest */ > ld r4, VCPU_FAULT_DAR(r9) > mr r6, r3 > -1: mtspr SPRN_DAR, r4 > +1: li r0, BOOK3S_INTERRUPT_DATA_STORAGE > mtspr SPRN_DSISR, r6 > +7: mtspr SPRN_DAR, r4 I first though "hey, you're not setting DSISR for the DsegI now" ... but then I saw in the PowerISA that DSISR is "undefined" for the DSegI, so this should be fine, I think. > mtspr SPRN_SRR0, r10 > mtspr SPRN_SRR1, r11 > - li r10, BOOK3S_INTERRUPT_DATA_STORAGE > + mr r10, r0 > bl kvmppc_msr_interrupt > fast_interrupt_c_return: > 6: ld r7, VCPU_CTR(r9) > @@ -1823,7 +1825,8 @@ kvmppc_hisi: > beq 3f > clrrdi r0, r10, 28 > PPC_SLBFEE_DOT(R5, R0) /* if so, look up SLB */ > - bne 1f /* if no SLB entry found */ > + li r0, BOOK3S_INTERRUPT_INST_SEGMENT > + bne 7f /* if no SLB entry found */ > 4: > /* Search the hash table. */ > mr r3, r9 /* vcpu pointer */ > @@ -1840,11 +1843,12 @@ kvmppc_hisi: > cmpdi r3, -1 /* handle in kernel mode */ > beq guest_exit_cont > > - /* Synthesize an ISI for the guest */ > + /* Synthesize an ISI (or ISegI) for the guest */ > mr r11, r3 > -1: mtspr SPRN_SRR0, r10 > +1: li r0, BOOK3S_INTERRUPT_INST_STORAGE > +7: mtspr SPRN_SRR0, r10 > mtspr SPRN_SRR1, r11 > - li r10, BOOK3S_INTERRUPT_INST_STORAGE > + mr r10, r0 > bl kvmppc_msr_interrupt > b fast_interrupt_c_return As far as I can see (but I'm really not an expert here), the patch seems to be fine. And we have a test running with it for almost 6 days now and did not see any further guest crashes, so it seems to me like this is the right fix for this issue! So if you like, you can add my "Reviewed-by" or "Tested-by" to this patch. Thanks a lot for fixing this! Thomas -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html