On Fri, Jul 31, 2020 at 09:00:03AM +0100, Marc Zyngier wrote:
> Hi Andrew,
>
> On 2020-07-30 23:31, Andrew Scull wrote:
> > On Thu, Jul 30, 2020 at 04:18:23PM +0100, Andrew Scull wrote:
> > > The ESB at the start of the vectors causes any SErrors to be
> > > consumed to DISR_EL1. If the exception came from the host and the
> > > ESB caught an SError, it would not be noticed until a guest exits
> > > and DISR_EL1 is checked. Further, the SError would be attributed
> > > to the guest and not the host.
> > >
> > > To avoid these problems, use a different exception vector for the host
> > > that does not use an ESB but instead leaves any host SError pending. A
> > > guest will not be entered if an SError is pending so it will always be
> > > the host that will receive and handle it.
> >
> > Thinking further, I'm not sure this actually solves all of the problem.
> > It does prevent hyp from causing a host SError to be consumed but, IIUC,
> > there could be an SError already deferred by the host and logged in
> > DISR_EL1 that hyp would not preserve if a guest is run.
> >
> > I think the host's DISR_EL1 would need to be saved and restored in the
> > vcpu context switch which, from a cursory read of the ARM, is possible
> > without having to virtualize SErrors for the host.
>
> The question is what do you if you have something pending in DISR_EL1
> at the point where you enter EL2? Context switching it is not going to
> help. One problem is that you'd need to do an ESB, corrupting DISR_EL1,
> before any memory access (I'm assuming you can get traps where all
> registers are live). I can't see how we square this circle.

I'll expand on what I understand (or think I do) about RAS at the
moment. It should hopefully highlight anything wrong with my reasoning
for your questions.

DISR_EL1.A being set means a pending physical SError has been
consumed/cleared. The host has already deferred an SError so saving and
restoring (i.e. preserving) DISR_EL1 for the host would mean the
deferred SError is as it was on return to the host.

If there is a pending physical SError, we'd have to keep it pending so
the host can consume it. __guest_enter has the dsb-isb for RAS so
SErrors will become pending, but it doesn't consume them. I can't
remember now whether this was reliable or not; I assumed it was as it is
gated on the RAS config.

The above didn't need an ESB for the host but incorrect assumptions
might change that.

> Furthermore, assuming you find a way to do it, what do you do with it?
>
> (a) Either there was something pending already and it is still pending,

If a physical SError is pending, you leave it pending and go back to the
host so it can consume it.

> (b) Or there was nothing pending and you now have an error that you
> don't know how to report (the host EL1 never issued an ESB)

If there isn't a physical SError pending, either there is no SError at
all (easy) or the SError has already been consumed to DISR_EL1 by a host
ESB and we'd preserve DISR_EL1 for the host to handle however it
chooses.

> We could just error out on hypercalls if DISR_EL1 is non-zero, but
> I don't see how we do that for traps, as it would just confuse the host
> EL1.

Traps would need to be left pending. Detected but not consumed with the
dsb-isb in __guest_enter.

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
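
For illustration only, the cases argued in the reply above can be
modelled in a few lines of plain C. This is a userspace sketch of the
reasoning, not kernel code and not the patch under discussion: struct
cpu_model, hyp_pre_guest() and hyp_post_guest() are hypothetical names,
and the "pending" flag stands in for what the dsb-isb in __guest_enter
is assumed to make visible. The only architectural detail used is that
DISR_EL1.A is bit 31.

```c
#include <stdbool.h>
#include <stdint.h>

/* DISR_EL1.A: set when an SError has been deferred (consumed by an ESB). */
#define DISR_EL1_A (UINT64_C(1) << 31)

/* Hypothetical model of the two pieces of state the thread talks about. */
struct cpu_model {
	bool serror_pending; /* physical SError pending, not yet consumed */
	uint64_t disr_el1;   /* record of an SError the host already deferred */
};

enum hyp_action {
	HYP_RETURN_TO_HOST, /* leave the SError pending for the host */
	HYP_RUN_GUEST,      /* safe to enter the guest */
};

/*
 * Before entering a guest: a pending physical SError is detected but not
 * consumed, so it stays pending and the host takes it. Otherwise the
 * host's DISR_EL1 is saved, whether or not it records a deferred SError.
 */
static enum hyp_action hyp_pre_guest(const struct cpu_model *cpu,
				     uint64_t *host_disr_save)
{
	if (cpu->serror_pending)
		return HYP_RETURN_TO_HOST;

	*host_disr_save = cpu->disr_el1;
	return HYP_RUN_GUEST;
}

/*
 * After the guest exits: restore the host's DISR_EL1 so a deferred
 * SError is as the host left it.
 */
static void hyp_post_guest(struct cpu_model *cpu, uint64_t host_disr_save)
{
	cpu->disr_el1 = host_disr_save;
}
```

The model deliberately has no ESB on the host path, matching the
assumption stated in the reply; if that assumption is wrong, the model
would need another case.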