Re: [PATCH 2/2] KVM: arm64: nVHE: Don't consume host SErrors with RAS

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 31, 2020 at 09:00:03AM +0100, Marc Zyngier wrote:
> Hi Andrew,
> 
> On 2020-07-30 23:31, Andrew Scull wrote:
> > On Thu, Jul 30, 2020 at 04:18:23PM +0100, Andrew Scull wrote:
> > > The ESB at the start of the vectors causes any SErrors to be
> > > consumed to
> > > DISR_EL1. If the exception came from the host and the ESB caught an
> > > SError, it would not be noticed until a guest exits and DISR_EL1 is
> > > checked. Further, the SError would be attributed to the guest and not
> > > the host.
> > > 
> > > To avoid these problems, use a different exception vector for the host
> > > that does not use an ESB but instead leaves any host SError pending. A
> > > guest will not be entered if an SError is pending so it will always be
> > > the host that will receive and handle it.
> > 
> > Thinking further, I'm not sure this actually solves all of the problem.
> > It does prevent hyp from causing a host SError to be consumed but, IIUC,
> > there could be an SError already deferred by the host and logged in
> > DISR_EL1 that hyp would not preserve if a guest is run.
> > 
> > I think the host's DISR_EL1 would need to be saved and restored in the
> > vcpu context switch which, from a cursory read of the ARM, is possible
> > without having to virtualize SErrors for the host.
> 
> The question is what do you if you have something pending in DISR_EL1
> at the point where you enter EL2? Context switching it is not going to
> help. One problem is that you'd need to do an ESB, corrupting DISR_EL1,
> before any memory access (I'm assuming you can get traps where all
> registers are live). I can't see how we square this circle.

I'll expand on what I understand (or think I do) about RAS at the
moment. It should hopefully highlight anything wrong with my reasoning
for your questions.

DISR_EL1.A being set means a pending physical SError has been
consumed/cleared. The host has already deferred an SError so saving and
restoring (i.e. preserving) DISR_EL1 for the host would mean the
deferred SError is as it was on return to the host.

If there is a pending physical SError, we'd have to keep it pending so
the host can consume it. __guest_enter has the dsb-isb for RAS so
SErrors will become pending, but it doesn't consume them. I can't
remember now whether this was reliable or not; I assumed it was as it is
gated on the RAS config.

The above didn't need an ESB for the host but incorrect assumptions
might change that.

> Furthermore, assuming you find a way to do it, what do you do with it?
> 
> (a) Either there was something pending already and it is still pending,

If a physical SError is pending, you leave it pending and go back to the
host so it can consume it.

> (b) Or there was nothing pending and you now have an error that you
>     don't know how to report (the host EL1 never issued an ESB)

If there isn't a physical SError pending, either there is no SError at
all (easy) or the SError has already been consumed to DISR_EL1 by a host
ESB and we'd preserve DISR_EL1 for the host to handle however it chooses.

> We could just error out on hypercalls if DISR_EL1 is non-zero, but
> I don't see how we do that for traps, as it would just confuse the host
> EL1.

Traps would need to be left pending. Detected but not consumed with the
dsb-isb in __guest_enter.
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm



[Index of Archives]     [Linux KVM]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux