Re: [PATCH v3 7/8] arm64: exception: handle asynchronous SError interrupt

James Morse <james.morse@xxxxxxx> · Thu, 20 Apr 2017 09:52:59 +0100

Hi Wang Xiongfeng,

On 19/04/17 03:37, Xiongfeng Wang wrote:
> On 2017/4/18 18:51, James Morse wrote:
>> The host expects to receive physical SError Interrupts. The ARM-ARM doesn't
>> describe a way to inject these as they are generated by the CPU.
>>
>> Am I right in thinking you want this to use SError Interrupts as an APEI
>> notification? (This isn't a CPU thing so the RAS spec doesn't cover this use)
> 
> Yes, using sei as an APEI notification is one part of my consideration. Another use is for ESB.
> RAS spec 6.5.3 'Example software sequences: Variant: asynchronous External Abort with ESB'
> describes the SEI recovery process when ESB is implemented.
> 
> In this situation, SEI is routed to EL3 (SCR_EL3.EA = 1). When an SEI occurs in EL0 and not been taken immediately,
> and then an ESB instruction at SVC entry is executed, SEI is taken to EL3. The ESB at SVC entry is
> used for preventing the error propagating from user space to kernel space. The EL3 SEI handler collects

> the errors and fills in the APEI table, and then jump to EL2 SEI handler. EL2 SEI handler inject
> an vSEI into EL1 by setting HCR_EL2.VSE = 1, so that when returned to OS, an SEI is pending.

This step has confused me. How would this work with VHE where the host runs at
EL2 and there is nothing at Host:EL1?
>From your description I assume you have some firmware resident at EL2.

> Then ESB is executed again, and DISR_EL1.A is set by hardware (2.4.4 ESB and virtual errors), so that
> the following process can be executed.

> So we want to inject a vSEI into OS, when control is returned from EL3/2 to OS, no matter whether
> it is on host OS or guest OS.

I disagree. With Linux/KVM you can't use Virtual SError to notify the host OS.
The host OS expects to receive Physical SError, these shouldn't be taken to EL2
unless a guest is running. Notifications from firmware that use SEA or SEI
should follow the routing rules in the ARM-ARM, which means they should never
reach a guest OS.

For VHE the host runs at EL2 and sets HCR_EL2.AMO. Any firmware notification
should come from EL3 to the host at EL2. The host may choose to notify the
guest, but this should always go via Qemu.

For non-VHE systems the host runs at EL1 and KVM lives at EL2 to do the world
switch. When KVM is running a guest it sets HCR_EL2.AMO, when it has switched
back to the host it clears it.

Notifications from EL3 that pretend to be SError should route SError to EL2 or
EL1 depending on HCR_EL2.AMO.
When KVM gets a Synchronous External Abort or an SError while a guest is running
it will switch back to the host and tell the handle_exit() code.  Again the host
may choose to notify the guest, but this should always go via Qemu.

The ARM-ARM pseudo code for the routing rules is in
AArch64.TakePhysicalSErrorException(). Firmware injecting fake SError should
behave exactly like this.

Newly appeared in the ARM-ARM is HCR_EL2.TEA, which takes Synchronous External
Aborts to EL2 (if SCR_EL3 hasn't claimed them). We should set/clear this bit
like we do HCR_EL2.AMO if the CPU has the RAS extensions.

>> You cant use SError to cover all the possible RAS exceptions. We already have
>> this problem using SEI if PSTATE.A was set and the exception was an imprecise
>> abort from EL2. We can't return to the interrupted context and we can't deliver
>> an SError to EL2 either.
> 
> SEI came from EL2 and PSTATE.A is set. Is it the situation where VHE is enabled and CPU is running
> in kernel space. If SEI occurs in kernel space, can we just panic or shutdown.

firmware-panic? We can do a little better than that:
If the IESB bit is set in the ESR we can behave as if this were an ESB and have
firmware write an appropriate ESR to DISR_EL1 if PSTATE.A is set and the
exception came from EL2.

Linux should have an ESB in its __switch_to(), I've re-worked the daif masking
in entry.S so that PSTATE.A will always be unmasked here.

Other than these two cases, yes, this CPU really is wrecked. Firmware can power
it off via PSCI, and could notify another CPU about what happened. UEFI's table
250 'Processor Generic Error Section' has a 'Processor ID' field, with a note
that on ARM this is the MPIDR_EL1. Table 260 'ARM Processor Error Section' has
something similar.

Thanks,

James