Hi Sami,
(CC: +Marc)
On 19/02/2020 00:08, Sami Tolvanen wrote:
> This patch series adds support for Clang's Shadow Call Stack
> (SCS) mitigation, which uses a separately allocated shadow stack
> to protect against return address overwrites.
I took this for a spin on some real hardware. cpu-idle, kexec, hibernate etc.
all work great... but starting a KVM guest causes the CPU to get stuck in EL2.
With CONFIG_SHADOW_CALL_STACK disabled, this doesn't happen... so it's
something about the feature being enabled.
I'm using clang-9 from Debian bullseye/sid. (I tried to build tip of tree...
that doesn't go so well on arm64.)
KVM takes an instruction abort from EL2 to EL2, because some of the code it
runs is not mapped at EL2:
| ffffa00011588308 <__kvm_tlb_flush_local_vmid>:
| ffffa00011588308: d10103ff sub sp, sp, #0x40
| ffffa0001158830c: f90013f3 str x19, [sp, #32]
| ffffa00011588310: a9037bfd stp x29, x30, [sp, #48]
| ffffa00011588314: 9100c3fd add x29, sp, #0x30
| ffffa00011588318: 97ae18bf bl ffffa0001010e614 <__kern_hyp_va>
INSTRUCTION ABORT!
| ffffa0001158831c: f9400000 ldr x0, [x0]
| ffffa00011588320: 97ae18bd bl ffffa0001010e614 <__kern_hyp_va>
| ffffa00011588324: aa0003f3 mov x19, x0
| ffffa00011588328: 97ae18c1 bl ffffa0001010e62c <has_vhe>
__kern_hyp_va() is a static inline that is patched at boot, wherever it
appears, with the EL2 ASLR values. It converts a kernel linear-map address to
its EL2 KVM alias:
| ffffa0001010dc5c <__kern_hyp_va>:
| ffffa0001010dc5c: 92400000 and x0, x0, #0x1
| ffffa0001010dc60: 93c00400 ror x0, x0, #1
| ffffa0001010dc64: 91000000 add x0, x0, #0x0
| ffffa0001010dc68: 91400000 add x0, x0, #0x0, lsl #12
| ffffa0001010dc6c: 93c0fc00 ror x0, x0, #63
| ffffa0001010dc70: d65f03c0 ret
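
For anyone not familiar with the sequence, here is a rough userspace C model
of what those five instructions compute once the immediates are patched. The
struct, the function names and the example values are made up for
illustration; the real helper is patched inline asm, not C, and the real
values are generated at boot. As far as I can tell the disassembly above shows
the pre-patch placeholders (mask 0x1, rotate by 1, tag 0).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the values patched into the immediates. */
struct sketch_hyp_va_patch {
	uint64_t mask;     /* bits of the kernel VA to keep */
	unsigned tag_lsb;  /* bit position where the tag starts, 1..63 */
	uint64_t tag;      /* up to 24 bits, inserted above the kept bits */
};

/* Rotate right by n bits, like the arm64 ROR instruction. */
static uint64_t sketch_ror64(uint64_t v, unsigned n)
{
	n &= 63;
	return n ? (v >> n) | (v << (64 - n)) : v;
}

/* Model of the and/ror/add/add/ror sequence; not the kernel's code. */
static uint64_t sketch_kern_hyp_va(uint64_t va,
				   const struct sketch_hyp_va_patch *p)
{
	assert(p->tag_lsb >= 1 && p->tag_lsb <= 63);

	va &= p->mask;                           /* and x0, x0, #mask */
	va = sketch_ror64(va, p->tag_lsb);       /* ror x0, x0, #tag_lsb */
	va += p->tag & 0xfff;                    /* add x0, x0, #imm */
	va += p->tag & 0xfff000;                 /* add x0, x0, #imm, lsl #12 */
	va = sketch_ror64(va, 64 - p->tag_lsb);  /* ror x0, x0, #(64 - tag_lsb) */
	return va;
}
```

With an invented mask of the low 40 bits, tag_lsb = 40 and tag = 0xabcdef,
0xffff000012345678 comes out as 0xabcdef0012345678: the low bits are kept and
the tag replaces everything above them.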
The problem here is where __kern_hyp_va() is. It's outside the __hyp_text
section:
| morse@eglon:~/kernel/linux-pigs$ nm -s vmlinux | grep hyp_text
| ffffa0001158b800 T __hyp_text_end
| ffffa000115838a0 T __hyp_text_start
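
To spell out the comparison, a tiny C check using only the addresses from the
nm output and the SCS disassembly above. The helper names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bounds copied from the nm output above. */
#define HYP_TEXT_START 0xffffa000115838a0ULL
#define HYP_TEXT_END   0xffffa0001158b800ULL

/* Hypothetical helper: true if addr is in [__hyp_text_start, __hyp_text_end). */
static bool sketch_in_hyp_text(uint64_t addr)
{
	return addr >= HYP_TEXT_START && addr < HYP_TEXT_END;
}

/* Addresses taken from the disassembly above. */
static void sketch_check(void)
{
	/* __kvm_tlb_flush_local_vmid is inside the range */
	assert(sketch_in_hyp_text(0xffffa00011588308ULL));
	/* the out-of-line __kern_hyp_va is not */
	assert(!sketch_in_hyp_text(0xffffa0001010dc5cULL));
}
```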
If I disable CONFIG_SHADOW_CALL_STACK in Kconfig, I get:
| ffffa00011527fe0 <__kvm_tlb_flush_local_vmid>:
| ffffa00011527fe0: d100c3ff sub sp, sp, #0x30
| ffffa00011527fe4: a9027bfd stp x29, x30, [sp, #32]
| ffffa00011527fe8: 910083fd add x29, sp, #0x20
| ffffa00011527fec: 92400000 and x0, x0, #0x1
| ffffa00011527ff0: 93c00400 ror x0, x0, #1
| ffffa00011527ff4: 91000000 add x0, x0, #0x0
| ffffa00011527ff8: 91400000 add x0, x0, #0x0, lsl #12
| ffffa00011527ffc: 93c0fc00 ror x0, x0, #63
| ffffa00011528000: f9400000 ldr x0, [x0]
| ffffa00011528004: 910023e1 add x1, sp, #0x8
| ffffa00011528008: 92400000 and x0, x0, #0x1
| ffffa0001152800c: 93c00400 ror x0, x0, #1
| ffffa00011528010: 91000000 add x0, x0, #0x0
| ffffa00011528014: 91400000 add x0, x0, #0x0, lsl #12
| ffffa00011528018: 93c0fc00 ror x0, x0, #63
| ffffa0001152801c: 97ffff78 bl ffffa00011527dfc <__tlb_switch_>
| ffffa00011528020: d508871f tlbi vmalle1
| ffffa00011528024: d503201f nop
This looks like reserving x18 is causing Clang to not inline the
__kern_hyp_va() calls, losing the vitally important section information.
(I can see why the compiler thinks this is fair.)
Is this a known, er, thing, with clang-9?
From eyeballing the disassembly, __always_inline on __kern_hyp_va() is enough
of a hint to stop this... with this configuration of clang-9. But KVM still
doesn't work, so it isn't the only inlining decision KVM relies on that is
changed by SCS.
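
To illustrate the difference I mean, a minimal standalone sketch. The sketch_*
macros and functions are stand-ins I made up for this mail, not the kernel's
definitions, and the section attribute is only applied on ELF targets.

```c
/* Stand-ins for the kernel macros; these definitions are assumptions. */
#if defined(__ELF__)
#define sketch_hyp_text __attribute__((section(".hyp.text"), noinline))
#else
#define sketch_hyp_text __attribute__((noinline))
#endif
#define sketch_always_inline inline __attribute__((always_inline))

/*
 * Plain 'inline' is only a hint. If the compiler emits an out-of-line copy,
 * that copy goes in the default text section, not in .hyp.text.
 */
static inline unsigned long sketch_helper_hint(unsigned long v)
{
	return v & ~0xfffUL;
}

/* always_inline forces the body into the caller, so it shares its section. */
static sketch_always_inline unsigned long sketch_helper_forced(unsigned long v)
{
	return v | 0x1UL;
}

/* The caller is the only function explicitly placed in .hyp.text. */
unsigned long sketch_hyp_text sketch_hyp_caller(unsigned long v)
{
	return sketch_helper_forced(sketch_helper_hint(v));
}
```

Whether sketch_helper_hint() ends up out of line depends on the compiler
version and flags, which is exactly the problem: nm or objdump on the object
file shows where each symbol landed for a given build.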
I suspect repainting all KVM's 'inline' with __always_inline will fix it.
(yuck!) I'll try tomorrow.