On Wed, May 29, 2024 at 01:12:06PM +0100, Pierre-Clément Tosi wrote:
> CONFIG_CFI_CLANG ("kernel Control Flow Integrity") makes the compiler inject
> runtime type checks before any indirect function call. On AArch64, it generates
> a BRK instruction to be executed on type mismatch and encodes the indices of the
> registers holding the branch target and expected type in the immediate of the
> instruction. As a result, a synchronous exception gets triggered on kCFI failure
> and the fault handler can retrieve the immediate (and indices) from ESR_ELx.
>
> This feature has been supported at EL1 ("host") since it was introduced by
> b26e484b8bb3 ("arm64: Add CFI error handling"), where cfi_handler() decodes
> ESR_EL1, giving informative panic messages such as
>
>   [ 21.885179] CFI failure at lkdtm_indirect_call+0x2c/0x44 [lkdtm]
>   (target: lkdtm_increment_int+0x0/0x1c [lkdtm]; expected type: 0x7e0c52a)
>   [ 21.886593] Internal error: Oops - CFI: 0 [#1] PREEMPT SMP
>
> However, it is not or only partially supported at EL2: in nVHE (or pKVM),
> CONFIG_CFI_CLANG gets filtered out at build time, preventing the compiler from
> injecting the checks. In VHE, EL2 code gets compiled with the checks but the
> handlers in VBAR_EL2 are not aware of kCFI and will produce a generic and
> not-so-helpful panic message such as
>
>   [ 36.456088][ T200] Kernel panic - not syncing: HYP panic:
>   [ 36.456088][ T200] PS:204003c9 PC:ffffffc080092310 ESR:f2008228
>   [ 36.456088][ T200] FAR:0000000081a50000 HPFAR:000000000081a500 PAR:1de7ec7edbadc0de
>   [ 36.456088][ T200] VCPU:00000000e189c7cf
>
> To address this,
>
> - [01/13] fixes an existing bug where the ELR_EL2 was getting clobbered on
>   synchronous exceptions, causing the wrong "PC" to be reported by
>   nvhe_hyp_panic_handler() or __hyp_call_panic(). This is particularly limiting
>   for kCFI, as it would mask the location of the failed type check.
> - [02/13] fixes a minor C/asm ABI mismatch which would trigger a kCFI failure
> - [03/13] to [09/13] prepare nVHE for CONFIG_CFI_CLANG and [10/13] enables it
> - [11/13] improves kCFI error messages by saving then parsing the CPU context
> - [12/13] adds a kCFI test module for VHE and [13/13] extends it to nVHE & pKVM
>
> As a result, an informative kCFI panic message is printed by or on behalf of EL2
> giving the expected type and target address (possibly resolved to a symbol) for
> VHE, nVHE, and pKVM (iff CONFIG_NVHE_EL2_DEBUG=y).
>
> Note that kCFI errors remain fatal at EL2, even when CONFIG_CFI_PERMISSIVE=y.
>
> Changes in v4:
> - Addressed Will's comments on v3:

nit: but please keep reviewers on CC when you post a new version. I missed
this initially.

Will
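
For reference, the BRK immediate decoding that the quoted cover letter describes can be sketched in standalone C. This is only an illustration and not code from the series: the macro and function names are hypothetical, and the layout is an assumption (immediate base 0x8000, target register index in bits [4:0], type register index in bits [9:5], BRK exception class 0x3c in ESR bits [31:26]).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the values are assumptions about the arm64 kCFI encoding. */
#define ESR_EC_SHIFT      26
#define ESR_EC_MASK       0x3fu
#define ESR_EC_BRK64      0x3cu    /* BRK instruction executed in AArch64 state */
#define ESR_BRK_IMM_MASK  0xffffu  /* BRK imm16 is reported in ISS[15:0] */

#define KCFI_BRK_BASE     0x8000u  /* assumed base value of kCFI immediates */
#define KCFI_BRK_TARGET   0x001fu  /* bits [4:0]: index of the register holding the target */
#define KCFI_BRK_TYPE     0x03e0u  /* bits [9:5]: index of the register holding the type */

struct kcfi_regs {
	unsigned int target; /* register index of the branch target */
	unsigned int type;   /* register index of the expected type */
};

/* Return true and fill *out if esr describes a kCFI BRK, false otherwise. */
static bool kcfi_decode_esr(uint64_t esr, struct kcfi_regs *out)
{
	uint32_t ec = (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
	uint32_t imm = esr & ESR_BRK_IMM_MASK;

	if (ec != ESR_EC_BRK64)
		return false;

	/* Everything outside the two index fields must match the base value. */
	if ((imm & ~(KCFI_BRK_TARGET | KCFI_BRK_TYPE)) != KCFI_BRK_BASE)
		return false;

	out->target = imm & KCFI_BRK_TARGET;
	out->type = (imm & KCFI_BRK_TYPE) >> 5;
	return true;
}
```

Under these assumptions, the ESR:f2008228 value from the VHE panic quoted above would decode as a kCFI BRK with the target in register 8 and the expected type in register 17; a handler would then read those registers from the saved CPU context to build the message.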