On Mon, Jul 18, 2022 at 01:41:37PM +0200, Peter Zijlstra wrote:
> On Fri, Jul 15, 2022 at 04:45:50PM -0300, Thadeu Lima de Souza Cascardo wrote:
> > When running with return thunks enabled under 32-bit EFI, the system
> > crashes with:
> >
> > [ 0.137688] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
> > [ 0.138136] BUG: unable to handle page fault for address: 000000005bc02900
> > [ 0.138136] #PF: supervisor instruction fetch in kernel mode
> > [ 0.138136] #PF: error_code(0x0011) - permissions violation
> > [ 0.138136] PGD 18f7063 P4D 18f7063 PUD 18ff063 PMD 190e063 PTE 800000005bc02063
> > [ 0.138136] Oops: 0011 [#1] PREEMPT SMP PTI
> > [ 0.138136] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc6+ #166
> > [ 0.138136] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> > [ 0.138136] RIP: 0010:0x5bc02900
> > [ 0.138136] Code: Unable to access opcode bytes at RIP 0x5bc028d6.
> > [ 0.138136] RSP: 0018:ffffffffb3203e10 EFLAGS: 00010046
> > [ 0.138136] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000048
> > [ 0.138136] RDX: 000000000190dfac RSI: 0000000000001710 RDI: 000000007eae823b
> > [ 0.138136] RBP: ffffffffb3203e70 R08: 0000000001970000 R09: ffffffffb3203e28
> > [ 0.138136] R10: 747563657865206c R11: 6c6977203a696665 R12: 0000000000001710
> > [ 0.138136] R13: 0000000000000030 R14: 0000000001970000 R15: 0000000000000001
> > [ 0.138136] FS: 0000000000000000(0000) GS:ffff8e013ca00000(0000) knlGS:0000000000000000
> > [ 0.138136] CS: 0010 DS: 0018 ES: 0018 CR0: 0000000080050033
> > [ 0.138136] CR2: 000000005bc02900 CR3: 0000000001930000 CR4: 00000000000006f0
> > [ 0.138136] Call Trace:
> > [ 0.138136] <TASK>
> > [ 0.138136] ? efi_set_virtual_address_map+0x9c/0x175
> > [ 0.138136] efi_enter_virtual_mode+0x4a6/0x53e
> > [ 0.138136] start_kernel+0x67c/0x71e
> > [ 0.138136] x86_64_start_reservations+0x24/0x2a
> > [ 0.138136] x86_64_start_kernel+0xe9/0xf4
> > [ 0.138136] secondary_startup_64_no_verify+0xe5/0xeb
> > [ 0.138136] </TASK>
> >
> > That's because it cannot jump to the return thunk from the 32-bit code.
> > Using a naked RET and marking it as safe allows the system to proceed
> > booting.
> >
> > Fixes: aa3d480315ba ("x86: Use return-thunk in asm code")
> > Reported-by: Guenter Roeck <linux@xxxxxxxxxxxx>
> > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@xxxxxxxxxxxxx>
> > Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > Cc: Borislav Petkov <bp@xxxxxxx>
> > Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
> > Cc: <stable@xxxxxxxxxxxxxxx>
> > ---
> >
> > Does this leave one potential attack vector open? Perhaps, since this is
> > running under a different mapping (AFAIU), the risk is reduced? Or rather, the
> > attacker could attack using the firmware RETs anyway?
> >
> > Alternatively, we could use IBPB when available when using the wrapper.
> >
> > Thoughts?
>
> What actual uarch are you running this on? Is this AMD hardware?
>
> For Intel we'll enable IBRS for firmware if it is not otherwise enabled
> (upstream will always enable IBRS for the SKL family chips, but Thomas
> just posted the retbleed=stuff approach yesterday that will not)
>
> On AMD I think you're stuck with IBPB, but that has to be issued
> *before* calling the firmware muck.
>
> In either case, I think the patch as proposed is fine. But perhaps we
> want something like the below on top.
>

I was testing this on Intel, because Guenter Roeck had reported such failures
when booting with 32-bit EFI. My patch fixes the boot problem, but then I asked
whether it would leave us vulnerable and, then, I was thinking about AMD
mostly, as you pointed out.

And I think you nailed what I had in mind for using IBPB when doing firmware
calls, and perhaps this is wanted even when we ignore this naked RET here.

There is a typo in your patch below, but I will give it a try and see if it
doesn't blow up on AMD systems without IBPB (by way of emulation).

Thanks.
Cascardo.

> ---
> Subject: x86/amd: Use IBPB for firmware calls
>
> On AMD IBRS does not prevent Retbleed; as such use IBPB before a
> firmware call to flush the branch history state.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
>  arch/x86/include/asm/cpufeatures.h   |  1 +
>  arch/x86/include/asm/nospec-branch.h |  2 ++
>  arch/x86/kernel/cpu/bugs.c           | 11 ++++++++++-
>  3 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 00f5227c8459..a77b915d36a8 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -302,6 +302,7 @@
>  #define X86_FEATURE_RETPOLINE_LFENCE	(11*32+13) /* "" Use LFENCE for Spectre variant 2 */
>  #define X86_FEATURE_RETHUNK		(11*32+14) /* "" Use REturn THUNK */
>  #define X86_FEATURE_UNRET		(11*32+15) /* "" AMD BTB untrain return */
> +#define X86_FEATURE_USE_IBPB_FW		(11*32+16) /* "" Use IBPB during runtime firmware calls */
>
>  /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
>  #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI instructions */
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index 10a3bfc1eb23..f934dcdb7c0d 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -297,6 +297,8 @@ do {									\
>  	alternative_msr_write(MSR_IA32_SPEC_CTRL,			\
>  			      spec_ctrl_current() | SPEC_CTRL_IBRS,	\
>  			      X86_FEATURE_USE_IBRS_FW);			\
> +	altnerative_msr_write(MSR_IA32_PRED_CMD, PRED_CMD_IBPB,		\
> +			      X86_FEATURE_USE_IBPB_FW);			\
>  } while (0)
>
>  #define firmware_restrict_branch_speculation_end()			\
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index aa34f908c39f..78c9082242a9 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1516,7 +1516,16 @@ static void __init spectre_v2_select_mitigation(void)
>  	 * the CPU supports Enhanced IBRS, kernel might un-intentionally not
>  	 * enable IBRS around firmware calls.
>  	 */
> -	if (boot_cpu_has(X86_FEATURE_IBRS) && !spectre_v2_in_ibrs_mode(mode)) {
> +	if (boot_cpu_has_bug(X86_BUG_RETBLEED) &&
> +	    (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
> +	     boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)) {
> +
> +		if (retbleed_cmd != RETBLEED_CMD_IBPB) {
> +			setup_force_cpu_cap(X86_FEATURE_USE_IBPB_FW);
> +			pr_info("Enabling Speculation Barrier for firmware calls\n");
> +		}
> +
> +	} else if (boot_cpu_has(X86_FEATURE_IBRS) && !spectre_v2_in_ibrs_mode(mode)) {
>  		setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
>  		pr_info("Enabling Restricted Speculation for firmware calls\n");
>  	}
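
For anyone who wants to poke at the selection logic in the bugs.c hunk without
booting a kernel, here is a rough userspace sketch of the same decision. It is
only a model under stated assumptions: struct cpu_state, enum fw_mitigation and
select_fw_mitigation() are hypothetical names made up for illustration and do
not exist in the kernel; each field merely stands in for the kernel predicate
named in its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; none of these names are kernel API. */
enum vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_HYGON };

enum fw_mitigation {
	FW_NONE,	/* neither firmware feature flag is forced */
	FW_USE_IBRS,	/* models X86_FEATURE_USE_IBRS_FW */
	FW_USE_IBPB,	/* models X86_FEATURE_USE_IBPB_FW */
};

struct cpu_state {
	enum vendor vendor;	/* models boot_cpu_data.x86_vendor */
	bool has_retbleed_bug;	/* models boot_cpu_has_bug(X86_BUG_RETBLEED) */
	bool retbleed_cmd_ibpb;	/* models retbleed_cmd == RETBLEED_CMD_IBPB */
	bool has_ibrs;		/* models boot_cpu_has(X86_FEATURE_IBRS) */
	bool in_ibrs_mode;	/* models spectre_v2_in_ibrs_mode(mode) */
};

/* Mirrors the if / else-if structure of the proposed bugs.c change. */
static enum fw_mitigation select_fw_mitigation(const struct cpu_state *c)
{
	assert(c != NULL);

	if (c->has_retbleed_bug &&
	    (c->vendor == VENDOR_AMD || c->vendor == VENDOR_HYGON)) {
		/*
		 * As in the patch, this branch never falls through to the
		 * IBRS case: with retbleed=ibpb no firmware flag is set.
		 */
		return c->retbleed_cmd_ibpb ? FW_NONE : FW_USE_IBPB;
	}

	if (c->has_ibrs && !c->in_ibrs_mode)
		return FW_USE_IBRS;

	return FW_NONE;
}
```

In this model, an AMD or Hygon part affected by Retbleed gets FW_USE_IBPB unless
retbleed=ibpb was requested, in which case it gets FW_NONE even if IBRS is
available, because the else-if is skipped. Everything else keeps the existing
behaviour: FW_USE_IBRS when IBRS is present and not already in use.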