Re: v5.15.57 regression - boot panic after retbleed backports with CONFIG_KPROBES_SANITY_TEST=y

On Fri, Aug 05, 2022 at 04:04:38PM -0400, Paul Gortmaker wrote:
> The panic comes from the sanity test code, but after trying to boil down the
> .config differences between the kitchen sink our test team uses, and a
> "defconfig", it seems there are at least a couple extra dependencies for
> creating a reproducer:
> 
>   make defconfig
>   echo CONFIG_FUNCTION_TRACER=y >> .config
>   echo CONFIG_KPROBES_SANITY_TEST=y >> .config
>   echo CONFIG_UNWINDER_FRAME_POINTER=y >> .config
>   yes "" | make oldconfig
> 
> Note that ftrace is probably just opening the door to CONFIG_KPROBES_ON_FTRACE=y
> 
> The report I got was with gcc-11 on an Atom; I was able to reproduce it
> with the default gcc-7 found on Ubuntu 18.04 and booting on a Xeon v2 -
> so it seems to not be specific to gcc options or processor features.
> 
> I don't know if the v5.15 backports were specifically tested to be fully
> bisectable, but if we assume they are, a bisect between 56 and 57 says:
> 
>    commit 1d61a2988612ac0632134454d5407c63ae0b9d42 (refs/bisect/bad)
>    Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>    Date:   Tue Jun 14 23:15:45 2022 +0200
>    
>        x86: Use return-thunk in asm code
>        
>        commit aa3d480315ba6c3025a60958e1981072ea37c3df upstream.
>        
>        Use the return thunk in asm code. If the thunk isn't needed, it will
>        get patched into a RET instruction during boot by apply_returns().
> 
> Splat follows:
> 
>    rcu: Hierarchical SRCU implementation.
>    Kprobe smoke test: started
>    BUG: unable to handle page fault for address: ffffffffc110f3e7
>    #PF: supervisor instruction fetch in kernel mode
>    #PF: error_code(0x0010) - not-present page
>    PGD b2c60f067 P4D b2c60f067 PUD b2c611067 PMD 0
>    Oops: 0010 [#1] SMP NOPTI
>    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.15.57 #33
>    Hardware name: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.02.06.E006.013120181511 01/31/2018
>    RIP: 0010:0xffffffffc110f3e7
>    Code: Unable to access opcode bytes at RIP 0xffffffffc110f3bd.
>    RSP: 0000:ffffae4bc006be38 EFLAGS: 00010246
>    RAX: ffffffffb973f310 RBX: 0000000000000000 RCX: 0000000000000000
>    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000005856e7bd
>    RBP: ffffae4bc006be60 R08: 0000000000000000 R09: 0000000000000001
>    R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
>    R13: ffffffffbae38560 R14: 0000000000000000 R15: 0000000000000000
>    FS:  0000000000000000(0000) GS:ffff8c92df800000(0000) knlGS:0000000000000000
>    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>    CR2: ffffffffc110f3bd CR3: 0000000b2c60c001 CR4: 00000000001706f0
>    Call Trace:
>     <TASK>
>     ? kprobe_target+0x5/0x20
>     ? init_test_probes+0x78/0x420
>     init_kprobes+0x16c/0x18e
>     ? init_optprobes+0x27/0x27
>     do_one_initcall+0x43/0x1d0
>     kernel_init_freeable+0xf1/0x240
>     ? rest_init+0xd0/0xd0
>     kernel_init+0x1a/0x120
>     ret_from_fork+0x1f/0x30
>     </TASK>
>    Modules linked in:
>    CR2: ffffffffc110f3e7
>    ---[ end trace 759f040622219261 ]---

Can you try the patch below?

Thanks.
Cascardo.

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 74c2f88a43d0..6bb479ce1ae4 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -321,12 +321,12 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
 	unsigned long offset;
 	unsigned long npages;
 	unsigned long size;
-	unsigned long retq;
 	unsigned long *ptr;
 	void *trampoline;
 	void *ip;
 	/* 48 8b 15 <offset> is movq <offset>(%rip), %rdx */
 	unsigned const char op_ref[] = { 0x48, 0x8b, 0x15 };
+	unsigned const char retq[] = { RET_INSN_OPCODE, INT3_INSN_OPCODE };
 	union ftrace_op_code_union op_ptr;
 	int ret;
 
@@ -364,15 +364,10 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
 		goto fail;
 
 	ip = trampoline + size;
-
-	/* The trampoline ends with ret(q) */
-	retq = (unsigned long)ftrace_stub;
 	if (cpu_feature_enabled(X86_FEATURE_RETHUNK))
 		memcpy(ip, text_gen_insn(JMP32_INSN_OPCODE, ip, &__x86_return_thunk), JMP32_INSN_SIZE);
 	else
-		ret = copy_from_kernel_nofault(ip, (void *)retq, RET_SIZE);
-	if (WARN_ON(ret < 0))
-		goto fail;
+		memcpy(ip, retq, sizeof(retq));
 
 	/* No need to test direct calls on created trampolines */
 	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
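
For readers following the diff, here is a minimal user-space sketch of the byte sequences the patched code writes at the end of the trampoline. It is an illustration, not kernel code: emit_tail() and its parameters are hypothetical names, the opcode values mirror the kernel's RET_INSN_OPCODE, INT3_INSN_OPCODE and JMP32_INSN_OPCODE definitions, and the rel32 displacement is computed by hand as a stand-in for what text_gen_insn() does for a JMP32.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Opcode values as defined for x86 in the kernel's text-patching header. */
#define RET_INSN_OPCODE   0xC3
#define INT3_INSN_OPCODE  0xCC
#define JMP32_INSN_OPCODE 0xE9
#define JMP32_INSN_SIZE   5

/*
 * Hypothetical helper (not a kernel function): write the trampoline tail
 * into buf, which is assumed to end up mapped at address ip_addr.
 * Returns the number of bytes written.
 */
static size_t emit_tail(uint8_t *buf, uint64_t ip_addr,
			int rethunk, uint64_t thunk_addr)
{
	/* Fixed two-byte tail: ret followed by int3, as in the patch. */
	static const uint8_t retq[] = { RET_INSN_OPCODE, INT3_INSN_OPCODE };

	if (rethunk) {
		/* jmp rel32: displacement is relative to the next instruction. */
		int32_t rel = (int32_t)(thunk_addr - (ip_addr + JMP32_INSN_SIZE));

		buf[0] = JMP32_INSN_OPCODE;
		/* Host byte order; little-endian on x86, matching the encoding. */
		memcpy(buf + 1, &rel, sizeof(rel));
		return JMP32_INSN_SIZE;
	}

	memcpy(buf, retq, sizeof(retq));
	return sizeof(retq);
}
```

With rethunk set to 0 the buffer starts with c3 cc; with rethunk set to 1 it starts with e9 followed by the 32-bit displacement to the thunk. In neither case are the bytes read from ftrace_stub, which is the behavior change the patch makes.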


