On Tue, Jul 9, 2024 at 3:11 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Mon, Jul 08, 2024 at 04:11:27PM -0700, Andrii Nakryiko wrote:
> > +#ifdef CONFIG_UPROBES
> > +/*
> > + * Heuristic-based check if uprobe is installed at the function entry.
> > + *
> > + * Under assumption of user code being compiled with frame pointers,
> > + * `push %rbp/%ebp` is a good indicator that we indeed are.
> > + *
> > + * Similarly, `endbr64` (assuming 64-bit mode) is also a common pattern.
> > + * If we get this wrong, captured stack trace might have one extra bogus
> > + * entry, but the rest of stack trace will still be meaningful.
> > + */
> > +static bool is_uprobe_at_func_entry(struct pt_regs *regs)
> > +{
> > +	struct arch_uprobe *auprobe;
> > +
> > +	if (!current->utask)
> > +		return false;
> > +
> > +	auprobe = current->utask->auprobe;
> > +	if (!auprobe)
> > +		return false;
> > +
> > +	/* push %rbp/%ebp */
> > +	if (auprobe->insn[0] == 0x55)
> > +		return true;
> > +
> > +	/* endbr64 (64-bit only) */
> > +	if (user_64bit_mode(regs) && *(u32 *)auprobe->insn == 0xfa1e0ff3)
> > +		return true;
>
> I meant to reply to Josh suggesting this, but... how can this be? If you
> scribble the ENDBR with an INT3 things will #CP and we'll never get to
> the #BP.

Well, it seems like it works in practice, I just tried.
Here's the disassembly of the function:

00000000000019d0 <urandlib_api_v1>:
    19d0: f3 0f 1e fa                   endbr64
    19d4: 55                            pushq   %rbp
    19d5: 48 89 e5                      movq    %rsp, %rbp
    19d8: 48 83 ec 10                   subq    $0x10, %rsp
    19dc: 48 8d 3d fe ed ff ff          leaq    -0x1202(%rip), %rdi     # 0x7e1 <__isoc99_scanf+0x7e1>
    19e3: 48 8d 75 fc                   leaq    -0x4(%rbp), %rsi
    19e7: b0 00                         movb    $0x0, %al
    19e9: e8 f2 00 00 00                callq   0x1ae0 <__isoc99_scanf+0x1ae0>
    19ee: b8 01 00 00 00                movl    $0x1, %eax
    19f3: 48 83 c4 10                   addq    $0x10, %rsp
    19f7: 5d                            popq    %rbp
    19f8: c3                            retq
    19f9: 0f 1f 80 00 00 00 00          nopl    (%rax)

And here's the state when the uprobe is attached:

(gdb) disass/r urandlib_api_v1
Dump of assembler code for function urandlib_api_v1:
   0x00007ffb734e39d0 <+0>:     cc                      int3
   0x00007ffb734e39d1 <+1>:     0f 1e fa                nop    %edx
   0x00007ffb734e39d4 <+4>:     55                      push   %rbp
   0x00007ffb734e39d5 <+5>:     48 89 e5                mov    %rsp,%rbp
   0x00007ffb734e39d8 <+8>:     48 83 ec 10             sub    $0x10,%rsp
   0x00007ffb734e39dc <+12>:    48 8d 3d fe ed ff ff    lea    -0x1202(%rip),%rdi        # 0x7ffb734e27e1
   0x00007ffb734e39e3 <+19>:    48 8d 75 fc             lea    -0x4(%rbp),%rsi
=> 0x00007ffb734e39e7 <+23>:    b0 00                   mov    $0x0,%al
   0x00007ffb734e39e9 <+25>:    e8 f2 00 00 00          call   0x7ffb734e3ae0 <__isoc99_scanf@plt>
   0x00007ffb734e39ee <+30>:    b8 01 00 00 00          mov    $0x1,%eax
   0x00007ffb734e39f3 <+35>:    48 83 c4 10             add    $0x10,%rsp
   0x00007ffb734e39f7 <+39>:    5d                      pop    %rbp
   0x00007ffb734e39f8 <+40>:    c3                      ret

You can see it replaced the first byte, the following 3 bytes are
remnants of endbr64 (gdb says it's a nop? :)), and then we proceeded;
you can see I stepped through a few more instructions. Works by
accident?

But either way, if we prevent uprobes from being placed on endbr64,
that will essentially break uprobes for any code that is compiled with
endbr64 (-fcf-protection=branch), which is very not great (I suspect
most people that care would just disable that option in such a case).

>
> Also, we tried very hard to not have a literal encode ENDBR (I really
> should teach objtool about this one :/). If it somehow makes sense to
> keep this clause, please use: gen_endbr()

I'll just use is_endbr(), no problem.
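
For anyone following along, below is a minimal userspace sketch of the
byte checks from the quoted heuristic. It is not the kernel code: the
names looks_like_func_entry(), load_le32() and self_check() are made up
for illustration, and it deliberately spells out the literal ENDBR64
constant, which the kernel side avoids (hence gen_endbr()/is_endbr()).
It assumes the check runs on the saved copy of the original instruction
bytes, as auprobe->insn does in the quoted patch, rather than on live
memory, where the first byte is already int3 (0xcc).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these names are kernel API. */
#define PUSH_BP_OPCODE	0x55		/* push %rbp / push %ebp */
#define ENDBR64_LE32	0xfa1e0ff3u	/* bytes f3 0f 1e fa as a little-endian u32 */

/* Read 4 bytes as a little-endian u32 without alignment or aliasing issues. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/*
 * Same two checks as the quoted heuristic, applied to a caller-supplied
 * buffer holding at least 4 bytes of the ORIGINAL (pre-int3) instruction.
 */
static bool looks_like_func_entry(const uint8_t *insn, bool is_64bit)
{
	/* push %rbp/%ebp */
	if (insn[0] == PUSH_BP_OPCODE)
		return true;

	/* endbr64 (64-bit only) */
	if (is_64bit && load_le32(insn) == ENDBR64_LE32)
		return true;

	return false;
}

/* Usage example based on the bytes shown in the dumps above. */
void self_check(void)
{
	const uint8_t saved[] = { 0xf3, 0x0f, 0x1e, 0xfa };	/* original endbr64 */
	const uint8_t live[]  = { 0xcc, 0x0f, 0x1e, 0xfa };	/* after int3 is written */
	const uint8_t push[]  = { 0x55, 0x48, 0x89, 0xe5 };	/* push %rbp; mov %rsp,%rbp */

	assert(looks_like_func_entry(saved, true));
	assert(!looks_like_func_entry(live, true));
	assert(looks_like_func_entry(push, true));
}
```

The second assertion is the reason the heuristic has to look at the
saved instruction copy: once the breakpoint is in place, the bytes in
the process's memory no longer match either pattern.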