On Tue, 8 Jan 2019 11:31:01 +0100
Andrea Righi <righi.andrea@xxxxxxxxx> wrote:

> On Tue, Jan 08, 2019 at 01:43:55PM +0900, Masami Hiramatsu wrote:
> > Hello,
> >
> > This is the v2 series of patches fixing the kretprobe incorrect
> > stacking order. In this version, I fixed a missing kprobes.h include
> > and added a new patch for the kretprobe trampoline recursion issue
> > (and added Cc: stable).
> >
> > (1) kprobe incorrect stacking order problem
> >
> > After a recent talk with Andrea, I started a more precise investigation
> > of the kernel panic with kretprobes on notrace functions, which Francis
> > reported last year ( https://lkml.org/lkml/2017/7/14/466 ).
> >
> > See the investigation details in
> > https://lkml.kernel.org/r/154686789378.15479.2886543882215785247.stgit@devbox
> >
> > When we put a kretprobe on ftrace_ops_assist_func() and put another
> > kretprobe on probed-function, the following happens:
> >
> >  <caller>
> >   -><probed-function>
> >     ->fentry
> >      ->ftrace_ops_assist_func()
> >       ->int3
> >        ->kprobe_int3_handler()
> >        ...->pre_handler_kretprobe()
> >             push the return address (*fentry*) of ftrace_ops_assist_func() to
> >             top of the kretprobe list and replace it with kretprobe_trampoline.
> >        <-kprobe_int3_handler()
> >       <-(int3)
> >       ->kprobe_ftrace_handler()
> >       ...->pre_handler_kretprobe()
> >            push the return address (caller) of probed-function to top of the
> >            kretprobe list and replace it with kretprobe_trampoline.
> >       <-(kprobe_ftrace_handler())
> >      <-(ftrace_ops_assist_func())
> >      [kretprobe_trampoline]
> >       ->trampoline_handler()
> >         pop the return address (caller) from top of the kretprobe list
> >       <-(trampoline_handler())
> >      <caller>
> >      [run caller with incorrect stack information]
> >     <-(<caller>)
> >     !!KERNEL PANIC!!
> >
> > Therefore, this kernel panic happens only when we put 2 k*ret*probes on
> > ftrace_ops_assist_func() and other functions. If we put kprobes, it
> > doesn't cause any issue, since a kprobe doesn't change the return address.
> >
> > To fix (or just avoid) this issue, we can introduce a frame pointer
> > verification to skip wrong order entries. And I also would like to
> > blacklist those functions because they are part of the ftrace-based
> > kprobe handling routine.
> >
> > (2) kretprobe trampoline recursion problem
> >
> > This was found by Andrea in the previous thread
> > https://lkml.kernel.org/r/20190107183444.GA5966@xps-13
> >
> > ----
> >  echo "r:event_1 __fdget" >> kprobe_events
> >  echo "r:event_2 _raw_spin_lock_irqsave" >> kprobe_events
> >  echo 1 > events/kprobes/enable
> >  [DEADLOCK]
> > ----
> >
> > Because the kretprobe trampoline_handler uses a spinlock to protect
> > the hash table, probing the spinlock function itself causes a deadlock.
> > Thank you Andrea and Steve for discovering this root cause!!
> >
> > This bug was introduced with the asm-coded trampoline
> > code, since previously it used another kprobe for hooking
> > the function return placeholder (which only has a nop) and
> > the trampoline handler was called from that kprobe.
> >
> > To fix this bug, I introduced a dummy kprobe and set it in
> > current_kprobe as we did in the old days.
> >
> > Thank you,
>
> It all looks good to me; with this patch set I couldn't break the
> kernel in any way.
>
> Tested-by: Andrea Righi <righi.andrea@xxxxxxxxx>

Thank you, Andrea!

Ingo, could you pick this series?

--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>
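
For readers following the thread, the stacking order problem in (1) and the
frame pointer verification idea can be modelled in a few lines of userspace
Python. This is only a toy sketch, not the kernel patch: the Instance class,
the push/pop helpers and the FRAME_* values are all hypothetical names
invented for illustration, and the "frame" field merely stands in for
whatever the real code records to identify the stack slot whose return
address was replaced.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Toy stand-in for one saved kretprobe entry (hypothetical model)."""
    ret_addr: str  # real return address saved at function entry
    frame: int     # made-up id of the stack slot that was overwritten


# Made-up frame identifiers; only their being distinct matters here.
FRAME_PROBED = 0x1000  # return slot of <probed-function>
FRAME_ASSIST = 0x0F00  # return slot of ftrace_ops_assist_func()


def push(instances, ret_addr, frame):
    """Model pre_handler_kretprobe(): put the entry on top of the list."""
    instances.insert(0, Instance(ret_addr, frame))


def build_instances():
    """Push entries in the order shown in the call trace in (1)."""
    instances = []
    # int3 path: kretprobe on ftrace_ops_assist_func() fires first.
    push(instances, "fentry", FRAME_ASSIST)
    # ftrace path: kretprobe on probed-function fires second.
    push(instances, "caller", FRAME_PROBED)
    return instances


def pop_naive(instances):
    """Take whatever is on top, as in the failing scenario."""
    return instances.pop(0).ret_addr


def pop_verified(instances, frame):
    """Skip entries whose recorded frame does not match the returning one."""
    for index, inst in enumerate(instances):
        if inst.frame == frame:
            return instances.pop(index).ret_addr
    raise LookupError("no entry recorded for this frame")


def naive_returns():
    """ftrace_ops_assist_func() returns first, then probed-function."""
    instances = build_instances()
    return [pop_naive(instances), pop_naive(instances)]


def verified_returns():
    """Same return order, but each pop is checked against its frame."""
    instances = build_instances()
    return [
        pop_verified(instances, FRAME_ASSIST),
        pop_verified(instances, FRAME_PROBED),
    ]
```

In this model naive_returns() hands "caller" to ftrace_ops_assist_func(),
which is the wrong-order pop described in the trace, while verified_returns()
gives each function back its own return address.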