[PATCH v2 0/3] kprobes: Fix kretprobe issues

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

This is v2 series of fixing kretprobe incorrect stacking order patches.
In this version, I fixed a lack of kprobes.h including and added new
patch for kretprobe trampoline recursion issue. (and add Cc:stable)

(1) kprobe incorrct stacking order problem

On recent talk with Andrea, I started more precise investigation on
the kernel panic with kretprobes on notrace functions, which Francis
had been reported last year ( https://lkml.org/lkml/2017/7/14/466 ).

See the investigation details in 
https://lkml.kernel.org/r/154686789378.15479.2886543882215785247.stgit@devbox

When we put a kretprobe on ftrace_ops_assist_func() and put another
kretprobe on probed-function, below happens

<caller>
 -><probed-function>
   ->fentry
    ->ftrace_ops_assist_func()
     ->int3
      ->kprobe_int3_handler()
      ...->pre_handler_kretprobe()
           push the return address (*fentry*) of ftrace_ops_assist_func() to
           top of the kretprobe list and replace it with kretprobe_trampoline.
      <-kprobe_int3_handler()
     <-(int3)
     ->kprobe_ftrace_handler()
      ...->pre_handler_kretprobe()
           push the return address (caller) of probed-function to top of the
           kretprobe list and replace it with kretprobe_trampoline.
     <-(kprobe_ftrace_handler())
    <-(ftrace_ops_assist_func())
    [kretprobe_trampoline]
     ->tampoline_handler()
       pop the return address (caller) from top of the kretprobe list
     <-(trampoline_handler())
    <caller>
    [run caller with incorrect stack information]
   <-(<caller>)
  !!KERNEL PANIC!!

Therefore, this kernel panic happens only when we put 2 k*ret*probes on
ftrace_ops_assist_func() and other functions. If we put kprobes, it
doesn't cause any issue, since it doesn't change the return address.

To fix (or just avoid) this issue, we can introduce a frame pointer
verification to skip wrong order entries. And I also would like to
blacklist those functions because those are part of ftrace-based 
kprobe handling routine.

(2) kretprobe trampoline recursion problem

This was found by Andrea in the previous thread
https://lkml.kernel.org/r/20190107183444.GA5966@xps-13

----
 echo "r:event_1 __fdget" >> kprobe_events
 echo "r:event_2 _raw_spin_lock_irqsave" >> kprobe_events
 echo 1 > events/kprobes/enable
 [DEADLOCK]
----

Because kretprobe trampoline_handler uses spinlock for protecting
hash table, if we probe the spinlock itself, it causes deadlock.
Thank you Andrea and Steve for discovering this root cause!!

This bug has been introduced with the asm-coded trampoline
code, since previously it used another kprobe for hooking
the function return placeholder (which only has a nop) and
trampoline handler was called from that kprobe.

To fix this bug, I introduced a dummy kprobe and set it in
current_kprobe as we did in old days.

Thank you,

---

Masami Hiramatsu (3):
      x86/kprobes: Verify stack frame on kretprobe
      kprobes: Mark ftrace mcount handler functions nokprobe
      x86/kprobes: Fix to avoid kretprobe recursion


 arch/x86/kernel/kprobes/core.c |   48 ++++++++++++++++++++++++++++++++++++++--
 include/linux/kprobes.h        |    1 +
 kernel/trace/ftrace.c          |    6 ++++-
 3 files changed, 52 insertions(+), 3 deletions(-)

--
Masami Hiramatsu (Linaro) <mhiramat@xxxxxxxxxx>



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux