Christophe Leroy <christophe.leroy@xxxxxxxxxx> writes: > Le 01/05/2024 à 18:29, Stephen Brennan a écrit : >> If an error happens in ftrace, ftrace_kill() will prevent disarming >> kprobes. Eventually, the ftrace_ops associated with the kprobes will be >> freed, yet the kprobes will still be active, and when triggered, they >> will use the freed memory, likely resulting in a page fault and panic. >> >> This behavior can be reproduced quite easily, by creating a kprobe and >> then triggering a ftrace_kill(). For simplicity, we can simulate an >> ftrace error with a kernel module like [1]: >> >> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer >> >> sudo perf probe --add commit_creds >> sudo perf trace -e probe:commit_creds >> # In another terminal >> make >> sudo insmod ftrace_killer.ko # calls ftrace_kill(), simulating bug >> # Back to perf terminal >> # ctrl-c >> sudo perf probe --del commit_creds >> >> After a short period, a page fault and panic would occur as the kprobe >> continues to execute and uses the freed ftrace_ops. While ftrace_kill() >> is supposed to be used only in extreme circumstances, it is invoked in >> FTRACE_WARN_ON() and so there are many places where an unexpected bug >> could be triggered, yet the system may continue operating, possibly >> without the administrator noticing. If ftrace_kill() does not panic the >> system, then we should do everything we can to continue operating, >> rather than leave a ticking time bomb. >> >> Signed-off-by: Stephen Brennan <stephen.s.brennan@xxxxxxxxxx> >> --- >> Changes in v3: >> Don't expose ftrace_is_dead(). Create a "kprobe_ftrace_disabled" >> variable and check it directly in the kprobe handlers. > > Isn't it safer to provide a fonction rather than a direct access to a > variable ? Is the concern that other code could modify this variable? If so, then I suppose the function call is safer. But the variable is not exported and I think built-in code can be trusted not to muck with it. Maybe I'm missing your point about safety though? > By the way, wouldn't it be more performant to use a static branch (jump > label) ? I agree with Steven's concern that text modification would unfortunately not be a good way to handle an error in text modification. Especially, I believe there could be deadlock risks, as static key enablement requires taking the text_mutex and the jump_label_mutex. I'd be concerned that the text_mutex could already be held in some situations where ftrace_kill() is called. But I'm not certain about that. Thanks for taking a look! Stephen