On Fri, Jan 29, 2021 at 8:24 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Fri, Jan 29, 2021 at 10:59:52AM -0500, Steven Rostedt wrote:
> > On Fri, 29 Jan 2021 22:40:11 +0900
> > Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> >
> > > > So what, they can all happen with random locks held. Marking them as NMI
> > > > enables a whole bunch of sanity checks that are entirely appropriate.
> > >
> > > How about introducing an idea of Asynchronous NMI (ANMI) and Synchronous
> > > NMI (SNMI)? kprobes and ftrace is synchronously called and can be controlled
> > > (we can expect the context) but ANMI may be caused by asynchronous
> > > hardware events on any context.
> > >
> > > If we can distinguish those 2 NMIs on preempt count, bpf people can easily
> > > avoid the inevitable situation.
> >
> > I don't like the name NMI IN SNMI, because they are not NMIs. They are
> > actually more like kernel exceptions. Even page faults in the kernel is
> > similar to a kprobe breakpoint or ftrace. It can happen anywhere, with any
> > lock held. Perhaps we need a kernel exception context? Which by definition
> > is synchronous.

I like 'kernel exception' name. SNMI doesn't sound right. There is nothing
'non maskable' here.

>
> What problem are you trying to solve? AFAICT all these contexts have the
> same restrictions, why try and muck about with different names for the
> same thing?

from core kernel perspective the difference between 'kernel exception'
and true NMI is huge:
  this_cpu vs __this_cpu
  static checks vs runtime checks

Same things apply to bpf side. We can statically prove safety for
ftrace and kprobe attaching whereas to deal with NMI situation we
have to use run-time checks for recursion prevention, etc.
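
To make the "run-time checks for recursion prevention" point concrete, here is a rough userspace sketch of the kind of guard I mean. It is not kernel code: the thread-local counter only stands in for a per-CPU variable, and every name in it (prog_active, run_guarded, demo_handler, demo_ctx) is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a per-CPU "handler active" counter (assumption:
 * one thread here plays the role of one CPU). */
static _Thread_local int prog_active;

typedef void (*handler_fn)(void *ctx);

/*
 * Run fn(ctx) unless a handler is already running in this context.
 * Returns true if the handler ran, false if it was skipped because
 * the event fired again while a handler was still in progress.
 */
static bool run_guarded(handler_fn fn, void *ctx)
{
	bool ran = false;

	if (++prog_active == 1) {	/* first entry: safe to run */
		fn(ctx);
		ran = true;
	}
	assert(prog_active > 0);
	--prog_active;			/* always undo our increment */
	return ran;
}

/* Hypothetical bookkeeping for the demonstration below. */
struct demo_ctx {
	int runs;
	int skipped;
};

/* A handler that triggers the same event again from inside itself,
 * the way an NMI could land in the middle of a running program. */
static void demo_handler(void *p)
{
	struct demo_ctx *c = p;

	c->runs++;
	if (!run_guarded(demo_handler, c))
		c->skipped++;
}
```

Calling run_guarded(demo_handler, &c) on a zeroed demo_ctx runs the handler once and records one skipped nested attempt. The point is that this check costs something on every invocation, which is what a statically provable attach point lets us avoid.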