On Mon, Sep 13, 2021 at 12:24:24PM +0200, Peter Zijlstra wrote:

FWIW:

> I'm confused tho; where does the #DF come from? Because taking a #PF
> from NMI should be perfectly fine.
>
> AFAICT that callchain is something like:
>
>   NMI
>     perf_event_nmi_handler()
>       (part of the chain is missing here)
>         perf_log_throttle()
>           perf_output_begin() /* events/ring_buffer.c */
>             rcu_read_lock()
>               rcu_lock_acquire()
>                 lock_acquire()
>                   trace_lock_acquire() --> perf_trace_foo

This function also calls perf_trace_buf_alloc(), and will have
incremented the recursion count, such that:

>
>                     ...
>                       perf_callchain()
>                         perf_callchain_user()
>                           #PF (fully expected during a userspace callchain)
>                             (some stuff, until the first __fentry)
>                               perf_trace_function_call
>                                 perf_trace_buf_alloc()
>                                   perf_swevent_get_recursion_context()
>                                     *BOOM*

this one, if it wouldn't mysteriously explode, would find recursion and
terminate, except that seems to be going side-ways.

> Now, supposedly we then take another #PF from get_recursion_context() or
> something, but that doesn't make sense. That should just work...
>
> Can you figure out what's going wrong there? going with the RIP, this
> almost looks like 'swhash->recursion' goes splat, but again that makes
> no sense, that's a per-cpu variable.
>
>
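
For anyone following along, here is a rough userspace model of the recursion check discussed above. It is a sketch, not the kernel code: all the model_* names are made up, the context level is passed in explicitly (the kernel derives it from the current context), and a single global array stands in for the per-cpu 'swhash->recursion'. It assumes the nested hit from the #PF path still counts as NMI context.

```c
/*
 * Userspace model of the perf recursion guard. NOT the kernel code:
 * every model_* name is hypothetical and exists only for illustration.
 */
enum model_ctx {
	MODEL_TASK,
	MODEL_SOFTIRQ,
	MODEL_HARDIRQ,
	MODEL_NMI,
	MODEL_NR_CTX
};

/* Stands in for the per-cpu swhash->recursion[] (one CPU modelled). */
static int model_recursion[MODEL_NR_CTX];

/* Returns the context index on success, -1 if this context is already in use. */
static int model_get_recursion_context(int *recursion, enum model_ctx ctx)
{
	if (recursion[ctx])
		return -1;	/* recursion detected, caller must bail out */

	recursion[ctx]++;
	return ctx;
}

/* Releases a context previously returned by model_get_recursion_context(). */
static void model_put_recursion_context(int *recursion, int rctx)
{
	recursion[rctx]--;
}

/*
 * Models two nested tracepoint hits in NMI context: the outer one from
 * trace_lock_acquire(), the inner one from the function trace hit in
 * the #PF path. Returns the result of the inner attempt.
 */
static int model_nested_nmi_trace(void)
{
	int outer, inner;

	outer = model_get_recursion_context(model_recursion, MODEL_NMI);
	if (outer < 0)
		return outer;

	/* Second hit while the first is still in flight. */
	inner = model_get_recursion_context(model_recursion, MODEL_NMI);
	if (inner >= 0)
		model_put_recursion_context(model_recursion, inner);

	model_put_recursion_context(model_recursion, outer);
	return inner;
}
```

In this model the inner attempt returns -1 and the counter is back at zero afterwards, which is the "find recursion and terminate" behaviour one would expect if nothing blew up first.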