On 12/4/18 8:16 PM, Steven Rostedt wrote:
> Yes, it's a simple fix. The problem is that the recursion detection of
> the function tracer requires that when its called from interrupt, the
> "in_interrupt" needs to be true, otherwise it thinks that the function
> tracer is recursing on itself (which is common).
>
> Looking an the dropped events, and the code in __irq_enter() we have
> this:
>
> #define __irq_enter()					\
> 	do {						\
> 		account_irq_enter_time(current);	\
> 		preempt_count_add(HARDIRQ_OFFSET);	\ <<-- in_interrupt() returns true here
> 		trace_hardirq_enter();			\
> 	} while (0)
>
> Interesting enough, the dropped events happen to be in
> account_irq_enter_time()!
>
> Thus what I believe is happening is that an interrupt came in while one
> event was being recorded. When account_irq_enter_time was called, the
> function tracer noticed that its recursion bit for the current context
> was already set, and just dropped the event because it thought it was
> just tracing itself. After we add HARDIRQ_OFFSET to preempt_count, the
> "in_interrupt()" will be set and the function tracer will know its in a
> new context where its safe to continue tracing.
>
> Can you try this patch to see if it fixes it for you?

Hi Steve,

I finally took some time to play with the patch, sorry for the delay. I
get the idea of the patch, but it is not working as expected :-(.

With the patch enabled, the system [a VM with 1 CPU] mostly freezes when
I run this:

  # while [ 1 ]; do echo > /dev/null; done &

I still need to investigate why.

The other point: as I understand it, the patch would make
account_irq_enter_time() show up in the trace. But, as far as I can
tell, it would still not trace do_IRQ(). Right?

Wouldn't this be a case for using a per-cpu variable, setting the flag
right at the beginning of the handler (in the entry*.S)?

Thoughts?

-- Daniel
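
For reference, the drop Steven describes can be reproduced in a small user-space model. This is only a sketch under simplifying assumptions, not the kernel's code: struct tracer_model, model_trace(), model_irq_entry() and model_run() are hypothetical names, only the hardirq bits of preempt_count are modelled, and the constant values are illustrative. It keeps one recursion bit per context and picks the context from the counter, so a traced call made before HARDIRQ_OFFSET is added is mistaken for recursion.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins: the names mirror the kernel's, but the values
 * and layout are simplified for this model. */
#define HARDIRQ_OFFSET 0x10000u
#define HARDIRQ_MASK   0xf0000u

enum trace_ctx { CTX_NORMAL, CTX_IRQ };

struct tracer_model {
	unsigned int preempt_count;   /* models the preempt counter */
	unsigned int recursion_bits;  /* one "already tracing" bit per context */
	int recorded;                 /* events that made it into the trace */
	int dropped;                  /* events rejected as recursion */
};

typedef void (*irq_fn)(struct tracer_model *);

/* Simplified in_interrupt(): only the hardirq bits are considered. */
static bool model_in_interrupt(const struct tracer_model *t)
{
	return (t->preempt_count & HARDIRQ_MASK) != 0;
}

/*
 * Record one event. If irq is non-NULL it is invoked while the event is
 * being recorded, modelling an interrupt that arrives at that moment.
 */
static bool model_trace(struct tracer_model *t, irq_fn irq)
{
	unsigned int bit = 1u << (model_in_interrupt(t) ? CTX_IRQ : CTX_NORMAL);

	if (t->recursion_bits & bit) {
		t->dropped++;           /* looks like the tracer tracing itself */
		return false;
	}

	t->recursion_bits |= bit;
	if (irq)
		irq(t);
	t->recorded++;
	t->recursion_bits &= ~bit;
	return true;
}

/* Same ordering as the quoted __irq_enter(): traced call first, offset second. */
static void model_irq_entry(struct tracer_model *t)
{
	model_trace(t, NULL);                 /* account_irq_enter_time(): dropped */
	t->preempt_count += HARDIRQ_OFFSET;   /* in_interrupt() is true from here */
	model_trace(t, NULL);                 /* traced call in the handler: recorded */
	t->preempt_count -= HARDIRQ_OFFSET;
}

/* One event in normal context, interrupted while it is being recorded. */
static struct tracer_model model_run(void)
{
	struct tracer_model t = {0};

	model_trace(&t, model_irq_entry);
	return t;
}
```

In this model, model_run() ends with two recorded events and one dropped event: the dropped one is the call made before the offset is added, which matches the account_irq_enter_time() behaviour described above. Anything that runs even earlier in the entry path would be in the same position, which is the motivation for the question about setting a flag at the very start of the handler.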