On Tue, 23 May 2017 14:10:09 -0700
"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

> On Tue, May 23, 2017 at 04:38:53PM -0400, Steven Rostedt wrote:
> > On Tue, 23 May 2017 13:00:35 -0700
> > "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > > > Unfortunately, it does not work, as I should have known ahead
> > > > > of time from the dyntick-idle experience.  Not all context
> > > > > switches go through context_switch().  :-/
> > > >
> > > > Wait. What context switch doesn't go through a context switch?
> > > > Or do you mean a user/kernel context switch?
> > >
> > > I mean that putting printk() before and after the call to
> > > context_switch() can show tasks switching out twice without
> > > switching in and vice versa.  No sign of lost printk()s, and I
> > > also confirmed this behavior using a flag in task_struct.
> >
> > I hope you meant trace_printk()s' as printk is a huge overhead and
> > can cause side effects.
>
> Not so much during boot.  But actually, I meant to ask you about
> that...
>
> From what I can see from the ftrace documentation, booting with something
> like this:
>
> 	ftrace=function
> 	ftrace_filter=tasks_rcu_qs,tasks_rcu_qs_enter,tasks_rcu_qs_exit

After the machine is booted, can you make sure those exist for tracing?

 # grep rcu_qs /sys/kernel/debug/tracing/available_filter_functions

>
> Should enable ftrace, but only on the three functions called out.
> But when I try this, I get the following in dmesg:
>
> [    1.506171] ftrace bootup tracer 'function' not registered

Can you send me your config.

-- Steve

>
> And I don't get anything from ftrace_dump() later on.
>
> What am I doing wrong here?  (Event tracing has worked for me in the
> past from the boot line, but I was lazy so just fell back to printk().
> And I didn't think of trace_printk().)
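
The mismatch described above (a task switching out twice without
switching in, or vice versa) can be checked mechanically once the
printk()/trace_printk() output is in hand. A minimal sketch in Python,
assuming the log has already been parsed into (task, direction) pairs
in log order. The function name and the event format are illustrative
only and do not come from this thread or from the kernel:

```python
def find_unpaired_switches(events):
    """Flag events that repeat the same task's previous direction.

    events: iterable of (task, kind) tuples in log order, where kind is
    "out" (logged before context_switch()) or "in" (logged after it).
    Returns a list of (index, task, kind) for every event whose kind
    matches the previous event seen for that task.
    """
    last_kind = {}   # task -> kind of the most recent event for that task
    anomalies = []
    for index, (task, kind) in enumerate(events):
        if kind not in ("in", "out"):
            raise ValueError(f"unknown event kind: {kind!r}")
        if last_kind.get(task) == kind:
            anomalies.append((index, task, kind))
        last_kind[task] = kind
    return anomalies


# Hypothetical parsed log: task B switches out twice in a row and
# task C switches in twice in a row.
log = [
    ("A", "out"), ("A", "in"),
    ("B", "out"), ("B", "out"),
    ("A", "out"),
    ("C", "in"), ("C", "in"),
]
print(find_unpaired_switches(log))
```

A check like this only reports repeated directions per task. It cannot
tell a lost log line from a switch that really bypassed the hook, so it
would need the same cross-check Paul mentions (a flag in task_struct).
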
>
> > > One way that this can happen on some architectures is via the
> > > "helper" mechanism, where the task sleeps normally, but where a
> > > later interrupt or exception takes on its context "behind the
> > > scenes" in the arch code.  This is what messed up my attempt to
> > > use a simple interrupt-nesting counter for RCU dynticks some
> > > years back.  What I counted on there was that the idle loop would
> > > never do that sort of thing, so I could zero the count when
> > > entering idle from process context.
> > >
> > > But I have not yet found a similar trick for counting voluntary
> > > context switches.
> > >
> > > I also tried making context_switch() look like a momentary
> > > quiescent state, but of course that means that tasks that block
> > > forever also block the grace period forever.  At which point, I
> > > need to scan the task list to find them.  And that pretty much
> > > brings me back to the current RCU-tasks implementation.  :-/
> >
> > Nothing should block in a preempted state forever, and if it does,
> > that means we want to wait forever. Because it could be preempted
> > on the trampoline.
>
> Blocking in a preempted state is not the problem here.  Given that
> the obvious hooks don't seem to be catching all of the switch-to and
> switch-from events, blocking forever in a not-preempted state is
> the problem.  I either need some way to see all of the switch-from
> and switch-to events (and the ways I can see to do this have
> patch-size and maintainability issues), or I need to go back to
> scanning the task list.
>
> And of course, all of the approaches that update state upon context
> switch are slowing down a fastpath for the benefit of a slowpath,
> which is not necessarily all that good of a thing.
>
> 							Thanx, Paul
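
To make the trade-off above concrete, here is a toy model in Python of
why treating context_switch() as a momentary quiescent state is not
enough on its own, and why a task-list scan ends up being needed. It is
only a sketch under stated assumptions: the task states, the function
names and the data layout are all hypothetical, and this is not how
RCU-tasks is implemented in the kernel.

```python
def holdouts_hook_only(tasks, switched_during_gp):
    """Tasks still blocking the grace period when the only source of
    quiescent states is a hook on the context-switch path.

    tasks: dict mapping task name -> state at grace-period start
           ("running", "preempted" or "blocked"); hypothetical model.
    switched_during_gp: set of task names seen by the hook since the
           grace period started.
    """
    return {name for name in tasks if name not in switched_during_gp}


def holdouts_with_scan(tasks, switched_during_gp):
    """Same as above, but a scan of the task list also clears tasks that
    were already voluntarily blocked when the grace period started.
    Preempted tasks stay on the list, since they might have been
    preempted on the trampoline."""
    return {
        name
        for name, state in tasks.items()
        if name not in switched_during_gp and state != "blocked"
    }


# Hypothetical snapshot: "sleeper" blocked before the grace period began
# and never runs again, so it never passes through the hook.
tasks = {"worker": "running", "sleeper": "blocked", "hog": "preempted"}
switched = {"worker"}

print(sorted(holdouts_hook_only(tasks, switched)))   # sleeper stalls the GP
print(sorted(holdouts_with_scan(tasks, switched)))   # only hog remains
```

In this model the hook-only variant never clears "sleeper", which is the
"blocks the grace period forever" case. The scan variant removes it but
pays for a walk over every task, which is the cost being weighed above.
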