On Fri, Oct 22, 2021 at 06:45:14PM +0200, Peter Zijlstra wrote: > On Fri, Oct 22, 2021 at 09:25:02AM -0700, Kees Cook wrote: > > On Fri, Oct 22, 2021 at 05:09:35PM +0200, Peter Zijlstra wrote: > > > /** > > > * stack_trace_save_tsk - Save a task stack trace into a storage array > > > * @task: The task to examine > > > @@ -135,7 +142,6 @@ EXPORT_SYMBOL_GPL(stack_trace_save); > > > unsigned int stack_trace_save_tsk(struct task_struct *tsk, unsigned long *store, > > > unsigned int size, unsigned int skipnr) > > > { > > > - stack_trace_consume_fn consume_entry = stack_trace_consume_entry_nosched; > > > struct stacktrace_cookie c = { > > > .store = store, > > > .size = size, > > > @@ -143,11 +149,8 @@ unsigned int stack_trace_save_tsk(struct > > > .skip = skipnr + (current == tsk), > > > }; > > > > > > - if (!try_get_task_stack(tsk)) > > > - return 0; > > > + task_try_func(tsk, try_arch_stack_walk_tsk, &c); > > > > Pardon my thin understanding of the scheduler, but I assume this change > > doesn't mean stack_trace_save_tsk() stops working for "current", right? > > In trying to answer this for myself, I couldn't convince myself what value > > current->__state have here. Is it one of TASK_(UN)INTERRUPTIBLE ? > > current really shouldn't be using stack_trace_save_tsk(), and no you're > quite right, it will not work for current, irrespective of ->__state, > current will always be ->on_rq. Heh, we raced to say the same thing. :) > I started auditing stack_trace_save_tsk() users a few days ago, but > didn't look for this particular issue. I suppose I'll have to start over > with that. FWIW, this shape of thing was one of the reasons I wanted to split arch_stack_walk() into separate: * arch_stack_walk_current() * arch_stack_walk_current_regs() * arch_stack_walk_blocked_task() ... with similar applying here, since otherwise people won't consider the distinction between current / !current at the caller level, leading to junk like this. Thanks, Mark.