On Wed, May 28, 2014 at 12:18:32PM -0400, Steven Rostedt wrote: > On Wed, 28 May 2014 15:53:58 +0900 > Minchan Kim <minchan@xxxxxxxxxx> wrote: > > > While I played with my own feature(ex, something on the way to reclaim), > > kernel went to oops easily. I guessed reason would be stack overflow > > and wanted to prove it. > > > > I found stack tracer which would be very useful for me but kernel went > > oops before my user program gather the information via > > "watch cat /sys/kernel/debug/tracing/stack_trace" so I couldn't get an > > stack usage of each functions. > > > > What I want was that emit the kernel stack usage when kernel goes oops. > > > > This patch records callstack of max stack usage into ftrace buffer > > right before Oops and print that information with ftrace_dump_on_oops. > > At last, I can find a culprit. :) > > > > This is not dependent on patch 2/2, nor is 2/2 dependent on this patch, > I'll review this as if 2/2 does not exist. Yeb, Thanks! > > > > Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx> > > --- > > kernel/trace/trace_stack.c | 32 ++++++++++++++++++++++++++++++-- > > 1 file changed, 30 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c > > index 5aa9a5b9b6e2..5eb88e60bc5e 100644 > > --- a/kernel/trace/trace_stack.c > > +++ b/kernel/trace/trace_stack.c > > @@ -51,6 +51,30 @@ static DEFINE_MUTEX(stack_sysctl_mutex); > > int stack_tracer_enabled; > > static int last_stack_tracer_enabled; > > > > +static inline void print_max_stack(void) > > +{ > > + long i; > > + int size; > > + > > + trace_printk(" Depth Size Location" > > + " (%d entries)\n" > > Please do not break strings just to satisfy that silly 80 character > limit. Even Linus Torvalds said that's pretty stupid. I just copied existing code from trace_stack.c. Okay, I will fix. :) > > Also, do not use trace_printk(). It is not made to be included in a > production kernel. It reserves special buffers to make it as fast as > possible, and those buffers should not be created in production > systems. In fact, I will probably add for 3.16 a big warning message > when trace_printk() is used. > > Since this is a bug, why not just use printk() instead? Thanks for the info. I will use printk(KERN_EMERG). > > BTW, wouldn't this this function crash as well if the stack is already > bad? It wasn't crashed until code has data from corrupted threadinfo. > > -- Steve > > > + " ----- ---- --------\n", > > + max_stack_trace.nr_entries - 1); > > + > > + for (i = 0; i < max_stack_trace.nr_entries; i++) { > > + if (stack_dump_trace[i] == ULONG_MAX) > > + break; > > + if (i+1 == max_stack_trace.nr_entries || > > + stack_dump_trace[i+1] == ULONG_MAX) > > + size = stack_dump_index[i]; > > + else > > + size = stack_dump_index[i] - stack_dump_index[i+1]; > > + > > + trace_printk("%3ld) %8d %5d %pS\n", i, stack_dump_index[i], > > + size, (void *)stack_dump_trace[i]); > > + } > > +} > > + > > static inline void > > check_stack(unsigned long ip, unsigned long *stack) > > { > > @@ -149,8 +173,12 @@ check_stack(unsigned long ip, unsigned long *stack) > > i++; > > } > > > > - BUG_ON(current != &init_task && > > - *(end_of_stack(current)) != STACK_END_MAGIC); > > + if ((current != &init_task && > > + *(end_of_stack(current)) != STACK_END_MAGIC)) { > > + print_max_stack(); > > + BUG(); > > + } > > + > > out: > > arch_spin_unlock(&max_stack_lock); > > local_irq_restore(flags); > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@xxxxxxxxx. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a> -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>