On Tue, Mar 20, 2018 at 8:44 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote: > On Tue, 20 Mar 2018, Daniel Vetter wrote: > >> Sometimes it's really easy to know which object has gone boom and >> where the offending code is, and sometimes it's really hard. One case >> we're trying to hunt down is when module unload catches a live debug >> object, with a module with lots of them. >> >> Capture a full stack trace from debug_object_init() and dump it when >> there's a problem. >> >> FIXME: Should we have a separate Kconfig knob for the backtraces, >> they're quite expensive? Atm I'm just selecting it for the general >> debug object stuff. > > Just make it boot time enabled. We really want to be able to switch it off. > >> +#define ODEBUG_STACKDEPTH 32 > > Do we really need it that deep? I doubt that. Entirely arbitrary, I just needed this to start hunting a rare crash we're seeing in CI and stitched something together. I agree we probably don't need that much. >> +static noinline depot_stack_handle_t save_stack(struct debug_obj *obj) >> +{ >> + unsigned long entries[ODEBUG_STACKDEPTH]; >> + struct stack_trace trace = { >> + .entries = entries, >> + .max_entries = ODEBUG_STACKDEPTH, >> + .skip = 2 >> + }; >> + >> + save_stack_trace(&trace); >> + if (trace.nr_entries != 0 && >> + trace.entries[trace.nr_entries-1] == ULONG_MAX) >> + trace.nr_entries--; >> + >> + /* May be called under spinlock, so avoid sleeping */ >> + return depot_save_stack(&trace, GFP_NOWAIT); > > I really don't like the idea of unconditional allocations in that code > path. We went great length to make it perform halfways sane with the > pool. Though we should actually have per cpu pools to avoid the lock > contention, but thats a different problem. > > I'd rather make the storage a fixed size allocation which is appended > to the debug object. Something like this: > > struct debug_obj_trace { > struct stack_trace trace; > unsigned long entries[ODEBUG_STACKDEPTH]; > }; > > struct debug_obj { > unsigned int astate; > void *object; > struct debug_obj_descr *descr; > struct debug_obj_trace trace[0]; > }; > > void __init debug_objects_mem_init(void) > { > unsigned long objsize = sizeof (struct debug_obj); > > if (!debug_objects_enabled) > return; > > if (debug_objects_trace_stack) > objsize += sizeof(struct debug_obj_trace); > > obj_cache = kmem_cache_create("debug_objects_cache", objsize, 0, > SLAB_DEBUG_OBJECTS, NULL); > > __debug_object_init(void *addr, struct debug_obj_descr *descr, int onstack) > { > .... > case ODEBUG_STATE_NONE: > case ODEBUG_STATE_INIT: > case ODEBUG_STATE_INACTIVE: > if (debug_object_trace_stack) { > obj->trace[0].trace.entries = obj->trace[0].entries; > save_stack_trace(&trace[0].trace); > } > > That means we need to size the static objects which are used during early > boot with stack under the assumption that stack tracing is enabled, but > that's simple enough. > > Hmm? Yeah looks reasonable, and gives us an easy Kconfig/boot-time option to enable/disable it. I'm a bit swamped, but I'll try to respin. Thanks very much for the look and suggesting a cleaner integration approach - the allocations and recursion into the stack depot really did not result in clean code. Thanks, Daniel > Thanks, > > tglx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx