On 06/01/2017, 02:17 PM, Peter Zijlstra wrote:
> On Thu, Jun 01, 2017 at 06:58:20AM -0500, Josh Poimboeuf wrote:
>>> Being able to generate more optimal code in the hottest code paths of the kernel
>>> is the _real_, primary upstream kernel benefit of a different debuginfo method -
>>> which has to be weighed against the pain of introducing a new unwinder. But this
>>> submission does not talk about that aspect at all, which should be fixed I think.
>>
>> Actually I devoted an entire one-sentence paragraph to performance in
>> the documentation:
>>
>>   The simpler debuginfo format also enables the unwinder to be relatively
>>   fast, which is important for perf and lockdep.
>>
>> But I'll try to highlight that a little more.
>
> That's relative to a DWARF unwinder. It doesn't appear to be possible to
> get anywhere near a frame-pointer unwinder due to having to do this
> log(n) lookup for every single frame.

This is ~ 20 times faster than my DWARF unwinder by a quick measurement
(20000 calls to save_stack_trace via single vfs_write).

perf profile, if you care:

__save_stack_trace
|
|--65.89%--unwind_next_frame
|          |
|          |--53.64%--__undwarf_lookup
|          |
|           --5.30%--deref_stack_reg
|                     |
|                      --2.32%--stack_access_ok
|
|--24.17%--__unwind_start
|          |
|          |--21.52%--unwind_next_frame
|          |          |
|          |          |--14.24%--__undwarf_lookup
|          |          |
|          |           --2.98%--deref_stack_reg
|          |                     |
|          |                      --1.32%--stack_access_ok
|          |
|           --1.32%--get_stack_info
|                     |
|                      --0.66%--in_task_stack
|
|--3.31%--unwind_get_return_address
|          __kernel_text_address
|          |
|          |--0.99%--is_ftrace_trampoline
|          |
|          |--0.99%--__is_insn_slot_addr
|          |          |
|          |           --0.66%--__rcu_read_unlock
|          |
|           --0.66%--is_bpf_text_address
|
 --1.66%--save_stack_address

--
js
suse labs