On Tue, Mar 15, 2022 at 09:37:46AM -0700, Stephen Brennan wrote:
> Yonghong Song <yhs@xxxxxx> writes:
> > On 3/14/22 12:09 AM, Shung-Hsi Yu wrote:
> >> On Wed, Mar 09, 2022 at 03:20:47PM -0800, Stephen Brennan wrote:
> >>> I've been recently learning about BTF with a keen interest in using it
> >>> as a fallback source of debug information. On the face of it, Linux
> >>> kernels these days have a lot of introspection information. BTF provides
> >>> information about types. kallsyms provides information about symbol
> >>> locations. ORC allows us to reliably unwind stack traces. So together,
> >>> these could enable a debugger (either postmortem, or live) to do a lot
> >>> without needing to read the (very large) DWARF debuginfo files. For
> >>> example, we could format backtraces with function names, we could
> >
> > For backtraces with function names, you probably still need ksyms since
> > BTF won't encode address => symbol translation.
>
> Yes, kallsyms is definitely required in this scheme. In practice, it
> seems very common for distributions to be compiled not just with
> CONFIG_KALLSYMS, but CONFIG_KALLSYMS_ALL.
>
> Kallsyms is critical for mapping names to addresses (and vice versa).
>
> >>> pretty-print global variables and data structures, etc. This is nice
> >
> > This indeed is a potential use case.
> > We discussed this during adding per-cpu
> > global variables. Ultimately we just added per-cpu global variables
> > since we didn't have a use case or request for other global variables.
> > But I still would like to know beyond this whether you have other needs
> > which BPF may or may not help. It would be good to know since if
> > ultimately you still need dwarf, then it might be undesirable to
> > add general global variables to BTF.
>
> I think that kallsyms, BTF, and ORC together will be enough to provide a
> lite debugging experience.
> Some things will be missing:
> - mapping backtrace addresses to source code lines

So, BTF has provisions for that, and it's present in the eBPF programs.
perf annotate uses it, see symbol__disassemble_bpf() in
tools/perf/util/annotate.c, which goes like:

	struct bpf_prog_linfo *prog_linfo = NULL;

	info_node = perf_env__find_bpf_prog_info(dso->bpf_prog.env,
						 dso->bpf_prog.id);
	if (!info_node) {
		ret = SYMBOL_ANNOTATE_ERRNO__BPF_MISSING_BTF;
		goto out;
	}
	info_linear = info_node->info_linear;
	sub_id = dso->bpf_prog.sub_id;

	info.buffer = (void *)(uintptr_t)(info_linear->info.jited_prog_insns);
	info.buffer_length = info_linear->info.jited_prog_len;

	/* line info is optional, only use it when the program carries some */
	if (info_linear->info.nr_line_info)
		prog_linfo = bpf_prog_linfo__new(&info_linear->info);

	addr = pc + ((u64 *)(uintptr_t)(info_linear->info.jited_ksyms))[sub_id];
	count = disassemble(pc, &info);

	/* find the line info record covering this JITed address */
	if (prog_linfo)
		linfo = bpf_prog_linfo__lfind_addr_func(prog_linfo,
							addr, sub_id,
							nr_skip);

	/* line_off is an offset into the BTF string section */
	if (linfo && btf) {
		srcline = btf__name_by_offset(btf, linfo->line_off);
		nr_skip++;
	} else
		srcline = NULL;

etc.

Having this for the kernel proper is thus doable, but then we go on
making BTF info grow. Perhaps this could be optional: distros or
appliances wanting a kernel with this extra info would add it, and
tools would use it if available?

> - intelligent stack frame information from DWARF CFI (e.g.
>   register/variable values)
> - probably other things, I'm not a DWARF expert.
>
> However, I do have two interesting branches of drgn which demonstrate
> the utility of just BTF+kallsyms:
>
> 1. https://github.com/osandov/drgn/pull/162
> 2. https://github.com/brenns10/drgn/tree/kallsyms_plus_btf
>
> #1 adds preliminary BTF support, and #2 adds basic kallsyms support,
> building on #1. Finally, I have some unpublished patches which add some
> symbols into vmcoreinfo, which help us locate kallsyms info. From there,
> drgn is able to take a core dump, and lookup symbols and get their
> corresponding type info!
>
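
To make the kallsyms half of this concrete, here is a minimal sketch of
the address => symbol step a debugger needs before it can print a
backtrace as "function+offset". It is not taken from drgn or from the
kernel: struct ksym, ksym_lookup() and the sample_syms table (names and
addresses alike) are made up for illustration, and it assumes the
symbols have already been loaded and sorted by address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory symbol entry; the table must be sorted by addr. */
struct ksym {
	uint64_t addr;
	const char *name;
};

/* Made-up symbols and addresses, for illustration only. */
static const struct ksym sample_syms[] = {
	{ 0xffffffff81000000ULL, "func_a" },
	{ 0xffffffff81000100ULL, "func_b" },
	{ 0xffffffff81000400ULL, "func_c" },
};

/*
 * Binary search for the last symbol whose address is <= addr.
 * Returns NULL when addr lies below the first symbol. The caller
 * computes the offset as addr - sym->addr.
 */
static const struct ksym *ksym_lookup(const struct ksym *tab, size_t n,
				      uint64_t addr)
{
	size_t lo = 0, hi = n;

	assert(tab || n == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tab[mid].addr <= addr)
			lo = mid + 1;	/* candidate, look further right */
		else
			hi = mid;	/* symbol starts after addr */
	}
	return lo ? &tab[lo - 1] : NULL;
}
```

With something like this, a return address taken from an ORC unwind
resolves to a name via ksym_lookup(), and the name is then the key for
finding the matching function or variable type in BTF. Symbol sizes are
not tracked here, so an address past the end of the last symbol still
resolves to that symbol.
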
> The only real blocker I see here is that the BTF data is mainly limited
> to functions, so most of what you're doing is looking up function names
> and viewing their signatures :)
>
> >>> given that depending on your distro, it might be tough to get debuginfo,
> >>> and it is quite large to download or install.
> >>>
> >>> As I've worked toward this goal, I discovered that while the
> >>> BTF_KIND_VAR exists [1], the BTF included in the core kernel only has
> >>> declarations for percpu variables. This makes BTF much less useful for
> >>> this (admittedly odd) use case. Without a way to bind a name found in
> >>> kallsyms to its type, we can't interpret global variables. It looks like
> >>> the restriction for percpu-only variables is baked into the pahole BTF
> >>> encoder [2].
> >>>
> >>> [1]: https://www.kernel.org/doc/html/latest/bpf/btf.html#btf-kind-var
> >>> [2]: https://github.com/acmel/dwarves/blob/master/btf_encoder.c
> >>>
> >>> I wonder what the BPF / BTF community's thoughts are on including more
> >>> of these global variable declarations? Perhaps behind a
> >>> CONFIG_DEBUG_INFO_BTF_ALL, like how kallsyms does it? I'm aware that
> >
> > Currently on my local machine, the vmlinux BTF's size is 4.2MB and
> > adding 1MB would be a big increase. CONFIG_DEBUG_INFO_BTF_ALL is a good
> > idea. But we might be able to just add global variables without this
> > new config if we have strong use case.
>
> And unfortunately 1MiB is really just a shot in the dark, guessing
> around 70k variables with no string data.

Maybe we can have a separate BTF file with all this extra info that
could be fetched from somewhere, keyed by build-id, as is now possible
with debuginfod and DWARF?

> I'd love to use kallsyms to avoid adding new strings into BTF. If the
> "all variables BTF" config added a dependency on "CONFIG_KALLSYMS_ALL",
> then we could use the BTF "kind_flag" to indicate that string values
> should be looked up in the kallsyms table, not the BTF strings section.
> This could even be used to reduce the string footprint for BTF
> function names.
>
> Of course it's a more complex change to dwarves :(
>
> >>> each declaration costs at least 16 bytes of BTF records, plus the
> >>> strings and any necessary type data. The string cost could be mitigated
> >>> by allowing "name_off" to refer to the kallsyms offset for variable or
> >>> function declaration. But the additional records could cost around 1MiB
> >>> for common distribution configurations.
> >>>
> >>> I know this isn't the designed use case for BTF, but I think it's very
> >>> exciting.
> >>
> >> I've been wondering about the same (possibility of using BTF for postmortem
> >> debugging without debuginfo), though not to the extent that you've
> >> researched.
> >>
> >> I find the idea exciting as well, and quite useful for distros where the
> >> kernel package changes so often that the debuginfo package may be long
> >> gone by the time a crash dump for such kernel is captured.
> >
> > I would love to use BTF (including global variables in BTF) for crash
> > dump. But I suspect we may still have some gaps. Maybe you can
> > explore a little bit more on this?
>
> Hopefully my above explanation gives more context here. There is code
> (not production-ready) which can make use of these features together.
> The next step for me has been trying to get the dwarves/pahole BTF
> encoder to output *all* functions but I've hit some issues with it. If I
> can get that to work, then I can present a full demo of these pieces
> working together and we can be confident that there are no gaps.
>
> Maybe this is a topic worth discussing at LSF/MM/BPF conference? Though
> it's quite late for that...
>
> Thanks,
> Stephen
>
> >> Shung-Hsi
> >>
> >>> Thanks for your attention!
> >>> Stephen

-- 

- Arnaldo