Re: [LSF/MM/BPF TOPIC] Batteries-included symbolization with blazesym

Hi Daniel,

On Mon, Feb 27, 2023 at 09:34:30PM +0000, Daniel Müller wrote:
> Hi Daniel,
> 
> On Mon, Feb 27, 2023 at 01:07:48PM -0700, Daniel Xu wrote:
> > On Mon, Feb 27, 2023 at 07:34:56PM +0000, Daniel Müller wrote:
> > > Symbolization of addresses is a commonly encountered problem, maybe most so in
> > > the context of BPF and tracing with the capturing of stack traces. Perhaps
> > > superficially straightforward-looking, there are a variety of considerations and
> > > intricacies, such as:
> > > - different formats/standards (e.g., ELF symbol information, DWARF, GSYM) cater
> > >   to different use cases and require vastly different steps to work with
> > >   - on top of that, even if a library such as libelf or libdwarf is relied on,
> > >     plenty of format specific details need to be known to symbolize addresses
> > >     properly
> > > - discovery of symbolization sources (e.g., DWARF debug files)
> > > - symbolization trade-offs (performance, memory usage)
> > > - system-specific details and corner cases
> > > 
> > > We are working on blazesym [0], a library that aims to provide users with a
> > > batteries-included experience for symbolizing addresses (but also the reverse:
> > > mapping symbols to addresses).
> > > 
> > > We would like to provide a brief overview of the library and its goals and then
> > > open up for discussion. Some topics we are specifically interested in
> > > understanding better:
> > > - What are current issues with symbolization that would be great to support?
> > > - Does the usage of Rust pose a problem in your context? (C bindings are
> > >   available, but a Rust toolchain is required for building; are pre-built
> > >   binaries and packages for common distributions sufficient for your use cases?)
> > > 
> > > In general, we'd be interested in hearing your use cases and in discussing
> > > whether blazesym is a fit or could be made to work.
> > 
> > I didn't look super close at blazesym yet, but was wondering if it would
> > support a use case I have in mind.
> > 
> > Context: it's tricky to determine why a packet was dropped by the kernel.
> > kfree_skb_reason() with caller address in `location` is a good start but
> > we can do better I think.
> > 
> > The issue is the call stack alone is not enough detail. I want to see
> > all the branches taken in the case a single call frame has multiple ways
> > to drop.
> > 
> > Vague idea is to use the recent LBR work (also haven't looked hard yet,
> > so this may not be possible) to take LBR stack at
> > `tracepoint:skb:kfree_skb` tracepoint. Then map the branches to line
> > numbers.
> > 
> > So my question is this: can/will blazesym be able to map kernel
> > addresses to line numbers / file names?
> 
> Blazesym should be able to help with the symbolization aspect, yes. That is, it
> can convert the addresses you captured into symbol name + source file + line
> information as you asked for (you may need DWARF debug information for anything
> beyond mere symbol names). In general, the library is able to handle both user
> space and kernel addresses.

Awesome, sounds great. After looking slightly more carefully: how about
split debug info support and debuginfod support? It is extremely unlikely
that anybody ships production kernels with debug symbols, but a debuginfod
service is more likely.

> It is not designed, however, to help you capture those addresses. So how you get
> them (e.g., using LBR as you mentioned) is up to you.

Makes sense.

Thanks,
Daniel


