On Thu, Mar 16, 2023 at 10:50 AM Ian Rogers <irogers@xxxxxxxxxx> wrote:
>
> On Thu, Mar 16, 2023 at 10:35 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > On Thu, Mar 16, 2023 at 06:01:40PM +0100, Jiri Olsa wrote:
> > > hi,
> > > this patchset adds build id object pointer to struct file object.
> > >
> > > We have several use cases for build id to be used in BPF programs
> > > [2][3].
> >
> > Yes, you have use cases, but you never answered the question I asked:
> >
> > Is this going to be enabled by every distro kernel, or is it for special
> > use-cases where only people doing a very specialised thing who are
> > willing to build their own kernels will use it?
> >
> > Saying "hubble/tetragon" doesn't answer that question. Maybe it does
> > to you, but I have no idea what that software is.
> >
> > Put it another way: how does this make *MY* life better? Literally me.
> > How will it affect my life?
>
> So at Google we use build IDs for all profiling, I believe Meta is the
> same but obviously I can't speak for them. For BPF program stack

Yep, Meta is also capturing stack traces with build IDs, if possible.
Build IDs help with profiling short-lived processes that exit before
the profiling session is done and before user-space tooling is able to
collect /proc/<pid>/maps contents (which is what Ian is referring to
here). Build IDs also make it possible to offload more of the expensive
stack symbolization process (converting raw memory addresses into
human-readable function + offset + file path + line number information)
to dedicated remote servers, because preprocessed DWARF/ELF information
can be cached and reused based on build ID. I believe the perf tool
also uses build IDs, so any tool relying on perf to capture full and
complete profiling data for system-wide performance analysis would
benefit as well.

Generally speaking, there is a whole ecosystem built on top of the
assumption that binaries have build IDs, and profiling tooling is able
to provide more value if those build IDs are collected more reliably.
This ultimately benefits the entire open-source ecosystem by allowing
people to spot issues (not necessarily just performance; it could be
correctness issues as well) more reliably, fix them, and benefit every
user.

> traces, using build ID + offset stack traces is preferable to perf's
> whole system synthesis of mmap events based on data held in
> /proc/pid/maps. Individual stack traces are larger, but you avoid the
> ever growing problem of coming up with some initial virtual memory
> state that will allow you to identify samples.
>
> This doesn't answer the question about how this will help you, but I
> expect over time you will see scalability issues and also want to use
> tools assuming build IDs are present and cheap to access.
>
> Thanks,
> Ian