On Mon, Jan 24, 2022 at 2:43 PM Hao Luo <haoluo@xxxxxxxxxx> wrote:
>
> Dear BPF experts,
>
> I'm working on collecting some kernel performance data using BPF
> tracing prog. Our performance profiling team wants to associate the
> data with user stack information. One of the requirements is to
> reliably get BuildIDs from bpf_get_stackid() and other similar helpers
> [1].
>
> As part of an early investigation, we found that there are a couple
> issues that make bpf_get_stackid() much less reliable than we'd like
> for our use:
>
> 1. The first page of many binaries (which contains the ELF headers and
> thus the BuildID that we need) is often not in memory. The failure of
> find_get_page() (called from build_id_parse()) is higher than we would
> want.

Our top use case of bpf_get_stack() is called from NMI, so there isn't
much we can do. Maybe it is possible to improve it by changing the
layout of the binary and the libraries? Specifically, if the text is
also in the first page, it is likely to stay in memory?

> 2. When anonymous huge pages are used to hold some regions of process
> text, build_id_parse() also fails to get a BuildID because
> vma->vm_file is NULL.

How did the text get in anonymous memory? I guess it is NOT from JIT?
We had a hack to use transparent huge page for application text. The
hack looks like:

"At run time, the application creates an 8MB temporary buffer and the
hot section of the executable memory is copied to it. The 8MB region in
the executable memory is then converted to a huge page (by way of an
mmap() to anonymous pages and an madvise() to create a huge page), the
data is copied back to it, and it is made executable again using
mprotect()."

If your case is the same (or similar), it can probably be fixed with
CONFIG_READ_ONLY_THP_FOR_FS, and modified user space.

Thanks,
Song
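
P.S. For anyone not familiar with the trick, the remapping sequence
quoted above could look roughly like the sketch below. This is only an
illustration under assumptions, not the code we actually ran:
remap_anon_huge() is a made-up name, and the final_prot parameter is
something I added so the sketch can be tried on an ordinary data region.
The real hack would pass PROT_READ | PROT_EXEC, and would also have to
make sure the code doing the copy does not itself live inside the region
being replaced. After step 2 the range is anonymous memory, so its VMA
has no vm_file, which is the condition described in point 2.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of any library): replace the mapping at
 * [addr, addr + len) with anonymous memory holding the same bytes, ask the
 * kernel for transparent huge pages, then apply final_prot.
 *
 * addr and len must be page aligned and the region must be readable.
 * Returns 0 on success, -1 with errno set on failure.
 */
int remap_anon_huge(void *addr, size_t len, int final_prot)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || addr == NULL || len == 0 ||
        (uintptr_t)addr % (size_t)page != 0 || len % (size_t)page != 0) {
        errno = EINVAL;
        return -1;
    }

    /* 1. Temporary buffer that holds a copy of the region. */
    void *tmp = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (tmp == MAP_FAILED)
        return -1;
    memcpy(tmp, addr, len);

    /* 2. Map anonymous pages over the old range. MAP_FIXED drops whatever
     *    was mapped there before, including any file backing. */
    void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED) {
        int saved = errno;
        munmap(tmp, len);
        errno = saved;
        return -1;
    }
    assert(p == addr); /* MAP_FIXED places the mapping exactly at addr */

    /* 3. Request huge pages. This is advisory, so a failure (for example
     *    THP disabled on this kernel) is deliberately ignored. */
#ifdef MADV_HUGEPAGE
    (void)madvise(addr, len, MADV_HUGEPAGE);
#endif

    /* 4. Copy the data back and restore the wanted protection. */
    memcpy(addr, tmp, len);
    munmap(tmp, len);
    return mprotect(addr, len, final_prot);
}
```

Whether the kernel really backs the range with a huge page depends on
alignment (2MB on x86-64) and on the THP settings of the running system,
so the madvise() result should be treated as a hint only.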