On Fri, Jul 3, 2020 at 1:13 AM Hao Luo <haoluo@xxxxxxxxxx> wrote: > > On SMP systems, the global percpu variables are placed in a special > '.data..percpu' section, which is stored in a segment whose initial > address is set to 0, the addresses of per-CPU variables are relative > positive addresses [1]. > > This patch extracts these variables from vmlinux and places them with > their type information in BTF. More specifically, when BTF is encoded, > we find the index of the '.data..percpu' section and then traverse > the symbol table to find those global objects which are in this section. > For each of these objects, we push a BTF_KIND_VAR into the types buffer, > and a BTF_VAR_SECINFO into another buffer, percpu_secinfo. When all the > CUs have finished processing, we push a BTF_KIND_DATASEC into the > btfe->types buffer, followed by the percpu_secinfo's content. > > In a v5.8-rc3 linux kernel, I was able to extract 288 such variables. > The build time overhead is small and the space overhead is also small. > > Testing: > > - vmlinux size has increased by ~12kb. > Before: > $ readelf -SW vmlinux | grep BTF > [25] .BTF PROGBITS ffffffff821a905c 13a905c 2d2bf8 00 > After: > $ pahole -J vmlinux > $ readelf -SW vmlinux | grep BTF > [25] .BTF PROGBITS ffffffff821a905c 13a905c 2d5bca 00 > > - Common global percpu VARs and DATASEC are found in BTF section. > $ bpftool btf dump file vmlinux | grep runqueues > [14152] VAR 'runqueues' type_id=13778, linkage=global-alloc > > $ bpftool btf dump file vmlinux | grep 'cpu_stopper' > [17582] STRUCT 'cpu_stopper' size=72 vlen=5 > [17601] VAR 'cpu_stopper' type_id=17582, linkage=static > > $ bpftool btf dump file vmlinux | grep ' DATASEC ' > [63652] DATASEC '.data..percpu' size=179288 vlen=288 > > - Tested bpf selftests. > > References: > [1] https://lwn.net/Articles/531148/ > Signed-off-by: Hao Luo <haoluo@xxxxxxxxxx> > --- Looks good, your concern about missing type information doesn't seem to happen in my case. I really-really think we should make name and size checks hard errors, please update code to fail in that case with a descriptive message. With that: Acked-by: Andrii Nakryiko <andriin@xxxxxx> Tested-by: Andrii Nakryiko <andriin@xxxxxx> > Changelog since v4: > - Following Andrii's suggestion, use elf_sym__name and elf_sym__size > to retrieve variable's name and size. If the symbol's name is not > found in strings, insert it. > - Simplified the validation of symbol name. > > Changelog since v3: > - Correct the size of the DATASEC, which was previously 0 and now is > the last var_secinfo's offset + size. > - Add several sanity checks on VARs and DATASEC to ensure that the > generated BTF will not be rejected by verifier. > - Tested through a set of selftests and bpf_tracing programs. > > Changelog since v2: > - Move finding percpu_shndx and extracting symtab into btfe creation, > so we don't have to allocate a new symtab for each CU. > - More debug msg by logging the vars encoded in 'verbose' mode. We > probably don't want to log the symbols that are _not_ encoded, > since that would be too verbose. > - Calculate var offsets using 'addr - shdr.sh_addr', so it could be > generalized to other sections in future. > - Filter out the symbols that are not STT_OBJECT. > - Sort var_secinfos in the DATASEC by their offsets. > - Free 'persec_secinfo' buffer and 'symtab' in btfe deletion. > - Replace the string ".data..percpu" with a constant PERCPU_SECTION. > > Changelog since v1: > - Add a ".data..percpu" DATASEC that encodes the found VARs. > - Use percpu section's shndx to find the symbols that are percpu variables. > - Use the correct type to set VAR's linkage. > > btf_encoder.c | 168 ++++++++++++++++++++++++++++++++++++++++++++++++++ > libbtf.c | 127 ++++++++++++++++++++++++++++++++++++++ > libbtf.h | 12 ++++ > pahole.c | 1 + > 4 files changed, 308 insertions(+) > > diff --git a/btf_encoder.c b/btf_encoder.c > index df16ba0..8953d1b 100644 > --- a/btf_encoder.c > +++ b/btf_encoder.c > @@ -16,8 +16,43 @@ > #include "elf_symtab.h" > #include "btf_encoder.h" > > +#include <ctype.h> /* for isalpha() and isalnum() */ > #include <inttypes.h> > > +/* > + * This corresponds to the same macro defined in > + * include/linux/kallsyms.h > + */ > +#define KSYM_NAME_LEN 128 > + > +static bool btf_name_char_ok(char c, bool first) > +{ > + if (c == '_' || c == '.') > + return true; > + > + return first ? isalpha(c) : isalnum(c); > +} > + > +/* Check wether the given name is valid in vmlinux btf. */ typo: whether > +static bool btf_name_valid(const char *p) > +{ > + const char *limit; > + > + if (!btf_name_char_ok(*p, true)) > + return false; > + [...] > + addr = elf_sym__value(&sym); > + /* > + * Store only those symbols that have allocated space in the percpu section. > + * This excludes the following three types of symbols: > + * > + * 1. __ADDRESSABLE(sym), which are forcely emitted as symbols. > + * 2. __UNIQUE_ID(prefix), which are introduced to generate unique ids. > + * 3. __exitcall(fn), functions which are labeled as exit calls. > + * > + * In addition, the variables defined using DEFINE_PERCPU_FIRST are > + * also not included, which currently includes: > + * > + * 1. fixed_percpu_data > + */ > + if (!addr) > + continue; > + var = hashaddr__find_variable(hash_addr, addr); > + if (var == NULL) > + continue; > + > + sym_name = elf_sym__name(&sym, btfe->symtab); > + if (!btf_name_valid(sym_name)) { > + if (verbose) > + printf("invalid symbol name '%s' for btf.\n", > + sym_name); > + continue; > + } > + name = strings__add(strings, sym_name); > + type = var->ip.tag.type + type_id_off; > + size = elf_sym__size(&sym); > + if (!size) { > + if (verbose) > + printf("zero symbol size for symbol '%s'.\n", > + sym_name); > + continue; > + } Let's please make all the above checks (size and name) hard errors, they shouldn't happen. If they happen, we need to be aware and understand why, before we start ignoring them. I think this is very important. > + > + if (verbose) > + printf("symbol '%s' of address 0x%lx encoded\n", > + sym_name, addr); > + > + /* add a BTF_KIND_VAR in btfe->types */ > + linkage = var->external ? BTF_VAR_GLOBAL_ALLOCATED : BTF_VAR_STATIC; > + btf_var_id = btf_elf__add_var_type(btfe, type, name, linkage); > + if (btf_var_id < 0) { > + err = -1; > + printf("error: failed to encode variable '%s'\n", sym_name); > + break; > + } > + > + /* [...]