Re: [PATCH v5] btf_encoder: Teach pahole to store percpu variables in vmlinux BTF.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 3, 2020 at 1:13 AM Hao Luo <haoluo@xxxxxxxxxx> wrote:
>
> On SMP systems, the global percpu variables are placed in a special
> '.data..percpu' section, which is stored in a segment whose initial
> address is set to 0, the addresses of per-CPU variables are relative
> positive addresses [1].
>
> This patch extracts these variables from vmlinux and places them with
> their type information in BTF. More specifically, when BTF is encoded,
> we find the index of the '.data..percpu' section and then traverse
> the symbol table to find those global objects which are in this section.
> For each of these objects, we push a BTF_KIND_VAR into the types buffer,
> and a BTF_VAR_SECINFO into another buffer, percpu_secinfo. When all the
> CUs have finished processing, we push a BTF_KIND_DATASEC into the
> btfe->types buffer, followed by the percpu_secinfo's content.
>
> In a v5.8-rc3 linux kernel, I was able to extract 288 such variables.
> The build time overhead is small and the space overhead is also small.
>
> Testing:
>
> - vmlinux size has increased by ~12kb.
>   Before:
>    $ readelf -SW vmlinux | grep BTF
>    [25] .BTF              PROGBITS        ffffffff821a905c 13a905c 2d2bf8 00
>   After:
>    $ pahole -J vmlinux
>    $ readelf -SW vmlinux  | grep BTF
>    [25] .BTF              PROGBITS        ffffffff821a905c 13a905c 2d5bca 00
>
> - Common global percpu VARs and DATASEC are found in BTF section.
>   $ bpftool btf dump file vmlinux | grep runqueues
>   [14152] VAR 'runqueues' type_id=13778, linkage=global-alloc
>
>   $ bpftool btf dump file vmlinux | grep 'cpu_stopper'
>   [17582] STRUCT 'cpu_stopper' size=72 vlen=5
>   [17601] VAR 'cpu_stopper' type_id=17582, linkage=static
>
>   $ bpftool btf dump file vmlinux | grep ' DATASEC '
>   [63652] DATASEC '.data..percpu' size=179288 vlen=288
>
> - Tested bpf selftests.
>
> References:
>  [1] https://lwn.net/Articles/531148/
> Signed-off-by: Hao Luo <haoluo@xxxxxxxxxx>
> ---

Looks good, your concern about missing type information doesn't seem
to happen in my case. I really-really think we should make name and
size checks hard errors, please update code to fail in that case with
a descriptive message.

With that:

Acked-by: Andrii Nakryiko <andriin@xxxxxx>
Tested-by: Andrii Nakryiko <andriin@xxxxxx>

> Changelog since v4:
> - Following Andrii's suggestion, use elf_sym__name and elf_sym__size
>   to retrieve variable's name and size. If the symbol's name is not
>   found in strings, insert it.
> - Simplified the validation of symbol name.
>
> Changelog since v3:
> - Correct the size of the DATASEC, which was previously 0 and now is
>   the last var_secinfo's offset + size.
> - Add several sanity checks on VARs and DATASEC to ensure that the
>   generated BTF will not be rejected by verifier.
> - Tested through a set of selftests and bpf_tracing programs.
>
> Changelog since v2:
> - Move finding percpu_shndx and extracting symtab into btfe creation,
>   so we don't have to allocate a new symtab for each CU.
> - More debug msg by logging the vars encoded in 'verbose' mode. We
>   probably don't want to log the symbols that are _not_ encoded,
>   since that would be too verbose.
> - Calculate var offsets using 'addr - shdr.sh_addr', so it could be
>   generalized to other sections in future.
> - Filter out the symbols that are not STT_OBJECT.
> - Sort var_secinfos in the DATASEC by their offsets.
> - Free 'persec_secinfo' buffer and 'symtab' in btfe deletion.
> - Replace the string ".data..percpu" with a constant PERCPU_SECTION.
>
> Changelog since v1:
> - Add a ".data..percpu" DATASEC that encodes the found VARs.
> - Use percpu section's shndx to find the symbols that are percpu variables.
> - Use the correct type to set VAR's linkage.
>
>  btf_encoder.c | 168 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  libbtf.c      | 127 ++++++++++++++++++++++++++++++++++++++
>  libbtf.h      |  12 ++++
>  pahole.c      |   1 +
>  4 files changed, 308 insertions(+)
>
> diff --git a/btf_encoder.c b/btf_encoder.c
> index df16ba0..8953d1b 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -16,8 +16,43 @@
>  #include "elf_symtab.h"
>  #include "btf_encoder.h"
>
> +#include <ctype.h> /* for isalpha() and isalnum() */
>  #include <inttypes.h>
>
> +/*
> + * This corresponds to the same macro defined in
> + * include/linux/kallsyms.h
> + */
> +#define KSYM_NAME_LEN 128
> +
> +static bool btf_name_char_ok(char c, bool first)
> +{
> +       if (c == '_' || c == '.')
> +               return true;
> +
> +       return first ? isalpha(c) : isalnum(c);
> +}
> +
> +/* Check wether the given name is valid in vmlinux btf. */

typo: whether

> +static bool btf_name_valid(const char *p)
> +{
> +       const char *limit;
> +
> +       if (!btf_name_char_ok(*p, true))
> +               return false;
> +

[...]

> +               addr = elf_sym__value(&sym);
> +               /*
> +                * Store only those symbols that have allocated space in the percpu section.
> +                * This excludes the following three types of symbols:
> +                *
> +                *  1. __ADDRESSABLE(sym), which are forcely emitted as symbols.
> +                *  2. __UNIQUE_ID(prefix), which are introduced to generate unique ids.
> +                *  3. __exitcall(fn), functions which are labeled as exit calls.
> +                *
> +                * In addition, the variables defined using DEFINE_PERCPU_FIRST are
> +                * also not included, which currently includes:
> +                *
> +                *  1. fixed_percpu_data
> +                */
> +               if (!addr)
> +                       continue;
> +               var = hashaddr__find_variable(hash_addr, addr);
> +               if (var == NULL)
> +                       continue;
> +
> +               sym_name = elf_sym__name(&sym, btfe->symtab);
> +               if (!btf_name_valid(sym_name)) {
> +                       if (verbose)
> +                               printf("invalid symbol name '%s' for btf.\n",
> +                                      sym_name);
> +                       continue;
> +               }
> +               name = strings__add(strings, sym_name);
> +               type = var->ip.tag.type + type_id_off;
> +               size = elf_sym__size(&sym);
> +               if (!size) {
> +                       if (verbose)
> +                               printf("zero symbol size for symbol '%s'.\n",
> +                                      sym_name);
> +                       continue;
> +               }

Let's please make all the above checks (size and name) hard errors,
they shouldn't happen. If they happen, we need to be aware and
understand why, before we start ignoring them. I think this is very
important.

> +
> +               if (verbose)
> +                       printf("symbol '%s' of address 0x%lx encoded\n",
> +                              sym_name, addr);
> +
> +               /* add a BTF_KIND_VAR in btfe->types */
> +               linkage = var->external ? BTF_VAR_GLOBAL_ALLOCATED : BTF_VAR_STATIC;
> +               btf_var_id = btf_elf__add_var_type(btfe, type, name, linkage);
> +               if (btf_var_id < 0) {
> +                       err = -1;
> +                       printf("error: failed to encode variable '%s'\n", sym_name);
> +                       break;
> +               }
> +
> +               /*

[...]



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux