Re: accessing global and per-cpu vars

On Thu, May 28, 2020 at 1:51 PM Hao Luo <haoluo@xxxxxxxxxx> wrote:
>
> A quick update on this thread.
>
> I came up with a draft patch that fulfills step 1. I added a ".ksym" section for extern vars. libbpf fills in these vars' values by reading /proc/kallsyms at load time. I plan to upload this patch for review together with steps 3 and 4 once I have worked them out.
>
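A minimal sketch of the lookup described in step 1, assuming the usual "address type name" line layout of /proc/kallsyms. This is not libbpf's code; `parse_kallsyms` and `resolve_ksym` are hypothetical names used only for illustration, and the `_text` sample line is made up.

```python
from typing import Dict, Iterable, Optional


def parse_kallsyms(lines: Iterable[str]) -> Dict[str, int]:
    """Build a name -> address map from kallsyms-style lines.

    Each line is expected to look like 'address type name [module]'.
    If a name appears more than once, the first occurrence wins.
    """
    table: Dict[str, int] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        addr, _type, name = fields[0], fields[1], fields[2]
        table.setdefault(name, int(addr, 16))
    return table


def resolve_ksym(table: Dict[str, int], name: str) -> Optional[int]:
    """Return the address recorded for an extern symbol, or None if absent."""
    return table.get(name)


# The first line is taken from this thread; the second is a made-up example.
SAMPLE = [
    "000000000002a160 A bpf_redirect_info",
    "ffffffff81000000 T _text",
]

table = parse_kallsyms(SAMPLE)
print(hex(resolve_ksym(table, "bpf_redirect_info")))  # 0x2a160
print(resolve_ksym(table, "no_such_symbol"))          # None
```

On a live system the same function can be fed `open("/proc/kallsyms")`; note that addresses may read as all zeros without sufficient privileges.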
> Regarding step 2, I have also worked out a patch in pahole that inserts the kernel's percpu vars into BTF. I realized, because step 2 happens at compile time, there is no kallsyms file available to extract symbols, so we have to read the global vars from vmlinux. Currently on v5.7-rc7, I was able to extract 291 percpu vars, static or global. The .BTF size increases from 2d2c10 to 2d4dd0. A clean build on my local workstation increases from 10m13s to 11m24s (wall time). Common global percpu vars can be found in .BTF.

For the humans among us, that's an increase of 8640 bytes, which doesn't
seem like a big deal at all. Have you checked how much it would increase
if you included more than just per-cpu variables?
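As a quick check of that figure, a minimal sketch, assuming the two .BTF sizes quoted above are hexadecimal byte counts:

```python
# .BTF section sizes quoted above, read as hexadecimal byte counts.
btf_before = 0x2D2C10
btf_after = 0x2D4DD0

# Growth in bytes after adding the percpu variables.
delta = btf_after - btf_before
print(delta)  # 8640

# Relative growth, as a percentage of the original section size.
print(f"{100 * delta / btf_before:.2f}%")
```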

Also, I wonder what adds more than a minute to the build process. Is it
all pahole's BTF generation step? If so, why is it so much slower now?

>
> haoluo@haoluo:~/kernel/tip/pkgs/images/boot$ bpftool btf dump file vmlinux-5.7.0-smp-DEV | grep runqueues
>
> [14098] VAR 'runqueues' type_id=13725, linkage=global-alloc
>
> haoluo@haoluo:~/kernel/tip/pkgs/images/boot$ bpftool btf dump file vmlinux-5.7.0-smp-DEV | grep cpu_stopper
>
> [17589] STRUCT 'cpu_stopper' size=72 vlen=5
>
> [17609] VAR 'cpu_stopper' type_id=17589, linkage=global-alloc
>
> Arnaldo, would you please advise on how to upload the pahole patch for review? I am going to polish it a bit and think I can upload it for review.
>
> Thanks,
> Hao
>
> On Tue, May 26, 2020 at 2:04 PM Hao Luo <haoluo@xxxxxxxxxx> wrote:
>>
>> I just did some poking and found the source of the format. TLDR: these letters have the same semantics as in 'nm' output [1]. So we can put the 'A' symbols into BTF first, as these symbols have absolute addresses at runtime and are the safest choice to start with, I think.
>>
>> More details. So during linking for vmlinux, the intermediate obj is passed to nm and its output is used by the kallsyms to generate a .S file [2]. That .S file builds a data blob 'kallsyms_names' in vmlinux [3] which is used to generate /proc/kallsyms [4]. The types of the symbols are carried from the output of nm to the kallsyms_names, mostly untouched. The only exception is, if CONFIG_KALLSYMS_ABSOLUTE_PERCPU is configured, percpu symbols are forced to have absolute addresses.
>>
>> [1] https://linux.die.net/man/1/nm
>> [2] https://github.com/torvalds/linux/blob/master/scripts/link-vmlinux.sh#L168
>> [3] https://github.com/torvalds/linux/blob/master/scripts/kallsyms.c#L446
>> [4] https://github.com/torvalds/linux/blob/master/kernel/kallsyms.c#L115
>>
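To illustrate how the type letter rides along in each kallsyms line, here is a small sketch that tallies symbols by letter and selects those of one type. The helper names are hypothetical. The symbol names come from this thread, but only the bpf_redirect_info address does; the other addresses are made up.

```python
from collections import Counter
from typing import Iterable, List


def count_symbol_types(lines: Iterable[str]) -> Counter:
    """Count kallsyms-style lines per nm-style type letter (second field)."""
    counts: Counter = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 3:
            counts[fields[1]] += 1
    return counts


def symbols_of_type(lines: Iterable[str], letter: str) -> List[str]:
    """Return the names of all symbols whose type letter equals `letter`."""
    names = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 3 and fields[1] == letter:
            names.append(fields[2])
    return names


# Only the bpf_redirect_info address is from the thread; the rest are made up.
SAMPLE_LINES = [
    "000000000002a160 A bpf_redirect_info",
    "0000000000020000 A runqueues",
    "ffffffff82000000 R __per_cpu_offset",
]

print(count_symbol_types(SAMPLE_LINES)["A"])  # 2
print(symbols_of_type(SAMPLE_LINES, "A"))     # ['bpf_redirect_info', 'runqueues']
```

This mirrors the `grep ' A ' /proc/kallsyms | wc -l` check quoted elsewhere in the thread, but keeps the symbol names available for later use.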
>> On Tue, May 26, 2020 at 11:21 AM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>>>
>>> On Tue, May 26, 2020 at 12:58 AM Hao Luo <haoluo@xxxxxxxxxx> wrote:
>>> >
>>> > Hi, Arnaldo and Andrii,
>>> >
>>> > Thanks for taking a look and checking.
>>> >
>>> > On Fri, May 22, 2020 at 7:28 AM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
>>> >>
>>> >> Em Thu, May 21, 2020 at 11:58:47AM -0700, Andrii Nakryiko escreveu:
>>> >> > On Thu, May 21, 2020 at 10:07 AM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>>> >> > > 2. teach pahole to store ' A ' annotated kallsyms into vmlinux BTF as
>>> >> > > BTF_KIND_VAR.
>>> >> > > There are ~300 of them, so should be minimal increase in size.
>>> >> >
>>> >> > I thought we'd do that based on section name? Or we will actually
>>> >> > teach pahole to extract kallsyms from vmlinux image?
>>> >>
>>> >> No need to touch kallsyms:
>>> >>
>>> >>   net/core/filter.c
>>> >>
>>> >>   DEFINE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info);
>>> >>
>>> >>   # grep -w bpf_redirect_info /proc/kallsyms
>>> >>   000000000002a160 A bpf_redirect_info
>>> >>   #
>>> >>   # readelf -s ~acme/git/build/v5.7-rc2+/vmlinux | grep bpf_redirect_info
>>> >>   113637: 000000000002a2e0    32 OBJECT  GLOBAL DEFAULT   34 bpf_redirect_info
>>> >>   #
>>> >>
>>> >> It's in the ELF symtab.
>>> >>
>>> >> [root@quaco ~]# grep ' A ' /proc/kallsyms | wc -l
>>> >> 351
>>> >> [root@quaco ~]# readelf -s ~acme/git/build/v5.7-rc2+/vmlinux | grep "OBJECT  GLOBAL" | wc -l
>>> >> 3221
>>> >> [root@quaco ~]#
>>> >>
>>> >> So ' A ' in kallsyms needs some extra info from the symtab in addition
>>> >> to being OBJECT GLOBAL, checking...
>>> >
>>> >
>>> > After playing around a bit, I found that the 'A' symbols in kallsyms include the per-CPU variables, whether global or local: 'runqueues' is an example of a global one, and 'sched_clock_data' is an example of a local one.
>>> >
>>> > The OBJECT GLOBAL symbols in vmlinux include global variables such as runqueues. They also include symbols that kallsyms annotates with other capital letters, such as 'R' or 'B'. For example, __per_cpu_offset is OBJECT GLOBAL in vmlinux and is annotated as 'R', implying a global const variable.
>>> >
>>> > I think either the vmlinux approach or the kallsyms approach is good enough. I will continue experimenting while working on step 1.
>>> >
>>>
>>> /proc/kallsyms is available at runtime (if configured, of course),
>>> while the vmlinux image might not be available at runtime at all in some
>>> environments. This is one of the reasons BTF is exposed at
>>> runtime through /sys/kernel/btf/vmlinux, instead of just being kept in
>>> the vmlinux image. So I think the kallsyms approach is better and more
>>> reliable.
>>>
>>> As for 'A', 'R', 'B', etc.: can we please look at the source code of
>>> whatever in the kernel defines those letters in ksyms, instead of guessing
>>> based on a subset of symbols? Guessing like this makes me nervous :)
>>>
>>> > Thanks,
>>> > Hao
>>> >
>>> >>
>>> >> > There was step 1.5 (or even 0.5) to see if it's feasible to add more
>>> >> > than just per-CPU variables.
>>> >>
>>> >> - Arnaldo



