On June 12, 2020 7:16:46 PM GMT-03:00, Hao Luo <haoluo@xxxxxxxxxx> wrote:
>On Fri, Jun 12, 2020 at 3:01 PM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
>> On Fri, Jun 12, 2020 at 2:54 PM Hao Luo <haoluo@xxxxxxxxxx> wrote:
>> >
>> > Further updates:
>> >
>> > Previously the in-kernel bpf verifier rejected the enhanced vmlinux
>> > because I set the "size" field of datasec to 0, which is obviously
>> > forbidden by the bpf kernel verifier. After I adjusted it to the last
>> > var_secinfo's offset + size, it got loaded successfully. In addition, there
>> > are a few more sanity checks in the verifier on DATASEC and VAR's meta
>> > format (e.g. type size, variable name, etc.), which I am going to port into
>> > btf_encoder to be 100% safe. With these checks, the "(anon)" vars seen by
>> > Arnaldo should be gone. I am currently running through a set of tp_btf,
>> > fentry and fexit programs on the enhanced vmlinux and they are looking good
>> > so far. I hope to upload these changes in the next iteration next week.
>>
>> Do you know where those (anon) vars are coming from?
>>
>
>Nah, I am curious too but can't reproduce on my side. It would be helpful
>if Arnaldo could enable the debug msg I put in the patch and let me know
>which cu generates those (anon) vars.

Unsure if I'll be able to do it tomorrow, I'll try.

>
>
>>
>> >
>> > Thanks,
>> > Hao
>> >
>> > On Thu, Jun 11, 2020 at 2:41 PM Hao Luo <haoluo@xxxxxxxxxx> wrote:
>> >>
>> >> I am finally able to get a tp_btf program compiled and tested against
>> >> the generated vmlinux. Unfortunately, the bpf verifier seemed to have
>> >> rejected the vmlinux. I got an error message "in-kernel BTF is malformed".
>> >> I have to work on the bpf verifier first to make it compatible with the
>> >> newly added VARs.
>> >>
>> >> Hao
>> >>
>> >> On Thu, Jun 11, 2020 at 10:59 AM Hao Luo <haoluo@xxxxxxxxxx> wrote:
>> >>>
>> >>> Hi Arnaldo,
>> >>>
>> >>> Sorry for the late reply, I was tied to other stuff on my other work
>> >>> in the last couple of days. I am going to take a closer look today and
>> >>> tomorrow. It seems I had difficulty reproducing in my local environment,
>> >>> maybe due to differences in compiler flags or kconfigs. Could you help me
>> >>> by enabling verbose to see which CU generated those symbols? In the patch I
>> >>> have added debug messages reporting the current CU and symbols that got
>> >>> encoded.
>> >>>
>> >>> Thanks,
>> >>> Hao
>> >>>
>> >>>
>> >>> On Tue, Jun 9, 2020 at 7:58 AM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
>> >>>>
>> >>>> On Tue, Jun 09, 2020 at 11:29:40AM -0300, Arnaldo Carvalho de Melo wrote:
>> >>>> > On Mon, Jun 08, 2020 at 10:34:03AM -0700, Hao Luo wrote:
>> >>>> > > On SMP systems, the global percpu variables are placed in a special
>> >>>> > > '.data..percpu' section, which is stored in a segment whose initial
>> >>>> > > address is set to 0, the addresses of per-CPU variables are relative
>> >>>> > > positive addresses [1].
>> >>>> > >
>> >>>> > > This patch extracts these variables from vmlinux and places them with
>> >>>> > > their type information in BTF. More specifically, when BTF is encoded,
>> >>>> > > we find the index of the '.data..percpu' section and then traverse
>> >>>> > > the symbol table to find those global objects which are in this section.
>> >>>> > > For each of these objects, we push a BTF_KIND_VAR into the types buffer,
>> >>>> > > and a BTF_VAR_SECINFO into another buffer, percpu_secinfo. When all the
>> >>>> > > CUs have finished processing, we push a BTF_KIND_DATASEC into the
>> >>>> > > btfe->types buffer, followed by the percpu_secinfo's content.
>> >>>> > >
>> >>>> > > In a v5.7-rc7 linux kernel, I was able to extract 291 such variables.
>> >>>> > > The build time overhead is small and the space overhead is also small.
>> >>>> >
>> >>>> > Looks good, I'm doing some testing on it now, Andrii, can you provide an
>> >>>> > Acked-by or Reviewed-by?
>> >>>>
>> >>>> So, I see these (anon) variables, what are these? for a 5.7 vmlinux:
>> >>>>
>> >>>> [acme@five pahole]$ bpftool btf dump file vmlinux | grep -w VAR | tail
>> >>>> [67381] VAR '(anon)' type_id=67175, linkage=static
>> >>>> [67495] VAR 'rt_cache_stat' type_id=67417, linkage=static
>> >>>> [67496] VAR 'rt_uncached_list' type_id=67416, linkage=static
>> >>>> [67857] VAR 'tcp_md5sig_pool' type_id=67788, linkage=static
>> >>>> [67993] VAR 'tsq_tasklet' type_id=67927, linkage=static
>> >>>> [69524] VAR 'xfrm_trans_tasklet' type_id=69502, linkage=static
>> >>>> [70055] VAR 'rt6_uncached_list' type_id=67416, linkage=static
>> >>>> [70609] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [70634] VAR 'hmac_ring' type_id=1909, linkage=static
>> >>>> [71591] VAR 'xskmap_flush_list' type_id=85, linkage=static
>> >>>> [acme@five pahole]$
>> >>>>
>> >>>> [acme@five pahole]$ bpftool btf dump file vmlinux | grep -w 1713
>> >>>> [1713] FUNC 'memset' type_id=1712
>> >>>> [7235] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [7236] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [8832] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [14346] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [14347] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [14348] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [14349] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [14350] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [14351] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [14352] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [18903] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [18904] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [23180] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [44605] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [60638] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [60639] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [63869] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [70609] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [acme@five pahole]$
>> >>>>
>> >>>> [acme@five pahole]$ bpftool btf dump file vmlinux | grep -m1 -w 1713 -B9
>> >>>> [1710] FUNC_PROTO '(anon)' ret_type_id=96 vlen=3
>> >>>> 	'p' type_id=96
>> >>>> 	'q' type_id=97
>> >>>> 	'size' type_id=49
>> >>>> [1711] FUNC 'memcpy' type_id=1710
>> >>>> [1712] FUNC_PROTO '(anon)' ret_type_id=96 vlen=3
>> >>>> 	'p' type_id=96
>> >>>> 	'c' type_id=22
>> >>>> 	'size' type_id=49
>> >>>> [1713] FUNC 'memset' type_id=1712
>> >>>> [acme@five pahole]$
>> >>>>
>> >>>> [acme@five pahole]$ bpftool btf dump file vmlinux | grep -m10 -w 7235 -B5 -A5
>> >>>> 	'prec' type_id=22
>> >>>> [7232] FUNC 'arch_show_interrupts' type_id=7231
>> >>>> [7233] FUNC_PROTO '(anon)' ret_type_id=0 vlen=1
>> >>>> 	'irq' type_id=10
>> >>>> [7234] FUNC 'ack_bad_irq' type_id=7233
>> >>>> [7235] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [7236] VAR '(anon)' type_id=1713, linkage=static
>> >>>> [7237] ARRAY '(anon)' type_id=233 index_type_id=1 nr_elems=4
>> >>>> [7238] FUNC 'irq_init_percpu_irqstack' type_id=6732
>> >>>> [7239] VAR 'irq_stack_backing_store' type_id=412, linkage=global-alloc
>> >>>> [7240] STRUCT 'estack_pages' size=8 vlen=3
>> >>>> --
>> >>>> 	type_id=6739 offset=98096 size=16
>> >>>> 	type_id=6737 offset=98112 size=16
>> >>>> 	type_id=6736 offset=98128 size=16
>> >>>> 	type_id=6764 offset=98144 size=16
>> >>>> 	type_id=6763 offset=98160 size=16
>> >>>> 	type_id=7235 offset=98176 size=0
>> >>>> 	type_id=7330 offset=98192 size=4
>> >>>> 	type_id=7329 offset=98200 size=8
>> >>>> 	type_id=7328 offset=98208 size=4
>> >>>> 	type_id=7325 offset=98216 size=8
>> >>>> 	type_id=7327 offset=98224 size=1
>> >>>> [acme@five pahole]$
>> >>>>
>> >>>> [acme@five pahole]$ readelf -wi vmlinux | grep -m 2 DW_AT_producer
>> >>>>  <1c> DW_AT_producer : (indirect string, offset: 0x49): GNU AS 2.32
>> >>>>  <2e> DW_AT_producer : (indirect string, offset: 0x1358): GNU C89 9.3.1 20200408 (Red Hat 9.3.1-2) -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3 -mskip-rax-setup -mtune=generic -mno-red-zone -mcmodel=kernel -mindirect-branch=thunk-extern -mindirect-branch-register -mrecord-mcount -mfentry -march=x86-64 -g -O2 -std=gnu90 -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE -falign-jumps=1 -falign-loops=1 -fno-asynchronous-unwind-tables -fno-jump-tables -fno-delete-null-pointer-checks -fstack-protector-strong -fno-var-tracking-assignments -fno-strict-overflow -fno-merge-all-constants -fmerge-constants -fstack-check=no -fconserve-stack -fcf-protection=none --param allow-store-data-races=0
>> >>>> [acme@five pahole]$
>> >>>>
>> >>>> > Thanks,
>> >>>> >
>> >>>> > - Arnaldo
>> >>>> >
>> >>>> > > Testing:
>> >>>> > >
>> >>>> > > Before:
>> >>>> > >  $ readelf -SW vmlinux | grep BTF
>> >>>> > >  [25] .BTF PROGBITS ffffffff821a905c 13a905c 2d2bf8 00 A 0 0 1
>> >>>> > >
>> >>>> > > After:
>> >>>> > >  $ pahole -J vmlinux
>> >>>> > >  $ readelf -SW vmlinux | grep BTF
>> >>>> > >  [25] .BTF PROGBITS ffffffff821a905c 13a905c 2d5bca 00 A 0 0 1
>> >>>> > >
>> >>>> > > Common percpu vars can be found in the BTF section.
>> >>>> > >
>> >>>> > >  $ bpftool btf dump file vmlinux | grep runqueues
>> >>>> > >  [14098] VAR 'runqueues' type_id=13725, linkage=global-alloc
>> >>>> > >
>> >>>> > >  $ bpftool btf dump file vmlinux | grep 'cpu_stopper'
>> >>>> > >  [17592] STRUCT 'cpu_stopper' size=72 vlen=5
>> >>>> > >  [17612] VAR 'cpu_stopper' type_id=17592, linkage=static
>> >>>> > >
>> >>>> > >  $ bpftool btf dump file vmlinux | grep ' DATASEC '
>> >>>> > >  [63652] DATASEC '.data..percpu' size=0 vlen=294
>> >>>> > >
>> >>>> > > References:
>> >>>> > >  [1] https://lwn.net/Articles/531148/
>> >>>> > > Signed-off-by: Hao Luo <haoluo@xxxxxxxxxx>
>> >>>> > > ---
>> >>>> > > Changelog since v2:
>> >>>> > > - Move finding percpu_shndx and extracting symtab into btfe creation,
>> >>>> > >   so we don't have to allocate a new symtab for each CU.
>> >>>> > > - More debug msg by logging the vars encoded in 'verbose' mode. We
>> >>>> > >   probably don't want to log the symbols that are _not_ encoded,
>> >>>> > >   since that would be too verbose.
>> >>>> > > - Calculate var offsets using 'addr - shdr.sh_addr', so it could be
>> >>>> > >   generalized to other sections in future.
>> >>>> > > - Filter out the symbols that are not STT_OBJECT.
>> >>>> > > - Sort var_secinfos in the DATASEC by their offsets.
>> >>>> > > - Free 'percpu_secinfo' buffer and 'symtab' in btfe deletion.
>> >>>> > > - Replace the string ".data..percpu" with a constant PERCPU_SECTION.
>> >>>> > >
>> >>>> > > Changelog since v1:
>> >>>> > > - Add a ".data..percpu" DATASEC that encodes the found VARs.
>> >>>> > > - Use percpu section's shndx to find the symbols that are percpu variables.
>> >>>> > > - Use the correct type to set VAR's linkage.
>> >>>> > >
>> >>>> > >  btf_encoder.c | 119 ++++++++++++++++++++++++++++++++++++++++++++++++
>> >>>> > >  dwarves.c     |   6 +++
>> >>>> > >  dwarves.h     |   2 +
>> >>>> > >  libbtf.c      | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++
>> >>>> > >  libbtf.h      |  12 +++++
>> >>>> > >  pahole.c      |   1 +
>> >>>> > >  6 files changed, 263 insertions(+)
>> >>>> > >
>> >> [...]
>>

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
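
For readers following the thread, here is a rough sketch of the bookkeeping discussed above: keep only STT_OBJECT symbols that live in the percpu section, compute each offset as 'addr - shdr.sh_addr', sort the entries by offset, and set the DATASEC size to the last entry's offset + size instead of 0. It is written in Python purely for illustration and is not the pahole C code. The Sym class, both function names, and the sample values are hypothetical. It also assumes the entry with the highest offset is the one that ends last.

```python
from dataclasses import dataclass

STT_OBJECT = 1  # ELF symbol type for data objects (value from the ELF spec)


@dataclass
class Sym:
    """Hypothetical stand-in for one ELF symbol table entry."""
    name: str
    addr: int      # st_value
    size: int      # st_size
    shndx: int     # st_shndx, index of the section holding the symbol
    sym_type: int  # symbol type extracted from st_info


def collect_secinfo(symbols, percpu_shndx, sh_addr):
    """Return (offset, size, name) for data objects in the section, sorted by offset."""
    infos = [
        (s.addr - sh_addr, s.size, s.name)  # offset relative to the section start
        for s in symbols
        if s.shndx == percpu_shndx and s.sym_type == STT_OBJECT
    ]
    return sorted(infos, key=lambda info: info[0])


def datasec_size(infos):
    """Size for the DATASEC: last entry's offset + size, or 0 when there are no entries."""
    if not infos:
        return 0
    offset, size, _name = infos[-1]
    return offset + size
```

With made-up symbols at addresses 0x1020 (8 bytes) and 0x1000 (16 bytes) in a section whose sh_addr is 0x1000, the entries come out as offsets 0 and 32, and the DATASEC size is 40. Function symbols and symbols from other sections are dropped.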