Jiri Olsa <olsajiri@xxxxxxxxx> writes: > On Wed, Dec 11, 2024 at 10:10:24PM +0100, Jiri Olsa wrote: >> On Tue, Dec 10, 2024 at 02:55:01PM +0100, Laura Nao wrote: >> > Hi Jiri, >> > >> > Thanks for the feedback! >> > >> > On 12/6/24 13:35, Jiri Olsa wrote: >> > > On Fri, Nov 15, 2024 at 06:17:12PM +0100, Laura Nao wrote: >> > >> On 11/13/24 10:37, Laura Nao wrote: >> > >>> >> > >>> Currently, KernelCI only retains the bzImage, not the vmlinux >> > >>> binary. The >> > >>> bzImage can be downloaded from the same link mentioned above by >> > >>> selecting >> > >>> 'kernel' from the dropdown menu (modules can also be downloaded the >> > >>> same >> > >>> way). I’ll try to replicate the build on my end and share the >> > >>> vmlinux >> > >>> with DWARF data stripped for convenience. >> > >>> >> > >> >> > >> I managed to reproduce the issue locally and I've uploaded the >> > >> vmlinux[1] >> > >> (stripped of DWARF data) and vmlinux.raw[2] files, as well as one of >> > >> the >> > >> modules[3] and its btf data[4] extracted with: >> > >> >> > >> bpftool -B vmlinux btf dump file cros_kbd_led_backlight.ko > >> > >> cros_kbd_led_backlight.ko.raw >> > >> >> > >> Looking again at the logs[5], I've noticed the following is reported: >> > >> >> > >> [ 0.415885] BPF: type_id=115803 offset=177920 size=1152 >> > >> [ 0.416029] BPF: >> > >> [ 0.416083] BPF: Invalid offset >> > >> [ 0.416165] BPF: >> > >> >> > >> There are two different definitions of rcu_data in '.data..percpu', >> > >> one >> > >> is a struct and the other is an integer: >> > >> >> > >> type_id=115801 offset=177920 size=1152 (VAR 'rcu_data') >> > >> type_id=115803 offset=177920 size=1152 (VAR 'rcu_data') >> > >> >> > >> [115801] VAR 'rcu_data' type_id=115572, linkage=static >> > >> [115803] VAR 'rcu_data' type_id=1, linkage=static >> > >> >> > >> [115572] STRUCT 'rcu_data' size=1152 vlen=69 >> > >> [1] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 >> > >> encoding=(none) >> > >> >> > >> I assume that's not expected, correct? >> > > >> > > yes, that seems wrong.. but I can't reproduce with your config >> > > together with pahole 1.24 .. could you try with latest one? >> > >> > I just tested next-20241210 with the latest pahole version (1.28 from >> > the master branch[1]), and the issue does not occur with this version >> > (I can see only one instance of rcu_data in the BTF data, as expected). >> > >> > I can confirm that the same kernel revision still exhibits the issue >> > with pahole 1.24. >> > >> > If helpful, I can also test versions between 1.24 and 1.28 to identify >> > which ones work. >> >> I managed to reproduce finally with gcc-12, but had to use pahole 1.25, >> 1.24 failed with unknown attribute >> >> [95096] VAR 'rcu_data' type_id=94868, linkage=static >> [95098] VAR 'rcu_data' type_id=4, linkage=static >> type_id=95096 offset=177088 size=1152 (VAR 'rcu_data') >> type_id=95098 offset=177088 size=1152 (VAR 'rcu_data') > > so for me the difference seems to be using gcc-12 and this commit in linux tree: > dabddd687c9e percpu: cast percpu pointer in PERCPU_PTR() via unsigned long > > which adds extra __pcpu_ptr variable into dwarf, and it has the same > address as the per cpu variable and that confuses pahole > > it ends up with adding per cpu variable twice.. one with real type > (type_id=94868) and the other with unsigned long type (type_id=4) > > however this got fixed in pahole 1.28 commit: > 47dcb534e253 btf_encoder: Stop indexing symbols for VARs > > which filters out __pcpu_ptr variable completely, adding Stephen to the loop Thanks for sharing this. Your analysis is spot-on, but I can fill in the details a bit. I just grabbed 6.13-rc2 and built it with gcc 11 and pahole 1.27, and observed the same issue: $ bpftool btf dump file vmlinux | grep "VAR 'rcu_data" [4045] VAR 'rcu_data' type_id=3962, linkage=static [4047] VAR 'rcu_data' type_id=1, linkage=static type_id=4045 offset=196608 size=520 (VAR 'rcu_data') type_id=4047 offset=196608 size=520 (VAR 'rcu_data') In pahole 1.27, the (simplified) process for generating variables for BTF is: 1. Look through the ELF symbol table, and find all symbols whose addresses are within the percpu section, and add them to a list. 2. Look through the DWARF: for each tag of type DW_TAG_variable, determine if the variable is "global". If so, and if the address matches one of the symbols found in Step 1, continue. 3. Except for one special case, pahole doesn't check whether the DWARF variable's name matches the symbol name. It simply emits a variable using the name of the symbol from Step 1, and the type information from Step 2. The result of this process, in this case, is: 1. kernel/rcu/tree.c contains a declaration of "rcu_data". This results in an ELF symbol in vmlinux of the same name. Great! $ eu-readelf -s vmlinux | grep '\brcu_data\b' 12319: 0000000000030000 520 OBJECT LOCAL DEFAULT 21 rcu_data 2. A DWARF entry is emitted for "rcu_data" which has a matching location (DW_AT_location has value DW_OP_addr 0x30000, matching the ELF symbol). So far so good - pahole emits a BTF variable with the expected type. $ llvm-dwarfdump --name=rcu_data ... 0x01af03f1: DW_TAG_variable DW_AT_name ("rcu_data") DW_AT_decl_file ("/home/stepbren/repos/linux-upstream/kernel/rcu/tree.c") DW_AT_decl_line (80) DW_AT_decl_column (8) DW_AT_type (0x01aefb38 "rcu_data") DW_AT_alignment (0x40) DW_AT_location (DW_OP_addr 0x30000) 3. In kernel/rcu/tree.c, we also have the following declaration at line 5227 which uses per_cpu_ptr() on &rcu_data: 5222 void rcutree_migrate_callbacks(int cpu) 5223 { 5224 unsigned long flags; 5225 struct rcu_data *my_rdp; 5226 struct rcu_node *my_rnp; 5227 struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu); ^^^^^^^^^^^ With the new changes in dabddd687c9e ("percpu: cast percpu pointer in PERCPU_PTR() via unsigned long"), this expands to a lexical block which contains a variable named "__pcpu_ptr", of type unsigned long. The compiler emits the following DW_TAG_variable in the DWARF: 0x01b05d20: DW_TAG_variable DW_AT_name ("__pcpu_ptr") DW_AT_decl_file ("/home/stepbren/repos/linux-upstream/kernel/rcu/tree.c") DW_AT_decl_line (5227) DW_AT_decl_column (25) DW_AT_type (0x01adb52e "long unsigned int") DW_AT_location (DW_OP_addr 0x30000, DW_OP_stack_value) Since the DW_AT_location has a DW_OP_addr - pahole understands this to mean that the variable is located in global memory, and thus has VSCOPE_GLOBAL. But of course, the actual "scope" of this variable is not global, it is limited to the lexical block, which is completely hidden away by the macro. But pahole 1.27 does not consider this, and since the address matches the "rcu_data" symbol, it emits a variable of type "long unsigned int" under the name "rcu_data" -- despite the fact that the DWARF info has a name of "__pcpu_ptr". The changes I made in 1.28 address this (unintentionally) by: 1. Requiring global variables be both "in the global scope" (i.e. in the CU-level, rather than any function or other lexical block. 2. Requiring global variables have global memory (some of them could be register variables, despite having global scope -- e.g. current_stack_pointer). 3. No longer using the ELF symbol table, and instead using the DWARF names for variables. With #1, we would filter this variable. And with #3, even if the variable were not filtered, we would output (a bunch of) variables with the correct __pcpu_ptr variable name, which is unhelpful but at least helps us understand where these things come from. Rebuilding with GCC 14, we can see that the "__pcpu_ptr" variable no longer has a DW_AT_location: 0x01afa82f: DW_TAG_variable DW_AT_name ("__pcpu_ptr") DW_AT_decl_file ("/home/stepbren/repos/linux-upstream/kernel/rcu/tree.c") DW_AT_decl_line (5227) DW_AT_decl_column (25) DW_AT_type (0x01ad0267 "long unsigned int") This is the reason that pahole 1.27 now recognizes it as VSCOPE_OPTIMIZED. Without a memory location pahole can't do anything to match it against the "rcu_data" variable so nothing is emitted, and we don't get the issue. I'm not sure if this adds at all to the discussion, since the overall answer is the same, an upgrade of pahole and/or gcc. (Pahole would be recommended; GCC just changed the generated DWARF and I could imagine other situations popping up elsewhere). Thanks, Stephen > with gcc-14 the __pcpu_ptr variable has VSCOPE_OPTIMIZED scope, so it won't > get into btf even without above pahole fix > > I suggest gcc/pahole upgrade ;-) > > thanks, > jirka