Re: [REGRESSION] module BTF validation failure (Error -22) on

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jiri Olsa <olsajiri@xxxxxxxxx> writes:
> On Wed, Dec 11, 2024 at 10:10:24PM +0100, Jiri Olsa wrote:
>> On Tue, Dec 10, 2024 at 02:55:01PM +0100, Laura Nao wrote:
>> > Hi Jiri,
>> > 
>> > Thanks for the feedback!
>> > 
>> > On 12/6/24 13:35, Jiri Olsa wrote:
>> > > On Fri, Nov 15, 2024 at 06:17:12PM +0100, Laura Nao wrote:
>> > >> On 11/13/24 10:37, Laura Nao wrote:
>> > >>>
>> > >>> Currently, KernelCI only retains the bzImage, not the vmlinux
>> > >>> binary. The
>> > >>> bzImage can be downloaded from the same link mentioned above by
>> > >>> selecting
>> > >>> 'kernel' from the dropdown menu (modules can also be downloaded the
>> > >>> same
>> > >>> way). I’ll try to replicate the build on my end and share the
>> > >>> vmlinux
>> > >>> with DWARF data stripped for convenience.
>> > >>>
>> > >>
>> > >> I managed to reproduce the issue locally and I've uploaded the
>> > >> vmlinux[1]
>> > >> (stripped of DWARF data) and vmlinux.raw[2] files, as well as one of
>> > >> the
>> > >> modules[3] and its btf data[4] extracted with:
>> > >>
>> > >> bpftool -B vmlinux btf dump file cros_kbd_led_backlight.ko >
>> > >> cros_kbd_led_backlight.ko.raw
>> > >>
>> > >> Looking again at the logs[5], I've noticed the following is reported:
>> > >>
>> > >> [    0.415885] BPF: 	 type_id=115803 offset=177920 size=1152
>> > >> [    0.416029] BPF:
>> > >> [    0.416083] BPF: Invalid offset
>> > >> [    0.416165] BPF:
>> > >>
>> > >> There are two different definitions of rcu_data in '.data..percpu',
>> > >> one
>> > >> is a struct and the other is an integer:
>> > >>
>> > >> type_id=115801 offset=177920 size=1152 (VAR 'rcu_data')
>> > >> type_id=115803 offset=177920 size=1152 (VAR 'rcu_data')
>> > >>
>> > >> [115801] VAR 'rcu_data' type_id=115572, linkage=static
>> > >> [115803] VAR 'rcu_data' type_id=1, linkage=static
>> > >>
>> > >> [115572] STRUCT 'rcu_data' size=1152 vlen=69
>> > >> [1] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64
>> > >> encoding=(none)
>> > >>
>> > >> I assume that's not expected, correct?
>> > > 
>> > > yes, that seems wrong.. but I can't reproduce with your config
>> > > together with pahole 1.24 .. could you try with latest one?
>> > 
>> > I just tested next-20241210 with the latest pahole version (1.28 from
>> > the master branch[1]), and the issue does not occur with this version
>> > (I can see only one instance of rcu_data in the BTF data, as expected).
>> > 
>> > I can confirm that the same kernel revision still exhibits the issue
>> > with pahole 1.24.
>> > 
>> > If helpful, I can also test versions between 1.24 and 1.28 to identify
>> > which ones work.
>> 
>> I managed to reproduce finally with gcc-12, but had to use pahole 1.25,
>> 1.24 failed with unknown attribute
>> 
>> 	[95096] VAR 'rcu_data' type_id=94868, linkage=static
>> 	[95098] VAR 'rcu_data' type_id=4, linkage=static
>> 	type_id=95096 offset=177088 size=1152 (VAR 'rcu_data')
>> 	type_id=95098 offset=177088 size=1152 (VAR 'rcu_data')
>
> so for me the difference seems to be using gcc-12 and this commit in linux tree:
>   dabddd687c9e percpu: cast percpu pointer in PERCPU_PTR() via unsigned long
>
> which adds extra __pcpu_ptr variable into dwarf, and it has the same
> address as the per cpu variable and that confuses pahole
>
> it ends up with adding per cpu variable twice.. one with real type
> (type_id=94868) and the other with unsigned long type (type_id=4)
>
> however this got fixed in pahole 1.28 commit:
>   47dcb534e253 btf_encoder: Stop indexing symbols for VARs
>
> which filters out __pcpu_ptr variable completely, adding Stephen to the loop

Thanks for sharing this. Your analysis is spot-on, but I can fill in the
details a bit. I just grabbed 6.13-rc2 and built it with gcc 11 and
pahole 1.27, and observed the same issue:

  $ bpftool btf dump file vmlinux | grep "VAR 'rcu_data"
  [4045] VAR 'rcu_data' type_id=3962, linkage=static
  [4047] VAR 'rcu_data' type_id=1, linkage=static
          type_id=4045 offset=196608 size=520 (VAR 'rcu_data')
          type_id=4047 offset=196608 size=520 (VAR 'rcu_data')

In pahole 1.27, the (simplified) process for generating variables for
BTF is:

1. Look through the ELF symbol table, and find all symbols whose
addresses are within the percpu section, and add them to a list.

2. Look through the DWARF: for each tag of type DW_TAG_variable,
determine if the variable is "global". If so, and if the address matches
one of the symbols found in Step 1, continue.

3. Except for one special case, pahole doesn't check whether the DWARF
variable's name matches the symbol name. It simply emits a variable
using the name of the symbol from Step 1, and the type information from
Step 2.

The result of this process, in this case, is:

1. kernel/rcu/tree.c contains a declaration of "rcu_data". This results
in an ELF symbol in vmlinux of the same name. Great!

  $ eu-readelf -s vmlinux | grep '\brcu_data\b'
  12319: 0000000000030000    520 OBJECT  LOCAL  DEFAULT       21 rcu_data


2. A DWARF entry is emitted for "rcu_data" which has a matching location
(DW_AT_location has value DW_OP_addr 0x30000, matching the ELF symbol).
So far so good - pahole emits a BTF variable with the expected type.

  $ llvm-dwarfdump --name=rcu_data
  ...
  0x01af03f1: DW_TAG_variable
                DW_AT_name        ("rcu_data")
                DW_AT_decl_file   ("/home/stepbren/repos/linux-upstream/kernel/rcu/tree.c")
                DW_AT_decl_line   (80)
                DW_AT_decl_column (8)
                DW_AT_type        (0x01aefb38 "rcu_data")
                DW_AT_alignment   (0x40)
                DW_AT_location    (DW_OP_addr 0x30000)

3. In kernel/rcu/tree.c, we also have the following declaration at line
5227 which uses per_cpu_ptr() on &rcu_data:

5222 void rcutree_migrate_callbacks(int cpu)
5223 {
5224 	unsigned long flags;
5225 	struct rcu_data *my_rdp;
5226 	struct rcu_node *my_rnp;
5227 	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
                               ^^^^^^^^^^^

With the new changes in dabddd687c9e ("percpu: cast percpu pointer in
PERCPU_PTR() via unsigned long"), this expands to a lexical block which
contains a variable named "__pcpu_ptr", of type unsigned long. The
compiler emits the following DW_TAG_variable in the DWARF:

0x01b05d20:         DW_TAG_variable
                      DW_AT_name        ("__pcpu_ptr")
                      DW_AT_decl_file   ("/home/stepbren/repos/linux-upstream/kernel/rcu/tree.c")
                      DW_AT_decl_line   (5227)
                      DW_AT_decl_column (25)
                      DW_AT_type        (0x01adb52e "long unsigned int")
                      DW_AT_location    (DW_OP_addr 0x30000, DW_OP_stack_value)

Since the DW_AT_location has a DW_OP_addr - pahole understands this to
mean that the variable is located in global memory, and thus has
VSCOPE_GLOBAL. But of course, the actual "scope" of this variable is not
global, it is limited to the lexical block, which is completely hidden
away by the macro. But pahole 1.27 does not consider this, and since the
address matches the "rcu_data" symbol, it emits a variable of type "long
unsigned int" under the name "rcu_data" -- despite the fact that the
DWARF info has a name of "__pcpu_ptr".

The changes I made in 1.28 address this (unintentionally) by:

1. Requiring global variables be both "in the global scope" (i.e. in the
CU-level, rather than any function or other lexical block.
2. Requiring global variables have global memory (some of them could be
register variables, despite having global scope -- e.g.
current_stack_pointer).
3. No longer using the ELF symbol table, and instead using the DWARF
names for variables.

With #1, we would filter this variable. And with #3, even if the
variable were not filtered, we would output (a bunch of) variables with
the correct __pcpu_ptr variable name, which is unhelpful but at least
helps us understand where these things come from.

Rebuilding with GCC 14, we can see that the "__pcpu_ptr" variable no
longer has a DW_AT_location:

0x01afa82f:         DW_TAG_variable
                      DW_AT_name        ("__pcpu_ptr")
                      DW_AT_decl_file   ("/home/stepbren/repos/linux-upstream/kernel/rcu/tree.c")
                      DW_AT_decl_line   (5227)
                      DW_AT_decl_column (25)
                      DW_AT_type        (0x01ad0267 "long unsigned int")

This is the reason that pahole 1.27 now recognizes it as
VSCOPE_OPTIMIZED. Without a memory location pahole can't do anything to
match it against the "rcu_data" variable so nothing is emitted, and we
don't get the issue.

I'm not sure if this adds at all to the discussion, since the overall
answer is the same, an upgrade of pahole and/or gcc. (Pahole would be
recommended; GCC just changed the generated DWARF and I could imagine
other situations popping up elsewhere).


Thanks,
Stephen

> with gcc-14 the __pcpu_ptr variable has VSCOPE_OPTIMIZED scope, so it won't
> get into btf even without above pahole fix
>
> I suggest gcc/pahole upgrade ;-)
>
> thanks,
> jirka





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux