On 06/02/2024 14:23, Arnaldo Carvalho de Melo wrote: > On Tue, Feb 06, 2024 at 10:42:44AM -0300, Arnaldo Carvalho de Melo wrote: >> On Tue, Feb 06, 2024 at 01:02:26PM +0100, Sebastian Andrzej Siewior wrote: >>> with linux kernel v6.8-rc2-RT and the following file as testcase >>> >>> ----->8-------- >>> >>> #include <linux/spinlock.h> >>> #include <linux/local_lock.h> >>> >>> struct per_cpu_struct { >>> local_lock_t lock; >>> unsigned int something; >>> }; >>> >>> static DEFINE_PER_CPU(struct per_cpu_struct, per_cpu_struct) = { >>> .lock = INIT_LOCAL_LOCK(lock), >>> }; >>> >>> DEFINE_GUARD(ll_lock, local_lock_t __percpu*, >>> local_lock(_T), >>> local_unlock(_T)) >>> >>> void function(void); >>> void function(void) >>> { >>> struct per_cpu_struct *pcs = this_cpu_ptr(&per_cpu_struct); >>> >>> guard(ll_lock)(&per_cpu_struct.lock); >>> pcs->something++; >>> } >>> >>> -----8<-------- >>> >>> compiling and running pahole afterwards: >>> | make kernel/pahole-tc.o && pahole -J --btf_gen_floats -j --lang_exclude=rust kernel/pahole-tc.o >>> | CC kernel/pahole-tc.o >>> | error: found variable 'per_cpu_struct' in CU 'kernel/pahole-tc.c' that has void type >>> >>> This doesn't look good. If I swap the order of "lock" and "something" >>> within per_cpu_struct then it goes away. >>> The dwarf/die object it complains about has only DW_AT_abstract_origin >>> and DW_AT_location set. > Thanks for reporting and providing all this data! The eventual generated BTF might give us some clues, running $ pahole -J pahole-tc-good-swap.o $ pahole -J pahole-tc-bad.o ...and dumping raw BTF we see that in the bad case, we've got 3 variables of different types called "per_cpu_ptr": $ bpftool btf dump file pahole-tc-bad.o |grep per_cpu_ptr [88] STRUCT 'per_cpu_struct' size=136 vlen=2 [94] VAR 'per_cpu_struct' type_id=88, linkage=static [95] VAR 'per_cpu_struct' type_id=89, linkage=static [96] VAR 'per_cpu_struct' type_id=87, linkage=static type_id=94 offset=0 size=136 (VAR 'per_cpu_pe struct') type_id=95 offset=0 size=136 (VAR 'per_cpu_struct') type_id=96 offset=0 size=136 (VAR 'per_cpu_struct') The first variable [94] is of type "struct per_cpu_struct", the second is [89] TYPEDEF 'class_ll_lock_t' type_id=87 ...and the third is type id 87: [86] TYPEDEF 'local_lock_t' type_id=73 [87] PTR '(anon)' type_id=86 which is a pointer to a 'typedef local_local_t'. Contrast this to the good case, where only 1 is seen: $ bpftool btf dump file pahole-tc-good-swap.o format raw |grep per_cpu [88] STRUCT 'per_cpu_struct' size=136 vlen=2 [94] VAR 'per_cpu_struct' type_id=88, linkage=static type_id=94 offset=0 size=136 (VAR 'per_cpu_struct') So I think as you suspected we oddly seem to be conflating the per_cpu_struct variable with its first member somehow, and worse doing it twice; once for the typedef class_ll_lock_t and again for the ptr to local_lock_t. The DWARF looks right to me: 0x00000764: DW_TAG_variable DW_AT_name ("per_cpu_struct") DW_AT_decl_file ("/home/bigeasy/linux-stable-intree/kernel/pahole-tc.c") DW_AT_decl_line (9) DW_AT_decl_column (0x08) DW_AT_type (0x0000073c "per_cpu_struct") DW_AT_location (DW_OP_addr 0x0) Putting additional debugging in the BTF encoder: diff --git a/btf_encoder.c b/btf_encoder.c index fd04008..73ee6f7 100644 --- a/btf_encoder.c +++ b/btf_encoder.c @@ -1520,6 +1520,9 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) uint64_t addr; int id; + if (encoder->verbose) + printf("processing var '%s' at pos %p, tag 0x%x\n", variable__name(var), pos, pos->tag); + if (var->declaration && !var->spec) continue; ...we see search cu 'kernel/pahole-tc.c' for percpu global variables. processing var 'current_stack_pointer' at pos 0xa14be0, tag 0x34 processing var 'this_cpu_off' at pos 0xa14cb0, tag 0x34 processing var 'per_cpu_struct' at pos 0xa201f0, tag 0x34 Variable 'per_cpu_struct' from CU 'kernel/pahole-tc.c' at address 0x0 encoded [96] VAR per_cpu_struct type=90 linkage=0 processing var 'pcs' at pos 0xa20c20, tag 0x34 processing var '__UNIQUE_ID_guard44' at pos 0xa20cf0, tag 0x34 Variable 'per_cpu_struct' from CU 'kernel/pahole-tc.c' at address 0x0 encoded [97] VAR per_cpu_struct type=91 linkage=0 processing var '__vpp_verify' at pos 0xa20e60, tag 0x34 processing var 'tcp_ptr__' at pos 0xa20fd0, tag 0x34 processing var '(null)' at pos 0xa21130, tag 0x34 error: found variable 'per_cpu_struct' in CU 'kernel/pahole-tc.c' that has void type Looking at the above, we see it's the __UNIQUE_ID prefixed variables that cause the trouble; for each of these we wind up encoding an instance of the per_cpu_struct variable. We have a special case meant to handle this, but it is triggering for the __UNIQUE_ID variables: /* A lot of "special" DWARF variables (e.g, __UNIQUE_ID___xxx) * have addr == 0, which is the same as, say, valid * fixed_percpu_data per-CPU variable. To distinguish between * them, additionally compare DWARF and ELF symbol names. If * DWARF doesn't provide proper name, pessimistically assume * bad variable. * * Examples of such special variables are: * * 1. __ADDRESSABLE(sym), which are forcely emitted as symbols. * 2. __UNIQUE_ID(prefix), which are introduced to generate unique ids. * 3. __exitcall(fn), functions which are labeled as exit calls. * * This is relevant only for vmlinux image, as for kernel * modules per-CPU data section has non-zero offset so all * per-CPU symbols have non-zero values. */ if (var->ip.addr == 0) { if (!dwarf_name || strcmp(dwarf_name, name)) continue; } So I suspect the problem is if (!btf_encoder__percpu_var_exists(encoder, addr, &size, &name) ...which uses the associated address to look up the name, and then goes on to use that name to encode. The problem is the conditional check above; I can't see why it needs to be guarded here. We never want to do an address-based lookup for a variable and not have it be the same variable name as we have from DWARF - which is the one we're trying to encode, right? The following fixes the issue for me, can you try this out if you get a chance? I might be missing something about that check so other folks please do weigh in if something looks broken. Thanks! diff --git a/btf_encoder.c b/btf_encoder.c index fd04008..aecb311 100644 --- a/btf_encoder.c +++ b/btf_encoder.c @@ -1559,10 +1559,8 @@ static int btf_encoder__encode_cu_variables(struct btf_encoder *encoder) * modules per-CPU data section has non-zero offset so all * per-CPU symbols have non-zero values. */ - if (var->ip.addr == 0) { - if (!dwarf_name || strcmp(dwarf_name, name)) - continue; - } + if (!dwarf_name || strcmp(dwarf_name, name)) + continue; if (var->spec) var = var->spec; > Are you sure? > > What I see: > > ⬢[acme@toolbox pahole]$ cp sebastian-rt-void-variable-btf/pahole-tc/pahole-tc-bad.o . > ⬢[acme@toolbox pahole]$ pahole --btf_encode pahole-tc-bad.o > error: found variable 'per_cpu_struct' in CU 'kernel/pahole-tc.c' that has void type > ⬢[acme@toolbox pahole]$ > > ⬢[acme@toolbox pahole]$ readelf -wi sebastian-rt-void-variable-btf/pahole-tc/pahole-tc-bad.o | grep DW_TAG_variable -A5 | grep -w per_cpu_struct -B1 -A4 > <1><764>: Abbrev Number: 29 (DW_TAG_variable) > <765> DW_AT_name : (indirect string, offset: 0x184): per_cpu_struct > <769> DW_AT_decl_file : 1 > <769> DW_AT_decl_line : 9 > <76a> DW_AT_decl_column : 8 > <76b> DW_AT_type : <0x73c> > ⬢[acme@toolbox pahole]$ > > ⬢[acme@toolbox pahole]$ readelf -wi sebastian-rt-void-variable-btf/pahole-tc/pahole-tc-bad.o | grep '<73c>.*DW_TAG_structure_type' -A7 > <1><73c>: Abbrev Number: 4 (DW_TAG_structure_type) > <73d> DW_AT_name : (indirect string, offset: 0x184): per_cpu_struct > <741> DW_AT_byte_size : 136 > <742> DW_AT_decl_file : 1 > <743> DW_AT_decl_line : 4 > <744> DW_AT_decl_column : 8 > <745> DW_AT_sibling : <0x764> > <2><749>: Abbrev Number: 1 (DW_TAG_member) > ⬢[acme@toolbox pahole]$ > > But when using verbose mode: > > [95] FUNC function type_id=94 > search cu 'kernel/pahole-tc.c' for percpu global variables. > Variable 'per_cpu_struct' from CU 'kernel/pahole-tc.c' at address 0x0 encoded > [96] VAR per_cpu_struct type=90 linkage=0 > Variable 'per_cpu_struct' from CU 'kernel/pahole-tc.c' at address 0x0 encoded > [97] VAR per_cpu_struct type=91 linkage=0 > error: found variable 'per_cpu_struct' in CU 'kernel/pahole-tc.c' that has void type > [98] DATASEC .data..percpu size=136 vlen=2 > type=96 offset=0 size=136 > type=97 offset=0 size=136 > ⬢[acme@toolbox pahole]$ pahole -V --btf_encode pahole-tc-bad.o > > It ends up considering two variables with that name? > > ⬢[acme@toolbox pahole]$ readelf -wi sebastian-rt-void-variable-btf/pahole-tc/pahole-tc-bad.o | grep per_cpu_struct -B1 -A5 > <1><73c>: Abbrev Number: 4 (DW_TAG_structure_type) > <73d> DW_AT_name : (indirect string, offset: 0x184): per_cpu_struct > <741> DW_AT_byte_size : 136 > <742> DW_AT_decl_file : 1 > <743> DW_AT_decl_line : 4 > <744> DW_AT_decl_column : 8 > <745> DW_AT_sibling : <0x764> > -- > <1><764>: Abbrev Number: 29 (DW_TAG_variable) > <765> DW_AT_name : (indirect string, offset: 0x184): per_cpu_struct > <769> DW_AT_decl_file : 1 > <769> DW_AT_decl_line : 9 > <76a> DW_AT_decl_column : 8 > <76b> DW_AT_type : <0x73c> > <76f> DW_AT_location : 9 byte block: 3 0 0 0 0 0 0 0 0 (DW_OP_addr: 0) > ⬢[acme@toolbox pahole]$ > > Just dumping some more info in case Jiri or Alan thinks about something. > > - Arnaldo > >>> At this point I am not sure if gcc wrongly created the dwarf information >>> or of pahole accidentally matches the internal local_lock_t member as >>> per_cpu_struct because it starts the same address. >>> >>> >From dumpwarf the problematic entry is: >>> | 0x0000085b: DW_TAG_variable >>> | DW_AT_abstract_origin (0x0000099e "t") >>> | DW_AT_location (DW_OP_addr 0x0, DW_OP_stack_value) >>> >>> which looks slightly different in the good case: >>> | 0x00000852: DW_TAG_variable >>> | DW_AT_abstract_origin (0x00000980 "t") >>> | DW_AT_location (0x00000040: >>> | [0x0000000000000046, 0x000000000000005b): DW_OP_reg3 RBX >>> | [0x000000000000005b, 0x0000000000000061): DW_OP_addr 0x0, DW_OP_stack_value) >>> | DW_AT_GNU_locviews (0x0000003c) >>> >>> I made an archive with c file, the compiled version in bad and good case >>> (swapped the first two members), the dumpwarf output and uploaded to >>> https://breakpoint.cc/pahole-tc.tar.xz >> >> Lots of great info, I'm on it. >> >> - Arnaldo