On Tue, Sep 27, 2022 at 8:35 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote: > > BTF deduplication is not deduplicating some structures, leading to > redundant definitions in module BTF that already have identical > definitions in vmlinux BTF. > > When examining module BTF, we can see that "struct sk_buff" is redefined > in module BTF. For example in module nf_reject_ipv4 we see: > > [114280] STRUCT 'sk_buff' size=232 vlen=28 > '(anon)' type_id=114463 bits_offset=0 > '(anon)' type_id=114464 bits_offset=192 > ... > > The rest of the fields point back at base vmlinux type ids. > > The first anon field in an sk_buff is: > > union { > struct { > struct sk_buff * next; /* 0 8 */ > struct sk_buff * prev; /* 8 8 */ > union { > struct net_device * dev; /* 16 8 */ > long unsigned int dev_scratch; /* 16 8 */ > }; /* 16 8 */ > }; > > ..and examining its BTF representation, we see > > [114463] UNION '(anon)' size=24 vlen=4 > '(anon)' type_id=114462 bits_offset=0 > 'rbnode' type_id=517 bits_offset=0 > 'list' type_id=83 bits_offset=0 > 'll_node' type_id=443 bits_offset=0 > > ...which leads us to > > [114462] STRUCT '(anon)' size=24 vlen=3 > 'next' type_id=114279 bits_offset=0 > 'prev' type_id=114279 bits_offset=64 > '(anon)' type_id=114461 bits_offset=128 > > ...finally getting back to the sk_buff: > > [114279] PTR '(anon)' type_id=114280 > > So perhaps self-referential structures are a problem for > deduplication? > > The second union with a non-base BTF id: > > union { > struct sock * sk; /* 24 8 */ > int ip_defrag_offset; /* 24 4 */ > }; > > ...points at > > [114464] UNION '(anon)' size=8 vlen=2 > 'sk' type_id=113826 bits_offset=0 > ... > > [113826] PTR '(anon)' type_id=113827 > > [113827] STRUCT 'sock' size=776 vlen=93 > ... > 'sk_error_queue' type_id=114458 bits_offset=1536 > 'sk_receive_queue' type_id=114458 bits_offset=1728 > ... > 'sk_write_queue' type_id=114458 bits_offset=2880 > ... > 'sk_socket' type_id=114295 bits_offset=4992 > ... > 'sk_memcg' type_id=113787 bits_offset=5312 > 'sk_state_change' type_id=114758 bits_offset=5376 > 'sk_data_ready' type_id=114758 bits_offset=5440 > 'sk_write_space' type_id=114758 bits_offset=5504 > 'sk_error_report' type_id=114758 bits_offset=5568 > 'sk_backlog_rcv' type_id=114292 bits_offset=5632 > 'sk_validate_xmit_skb' type_id=114760 bits_offset=5696 > 'sk_destruct' type_id=114758 bits_offset=5760 > > Again, sk_error_queue refers to a 'struct sk_buff_head': > > [114458] STRUCT 'sk_buff_head' size=24 vlen=3 > '(anon)' type_id=114457 bits_offset=0 > 'qlen' type_id=23 bits_offset=128 > 'lock' type_id=514 bits_offset=160 > > ...which, because it contains a struct sk_buff * reference > uses the not-deduped sk_buff above. > > [114455] STRUCT '(anon)' size=16 vlen=2 > 'next' type_id=114279 bits_offset=0 > 'prev' type_id=114279 bits_offset=64 > > Ditto for sk_receive_queue, sk_write_queue, etc. > > sk_memcg refers to a non-deduped struct mem_cgroup where > only one field is not in base BTF: > > [113786] STRUCT 'mem_cgroup' size=4288 vlen=46 > ... > 'move_lock_task' type_id=113694 bits_offset=31296 > ... > > and this is a pointer to task_struct: > > [113694] PTR '(anon)' type_id=113695 > > [113695] STRUCT 'task_struct' size=9792 vlen=253 > ... > 'last_wakee' type_id=113694 bits_offset=704 > ... > > ...so we see that the self-referential members cause problems here > too. > > Looking at the code, btf_dedup_is_equiv() will check equivalence > for all member types for BTF_KIND_[STRUCT|UNION]. How will such > an equivalence check function for a pointer back to the same > structure? > > With a struct, btf_dedup_struct_type() is called, and for each > candidate (hashed by name offset, member details but not type > ids), we clear the hypot_map (mapping hyothetical type > equivalences) and add a hypot_map entry mapping from the > canon_id -> cand_id in btf_dedup_is_equiv() once it looks > like a rough match. > > when we delve into its members we recurse into reference types > so should ultimately use that mapping to notice self-referential > struct equivalence. > > However looking closely, btf_dedup_is_equiv() is being called from > btf_dedup_struct_type() with arguments in the wrong order: > > eq = btf_dedup_is_equiv(d, type_id, cand_id); > > The candidate id should I think precede the type_id, as we see in > function signature: > > static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id, > __u32 canon_id) > > ...and with this change the duplication disappears in the modules. > > Fixes: d5caef5b56555bfa2ac0 ("btf: add BTF types deduplication algorithm") > Signed-off-by: Alan Maguire <alan.maguire@xxxxxxxxxx> > --- > tools/lib/bpf/btf.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c > index b4d9a96..a4ee15c 100644 > --- a/tools/lib/bpf/btf.c > +++ b/tools/lib/bpf/btf.c > @@ -4329,7 +4329,7 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id) > continue; > > btf_dedup_clear_hypot_map(d); > - eq = btf_dedup_is_equiv(d, type_id, cand_id); > + eq = btf_dedup_is_equiv(d, cand_id, type_id); Unfortunately this is not the right fix (and CI points this out, e.g., at [0]; so yay tests). You got confused by candidate terminology. In btf_dedup_struct_type we iterate over candidate types that could be canonical types. So what is cand_id is meant to be canonical for type identified by type_id. And type_id is pointing to a candidate type as far as an equivalence check goes (that is btf_dedup_is_equiv()). It's somewhat confusing, but really type_id is a candidate we are trying to dedup and cand_id is a *potential* canonical type (there could be multiple potential canonical types due to hash collisions). So there might be a bug with dedup, but it's somewhere else. [0] https://github.com/kernel-patches/bpf/actions/runs/3137048529/jobs/5095008504 > if (eq < 0) > return eq; > if (!eq) > -- > 1.8.3.1 >