Re: [RFC dwarves 5/6] btf_encoder: store ELF function representations sorted by name _and_ address

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 18/05/2023 09:39, Jiri Olsa wrote:
> On Wed, May 17, 2023 at 05:16:47PM +0100, Alan Maguire wrote:
>> By making sorting function for our ELF function list match on
>> both name and function, we ensure that the set of ELF functions
>> includes multiple copies for functions which have multiple instances
>> across CUs.  For example, cpumask_weight has 22 instances in
>> System.map/kallsyms:
>>
>> ffffffff8103b530 t cpumask_weight
>> ffffffff8103e300 t cpumask_weight
>> ffffffff81040d30 t cpumask_weight
>> ffffffff8104fa00 t cpumask_weight
>> ffffffff81064300 t cpumask_weight
>> ffffffff81082ba0 t cpumask_weight
>> ffffffff81084f50 t cpumask_weight
>> ffffffff810a4ad0 t cpumask_weight
>> ffffffff810bb740 t cpumask_weight
>> ffffffff8110a6c0 t cpumask_weight
>> ffffffff81118ab0 t cpumask_weight
>> ffffffff81129b50 t cpumask_weight
>> ffffffff81137dc0 t cpumask_weight
>> ffffffff811aead0 t cpumask_weight
>> ffffffff811d6800 t cpumask_weight
>> ffffffff811e1370 t cpumask_weight
>> ffffffff812fae80 t cpumask_weight
>> ffffffff81375c50 t cpumask_weight
>> ffffffff81634b60 t cpumask_weight
>> ffffffff817ba540 t cpumask_weight
>> ffffffff819abf30 t cpumask_weight
>> ffffffff81a7cb60 t cpumask_weight
>>
>> With ELF representations for each address, and DWARF info about
>> addresses (low_pc) we can match DWARF with ELF accurately.
>> The result for the BTF representation is that we end up with
>> a single de-duped function:
>>
>> [9287] FUNC 'cpumask_weight' type_id=9286 linkage=static
>>
>> ...and 22 DECL_TAGs for each address that point at it:
>>
>> 9288] DECL_TAG 'address=0xffffffff8103b530' type_id=9287 component_idx=-1
>> [9623] DECL_TAG 'address=0xffffffff8103e300' type_id=9287 component_idx=-1
>> [9829] DECL_TAG 'address=0xffffffff81040d30' type_id=9287 component_idx=-1
>> [11609] DECL_TAG 'address=0xffffffff8104fa00' type_id=9287 component_idx=-1
>> [13299] DECL_TAG 'address=0xffffffff81064300' type_id=9287 component_idx=-1
>> [15704] DECL_TAG 'address=0xffffffff81082ba0' type_id=9287 component_idx=-1
>> [15731] DECL_TAG 'address=0xffffffff81084f50' type_id=9287 component_idx=-1
>> [18582] DECL_TAG 'address=0xffffffff810a4ad0' type_id=9287 component_idx=-1
>> [20234] DECL_TAG 'address=0xffffffff810bb740' type_id=9287 component_idx=-1
>> [25384] DECL_TAG 'address=0xffffffff8110a6c0' type_id=9287 component_idx=-1
>> [25798] DECL_TAG 'address=0xffffffff81118ab0' type_id=9287 component_idx=-1
>> [26285] DECL_TAG 'address=0xffffffff81129b50' type_id=9287 component_idx=-1
>> [27040] DECL_TAG 'address=0xffffffff81137dc0' type_id=9287 component_idx=-1
>> [32900] DECL_TAG 'address=0xffffffff811aead0' type_id=9287 component_idx=-1
>> [35059] DECL_TAG 'address=0xffffffff811d6800' type_id=9287 component_idx=-1
>> [35353] DECL_TAG 'address=0xffffffff811e1370' type_id=9287 component_idx=-1
>> [48934] DECL_TAG 'address=0xffffffff812fae80' type_id=9287 component_idx=-1
>> [54476] DECL_TAG 'address=0xffffffff81375c50' type_id=9287 component_idx=-1
>> [87772] DECL_TAG 'address=0xffffffff81634b60' type_id=9287 component_idx=-1
>> [108841] DECL_TAG 'address=0xffffffff817ba540' type_id=9287 component_idx=-1
>> [132557] DECL_TAG 'address=0xffffffff819abf30' type_id=9287 component_idx=-1
>> [143689] DECL_TAG 'address=0xffffffff81a7cb60' type_id=9287 component_idx=-1
> 
> right, Yonghong pointed this out in:
>   https://lore.kernel.org/bpf/49e4fee2-8be0-325f-3372-c79d96b686e9@xxxxxxxx/
> 
> it's problem, because we pass btf id as attach id during bpf program load,
> and kernel does not have a way to figure out which address from the associated
> DECL_TAGs to use
> 
> if we could change dedup algo to take the function address into account and
> make it not de-duplicate equal functions with different addresses, then we
> could:
> 
>   - find btf id that properly and uniquely identifies the function we
>     want to trace
> 

So maybe a more natural approach would be to extend BTF_KIND_FUNC
(I think Alexei suggested something this earlier but I could be
misremembering) as follows:


2.2.12 BTF_KIND_FUNC
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
-  * ``info.kind_flag``: 0
+  * ``info.kind_flag``: 0 or 1 if additional ``struct btf_func`` follows
  * ``info.kind``: BTF_KIND_FUNC
  * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
                   or BTF_FUNC_EXTERN)
  * ``type``: a BTF_KIND_FUNC_PROTO type

- No additional type data follow ``btf_type``.
+ If ``info.kind_flag`` is specified, a ``struct btf_func`` follows.::
+
+	struct btf_func {
+		__u64 addr;
+	};
+ Otherwise no additional type data follows ``btf_type``.


With the above, dedup could be made to fail when functions have non-
identical associated addresses. Judging by the number of DECL_TAGs in
the RFC, we'd end up with ~1000 extra BTF_KIND_FUNCs, and the extra
space for struct btf_funcs would require roughly 400k. We'd still get
dedup of FUNC_PROTOs, so I suspect the extra size would be < 1MB.



>   - store the vmlinux base address and treat stored function addresses as
>     offsets, so the verifier can get proper address even if the kernel
>     is relocated
>

yep; when we read kernel/module BTF in we could hopefully carry out
this recalculation and update the vmlinux/module BTF addresses
accordingly.

Thanks!

Alan




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux