Re: [PATCH v2 dwarves 5/5] btf_encoder: delay function addition to check for function prototype inconsistencies

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 30, 2023 at 02:29:45PM +0000, Alan Maguire wrote:
> There are multiple sources of inconsistency that can result in
> functions of the same name having multiple prototypes:
> 
> - multiple static functions in different CUs share the same name
> - static and external functions share the same name
> 
> Here we attempt to catch such cases by finding inconsistencies
> across CUs using the save/compare/merge mechanisms that were
> previously introduced to handle optimized-out parameters,
> using it for all functions.
> 
> For two instances of a function to be considered consistent:
> 
> - number of parameters must match
> - parameter names must match
> 
> The latter is a less strong method than a full type
> comparison but suffices to match functions.
> 
> With these changes, we see 278 functions removed due to
> protoype inconsistency.  For example, wakeup_show()
> has two distinct prototypes:
> 
> static ssize_t wakeup_show(struct kobject *kobj,
>                            struct kobj_attribute *attr, char *buf)
> (from kernel/irq/irqdesc.c)
> 
> static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
>                            char *buf)
> (from drivers/base/power/sysfs.c)
> 
> In some other cases, the parameter comparisons weed out additional
> inconsistencies in "."-suffixed functions across CUs.
> 
> We also see a large number of functions eliminated due to
> optimized-out parameters; 2542 functions are eliminated for this
> reason, both "."-suffixed (1007) and otherwise (1535).

imo it's a good thing.

> Because the save/compare/merge process occurs for all functions
> it is important to assess performance effects.  In addition,
> prior to these changes the number of functions ultimately
> represented in BTF was non-deterministic when pahole was
> run with multiple threads.  This was due to the fact that
> functions were marked as generated on a per-encoder basis
> when first added, and as such the same function could
> be added multiple times for different encoders, and if they
> encountered inconsistent function prototypes, deduplication
> could leave multiple entries in place for the same name.
> When run in a single thread, the "generated" state associated
> with the name would prevent this.
> 
> Here we assess both BTF encoding performance and determinism
> of the function representation in baseline compared to with
> these changes.  Determinism is assessed by counting the
> number of functions in BTF.  Comparisons are done for 1,
> 4 and 8 threads.
> 
> Baseline
> 
> $ time LLVM_OBJCOPY=objcopy pahole -J vmlinux
> 
> real	0m18.160s
> user	0m17.179s
> sys	0m0.757s
> 
> $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l
> 51150
> $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l
> 51150
> 
> $ time LLVM_OBJCOPY=objcopy pahole -J -j4 vmlinux
> 
> real	0m8.078s
> user	0m17.978s
> sys	0m0.732s
> 
> $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l
> 51592
> $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l
> 51150
> 
> $ time LLVM_OBJCOPY=objcopy pahole -J -j8 vmlinux
> 
> real	0m7.075s
> user	0m19.010s
> sys	0m0.587s
> 
> $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l
> 51683
> $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l
> 51150

Ouch. I didn't realize it is so random currently.

> Test:
> 
> $ time LLVM_OBJCOPY=objcopy pahole -J  vmlinux
> 
> real	0m19.039s
> user	0m17.617s
> sys	0m1.419s
> $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l
> 49871
> $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l
> 49871
> 
> $ time LLVM_OBJCOPY=objcopy pahole -J -j4 vmlinux
> 
> real	0m8.482s
> user	0m18.233s
> sys	0m2.412s
> $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l
> 49871
> $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l
> 49871
> 
> $ time LLVM_OBJCOPY=objcopy pahole -J -j8 vmlinux
> 
> real	0m7.614s
> user	0m19.384s
> sys	0m3.739s
> $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l
> 49871
> $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l
> 
> So there is a small cost in performance, but we improve determinism
> and the consistency of representation.

This is a great fix.

I'm not an expert in this code base, but patches look good to me.
Thank you for fixing it.

> For now it is better to have an incomplete representation
> that more accurately reflects the actual function parameters
> used, removing inconsistencies that could otherwise do harm.

+1



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux