Em Mon, Jan 30, 2023 at 09:20:37AM -0800, Alexei Starovoitov escreveu: > On Mon, Jan 30, 2023 at 02:29:45PM +0000, Alan Maguire wrote: > > There are multiple sources of inconsistency that can result in > > functions of the same name having multiple prototypes: > > > > - multiple static functions in different CUs share the same name > > - static and external functions share the same name > > > > Here we attempt to catch such cases by finding inconsistencies > > across CUs using the save/compare/merge mechanisms that were > > previously introduced to handle optimized-out parameters, > > using it for all functions. > > > > For two instances of a function to be considered consistent: > > > > - number of parameters must match > > - parameter names must match > > > > The latter is a less strong method than a full type > > comparison but suffices to match functions. > > > > With these changes, we see 278 functions removed due to > > protoype inconsistency. For example, wakeup_show() > > has two distinct prototypes: > > > > static ssize_t wakeup_show(struct kobject *kobj, > > struct kobj_attribute *attr, char *buf) > > (from kernel/irq/irqdesc.c) > > > > static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr, > > char *buf) > > (from drivers/base/power/sysfs.c) > > > > In some other cases, the parameter comparisons weed out additional > > inconsistencies in "."-suffixed functions across CUs. > > > > We also see a large number of functions eliminated due to > > optimized-out parameters; 2542 functions are eliminated for this > > reason, both "."-suffixed (1007) and otherwise (1535). > > imo it's a good thing. > > > Because the save/compare/merge process occurs for all functions > > it is important to assess performance effects. In addition, > > prior to these changes the number of functions ultimately > > represented in BTF was non-deterministic when pahole was > > run with multiple threads. This was due to the fact that > > functions were marked as generated on a per-encoder basis > > when first added, and as such the same function could > > be added multiple times for different encoders, and if they > > encountered inconsistent function prototypes, deduplication > > could leave multiple entries in place for the same name. > > When run in a single thread, the "generated" state associated > > with the name would prevent this. > > > > Here we assess both BTF encoding performance and determinism > > of the function representation in baseline compared to with > > these changes. Determinism is assessed by counting the > > number of functions in BTF. Comparisons are done for 1, > > 4 and 8 threads. > > > > Baseline > > > > $ time LLVM_OBJCOPY=objcopy pahole -J vmlinux > > > > real 0m18.160s > > user 0m17.179s > > sys 0m0.757s > > > > $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l > > 51150 > > $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l > > 51150 > > > > $ time LLVM_OBJCOPY=objcopy pahole -J -j4 vmlinux > > > > real 0m8.078s > > user 0m17.978s > > sys 0m0.732s > > > > $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l > > 51592 > > $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l > > 51150 > > > > $ time LLVM_OBJCOPY=objcopy pahole -J -j8 vmlinux > > > > real 0m7.075s > > user 0m19.010s > > sys 0m0.587s > > > > $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|wc -l > > 51683 > > $ grep " FUNC " /tmp/vmlinux.btf.base |awk '{print $3}'|sort|uniq|wc -l > > 51150 > > Ouch. I didn't realize it is so random currently. > > > Test: > > > > $ time LLVM_OBJCOPY=objcopy pahole -J vmlinux > > > > real 0m19.039s > > user 0m17.617s > > sys 0m1.419s > > $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l > > 49871 > > $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l > > 49871 > > > > $ time LLVM_OBJCOPY=objcopy pahole -J -j4 vmlinux > > > > real 0m8.482s > > user 0m18.233s > > sys 0m2.412s > > $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l > > 49871 > > $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l > > 49871 > > > > $ time LLVM_OBJCOPY=objcopy pahole -J -j8 vmlinux > > > > real 0m7.614s > > user 0m19.384s > > sys 0m3.739s > > $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|wc -l > > 49871 > > $ bpftool btf dump file vmlinux | grep ' FUNC ' |sort|uniq|wc -l > > > > So there is a small cost in performance, but we improve determinism > > and the consistency of representation. > > This is a great fix. > > I'm not an expert in this code base, but patches look good to me. > Thank you for fixing it. And all the description of the problem and of the solution, limitations, together with a summary of the review comments, its a pleasure to process a patch series like this one :-) Doing that now and performing the usual tests, Thanks, - Arnaldo > > For now it is better to have an incomplete representation > > that more accurately reflects the actual function parameters > > used, removing inconsistencies that could otherwise do harm. > > +1 -- - Arnaldo