* Nathan Chancellor <nathan@xxxxxxxxxx> wrote:

> On Tue, Jan 04, 2022 at 11:47:30AM +0100, Ingo Molnar wrote:
> > > > With the fast-headers kernel that's down to ~36,000 lines of code,
> > > > almost a factor of 3 reduction:
> > > >
> > > >   # fast-headers-v1:
> > > >   kepler:~/mingo.tip.git> wc -l kernel/pid.i
> > > >   35941 kernel/pid.i
> > >
> > > Coming from someone who often has to reduce a preprocessed kernel source
> > > file with creduce/cvise to report compiler bugs, this will be a very
> > > welcome change, as those tools will have to do less work, and I can get
> > > my reports done faster.
> >
> > That's nice, didn't think of that side effect.
> >
> > Could you perhaps measure this too, to see how much of a benefit it is?
>
> As it turns out, I got an opportunity to measure this sooner rather than
> later [1]. Using cvise [2] with an identical set of toolchains and
> interestingness test [3], reducing net/core/skbuff.c took significantly
> less time with the version from the fast-headers tree.
>
> v5.16-rc8:
>
> $ wc -l skbuff.i
> 105135 skbuff.i
>
> $ time cvise test.fish skbuff.i
> ...
> ________________________________________________________
> Executed in  114.02 mins    fish           external
>    usr time 1180.43 mins   69.29 millis 1180.43 mins
>    sys time  229.80 mins  248.11 millis  229.79 mins
>
> fast-headers:
>
> $ wc -l skbuff.i
> 78765 skbuff.i
>
> $ time cvise test.fish skbuff.i
> ...
> ________________________________________________________
> Executed in   47.38 mins    fish           external
>    usr time  620.17 mins   32.78 millis  620.17 mins
>    sys time  123.70 mins  122.38 millis  123.70 mins
>
> I was not expecting that much of a difference but it somewhat makes
> sense, as the tool spends less time eliminating unused code and the
> compiler invocations will be incrementally quicker as the input becomes
> smaller.

Indeed, that's a +140% speedup in build performance, not bad.
:-)

I also got around to testing Clang (12) myself, and with my 'reference
distro config' I got these results:

  #
  # v5.16-rc8
  #

  Performance counter stats for 'make -j96 vmlinux LLVM=1' (3 runs):

    55,638,543,274,254      instructions   #    0.77  insn per cycle   ( +-  0.01% )
    72,074,911,968,393      cycles         #    3.901 GHz              ( +-  0.04% )
         18,490,451.51 msec cpu-clock      #   54.740 CPUs utilized    ( +-  0.04% )

           337.788 +- 0.834 seconds time elapsed  ( +-  0.25% )

  #
  # -fast-headers-v2-rc3
  #

  Performance counter stats for 'make -j96 vmlinux LLVM=1' (3 runs):

    30,904,130,243,855      instructions   #    0.76  insn per cycle   ( +-  0.02% )
    40,703,482,733,690      cycles         #    3.898 GHz              ( +-  0.00% )
         10,443,670.86 msec cpu-clock      #   58.093 CPUs utilized    ( +-  0.00% )

           179.773 +- 0.829 seconds time elapsed  ( +-  0.46% )

That's a +88% build speedup on Clang - even better than the +78% speedup
on GCC(-10).

Thanks,

	Ingo
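
For anyone wanting to cross-check the percentages, here is a minimal sketch
in Python that recomputes them from the wall-clock times quoted above. It
assumes "speedup" means old_time / new_time - 1; `speedup_pct` is a
hypothetical helper name, not something from either tree. The GCC figure is
left out because its raw timings are not in this mail.

```python
def speedup_pct(old_time: float, new_time: float) -> float:
    """Speedup of the new run over the old one, as a percentage.

    Assumed definition: (old / new - 1) * 100, so halving the
    elapsed time counts as a +100% speedup.
    """
    return (old_time / new_time - 1.0) * 100.0


# Wall-clock figures copied from the measurements in this mail.
cvise_minutes = (114.02, 47.38)     # cvise on skbuff.i: v5.16-rc8, fast-headers
clang_seconds = (337.788, 179.773)  # make -j96 vmlinux LLVM=1: v5.16-rc8, -fast-headers-v2-rc3

cvise_speedup = speedup_pct(*cvise_minutes)
clang_speedup = speedup_pct(*clang_seconds)

print(f"cvise reduction: +{cvise_speedup:.1f}%")
print(f"Clang build:     +{clang_speedup:.1f}%")
```

Under that definition the two values come out at roughly +140% and +88%,
in line with the numbers quoted above.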