I'm pleased to announce -v2 of the "Fast Kernel Headers" tree, which is a
comprehensive rework of the Linux kernel's header hierarchy & header
dependencies, with the dual goals of:

 - speeding up the kernel build (both absolute and incremental build times)

 - decoupling subsystem type & API definitions from each other

The fast-headers tree consists of over 25 sub-trees internally, spanning
over 2,300 commits, which can be found at:

    git://git.kernel.org/pub/scm/linux/kernel/git/mingo/tip.git master

    # HEAD: 391ce485ced0 headers/deps: Introduce the CONFIG_FAST_HEADERS=y config option

Changes in -v2:

 - Port to v5.16-rc8

 - Clang/LLVM support (with the help of Nathan Chancellor):

   On my 'reference distro config' the build speedup under Clang is around
   +88% in elapsed time and +77% in CPU time used:

      #
      # v5.16-rc8
      #
      Performance counter stats for 'make -j96 vmlinux LLVM=1' (3 runs):

        18,490,451.51 msec cpu-clock        #   54.740 CPUs utilized   ( +-  0.04% )

              337.788 +- 0.834 seconds time elapsed  ( +-  0.25% )

      #
      # -fast-headers-v2
      #
      Performance counter stats for 'make -j96 vmlinux LLVM=1' (3 runs):

        10,443,670.86 msec cpu-clock        #   58.093 CPUs utilized   ( +-  0.00% )

              179.773 +- 0.829 seconds time elapsed  ( +-  0.46% )

 - Unify the duplicated 'struct task_struct_per_task' into a single
   definition, which should address the definition ugliness reported by
   Greg Kroah-Hartman.

 - Fix bugs reported by Nathan Chancellor:

     - cacheline attribute definition bug
     - build bug with GCC plugins
     - fix off-tree build

 - Header optimizations that speed up the RDMA (infiniband) subsystem
   build by about +9% over -v1 and +41% over the vanilla kernel:

      $ perf stat --repeat 3 -e instructions,cycles,cpu-clock --sync --pre "find . -name '*.o' | xargs rm" m-rdma >/dev/null
      ...
      # v5.16-rc8:

           643,570.38 msec cpu-clock        #   52.253 CPUs utilized   ( +-  0.06% )
               12.316 +- 0.183 seconds time elapsed  ( +-  1.49% )

      # -fast-headers-v1:

           446,243.49 msec cpu-clock        #   47.106 CPUs utilized   ( +-  0.06% )
               9.4731 +- 0.0666 seconds time elapsed  ( +-  0.70% )

      # -fast-headers-v2:

           400,650.32 msec cpu-clock        #   45.888 CPUs utilized   ( +-  0.02% )
               8.7310 +- 0.0162 seconds time elapsed  ( +-  0.19% )

 - Another round of <linux/sched.h> header footprint reductions: the
   header is now used in only ~36% of .c files, down from 99% in the
   mainline kernel and 68% in -v1.

 - Various bisectability improvements & other fixes & optimizations.

Thanks,

	Ingo
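
For reference, the speedup percentages quoted in this announcement can be re-derived from the perf stat figures. The following is only a minimal sketch in Python. It assumes "speedup" means the ratio of the old time to the new time, minus one. The helper name speedup_pct is illustrative and not part of the tree or of perf:

```python
def speedup_pct(before: float, after: float) -> float:
    """Speedup as a percentage, assuming speedup = before/after - 1."""
    return (before / after - 1.0) * 100.0


# Clang/LLVM full build: v5.16-rc8 vs. -fast-headers-v2
# (elapsed seconds and cpu-clock msec copied from the perf stat output)
clang_elapsed = speedup_pct(337.788, 179.773)
clang_cpu = speedup_pct(18_490_451.51, 10_443_670.86)

# RDMA subsystem build, elapsed seconds
rdma_vs_v1 = speedup_pct(9.4731, 8.7310)        # -v1 vs. -v2
rdma_vs_vanilla = speedup_pct(12.316, 8.7310)   # v5.16-rc8 vs. -v2

for label, value in [
    ("Clang elapsed", clang_elapsed),
    ("Clang CPU", clang_cpu),
    ("RDMA vs -v1", rdma_vs_v1),
    ("RDMA vs vanilla", rdma_vs_vanilla),
]:
    print(f"{label:16s} +{value:.1f}%")
```

Under that assumption the results land close to the quoted +88%, +77% and +41%. The RDMA gain over -v1 comes out near +8.5%, which matches the "about +9%" above.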