[ANNOUNCE] "Fast Kernel Headers" Tree -v2

I'm pleased to announce -v2 of the "Fast Kernel Headers" tree, which is a 
comprehensive rework of the Linux kernel's header hierarchy & header 
dependencies, with the dual goals of:

 - speeding up the kernel build (both absolute and incremental build times)

 - decoupling subsystem type & API definitions from each other

The fast-headers tree consists of over 25 sub-trees internally, spanning 
over 2,300 commits, which can be found at:

   git://git.kernel.org/pub/scm/linux/kernel/git/mingo/tip.git master

   # HEAD: 391ce485ced0 headers/deps: Introduce the CONFIG_FAST_HEADERS=y config option

Changes in -v2:

 - Port to v5.16-rc8

 - Clang/LLVM support (with the help of Nathan Chancellor):

   On my 'reference distro config' the build speedup under Clang is around +88%
   in elapsed time and +77% in CPU time used:

     #
     # v5.16-rc8
     #
     Performance counter stats for 'make -j96 vmlinux LLVM=1' (3 runs):

      18,490,451.51 msec cpu-clock          # 54.740 CPUs utilized   ( +-  0.04% )

      337.788 +- 0.834 seconds time elapsed  ( +-  0.25% )

     #
     # -fast-headers-v2
     #
     Performance counter stats for 'make -j96 vmlinux LLVM=1' (3 runs):

      10,443,670.86 msec cpu-clock          # 58.093 CPUs utilized   ( +-  0.00% )

      179.773 +- 0.829 seconds time elapsed  ( +-  0.46% )

 - Unify the duplicated 'struct task_struct_per_task' into a single definition,
   which should address the definition ugliness reported by Greg Kroah-Hartman.

 - Fix bugs reported by Nathan Chancellor:

    - cacheline attribute definition bug
    - build bug with GCC plugins
    - out-of-tree build breakage

 - Header optimizations that speed up the RDMA (InfiniBand) subsystem build
   by about +9% over -v1 and +41% over the vanilla kernel:

     $ perf stat --repeat 3 -e instructions,cycles,cpu-clock --sync --pre "find . -name '*.o' | xargs rm" m-rdma >/dev/null
     ...

     # v5.16-rc8:

          643,570.38 msec cpu-clock                 #   52.253 CPUs utilized            ( +-  0.06% )

               12.316 +- 0.183 seconds time elapsed  ( +-  1.49% )

     # -fast-headers-v1:

          446,243.49 msec cpu-clock                 #   47.106 CPUs utilized            ( +-  0.06% )

                9.4731 +- 0.0666 seconds time elapsed  ( +-  0.70% )

     # -fast-headers-v2:
          400,650.32 msec cpu-clock                 #   45.888 CPUs utilized            ( +-  0.02% )

                8.7310 +- 0.0162 seconds time elapsed  ( +-  0.19% )

 - Another round of <linux/sched.h> header footprint reductions: the
   header is now used in only ~36% of .c files, down from 99% in the
   mainline kernel and 68% in -v1.

 - Various bisectability improvements & other fixes & optimizations.
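
The speedup percentages quoted above are consistent with computing
(baseline / new - 1) * 100 from the 'perf stat' figures. Below is a minimal
sketch for re-deriving them, assuming that formula; the 'speedup' helper is
a hypothetical name used only for illustration and is not part of the tree:

```shell
# speedup BASELINE NEW
# Prints how much faster NEW is relative to BASELINE, as a signed percentage.
# Assumption: speedup is defined as (BASELINE / NEW - 1) * 100.
speedup() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "+%.1f%%\n", (base / new - 1) * 100 }'
}

# Clang build: elapsed seconds, then cpu-clock msec (v5.16-rc8 vs -fast-headers-v2)
speedup 337.788 179.773
speedup 18490451.51 10443670.86

# RDMA build, elapsed seconds: -v1 vs -v2, then v5.16-rc8 vs -v2
speedup 9.4731 8.7310
speedup 12.316 8.7310
```

The same helper can be applied to the cpu-clock lines of the RDMA runs to
compare CPU time rather than elapsed time.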

Thanks,

	Ingo


