By default, coccicheck utilizes all available threads to implement parallelisation. However, when all available threads are used, a decrease in performance is noted. The elapsed time is minimum when at most one thread per core is used. For example, on benchmarking the semantic patch kfree.cocci for usb/serial using hyperfine, the outputs obtained for J=5 and J=2 are 1.32 and 1.90 times faster than those for J=10 and J=9 respectively for two separate runs. For the larger drivers/staging directory, minimium elapsed time is obtained for J=3 which is 1.86 times faster than that for J=12. The optimal J value does not exceed 6 in any of the test runs. The benchmarks are run on a machine with 6 cores, with 2 threads per core, i.e, 12 hyperthreads in all. To improve performance, modify coccicheck to use at most only one thread per core by default. Signed-off-by: Sumera Priyadarsini <sylphrenadin@xxxxxxxxx> --- Changes in V2: - Change commit message as suggested by Julia Lawall Changes in V3: - Use J/2 as optimal value for machines with more than 8 hyperthreads as well. --- scripts/coccicheck | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/scripts/coccicheck b/scripts/coccicheck index e04d328210ac..a72aa6c037ff 100755 --- a/scripts/coccicheck +++ b/scripts/coccicheck @@ -75,8 +75,13 @@ else OPTIONS="--dir $KBUILD_EXTMOD $COCCIINCLUDE" fi + # Use only one thread per core by default if hyperthreading is enabled + THREADS_PER_CORE=$(lscpu | grep "Thread(s) per core: " | tr -cd [:digit:]) if [ -z "$J" ]; then NPROC=$(getconf _NPROCESSORS_ONLN) + if [ $THREADS_PER_CORE -gt 1 -a $NPROC -gt 2 ] ; then + NPROC=$((NPROC/2)) + fi else NPROC="$J" fi -- 2.25.1