Hello,

kernel test robot noticed a 2.4% improvement of aim9.exec_test.ops_per_sec on:

commit: ec9aedb2aa1ab7ac420c00b31f5edc5be15ec167 ("x86/acpi: Ignore invalid x2APIC entries")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: aim9
test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
parameters:

    testtime: 300s
    test: exec_test
    cpufreq_governor: performance

Besides the detailed comparison below, we also noticed a difference in dmesg.

for this commit ec9aedb2aa:
[    1.311075][    T0] smpboot: Allowing 48 CPUs, 0 hotplug CPUs

for parent:
[    1.311098][    T0] smpboot: Allowing 168 CPUs, 120 hotplug CPUs

Details are as below:
-------------------------------------------------------------------------------------------------->

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231130/202311301346.56b0fcd6-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/exec_test/aim9/300s

commit:
  31255e072b ("x86/shstk: Delay signal entry SSP write until after user accesses")
  ec9aedb2aa ("x86/acpi: Ignore invalid x2APIC entries")

31255e072b2e91f9 ec9aedb2aa1ab7ac420c00b31f5
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      8587 ± 3%    +5.9%       9091        vmstat.system.cs
      6542 ± 9%   -18.2%       5352 ± 7%   numa-meminfo.node1.KernelStack
     57960 ± 4%   -12.6%      50656 ± 6%   numa-meminfo.node1.SUnreclaim
      6541 ± 9%   -18.0%       5363 ± 6%   numa-vmstat.node1.nr_kernel_stack
     14490 ± 4%   -12.6%      12663 ± 6%   numa-vmstat.node1.nr_slab_unreclaimable
    179678 ± 7%   -22.6%     139060 ±10%   meminfo.DirectMap4k
     13670        -13.6%      11809        meminfo.KernelStack
     78243        -72.5%      21498        meminfo.Percpu
      1222         +2.4%       1251        aim9.exec_test.ops_per_sec
  27978802         +3.1%   28859909        aim9.time.minor_page_faults
    175.04         -6.2%     164.11        aim9.time.system_time
    115.72         +9.1%     126.24        aim9.time.user_time
    731948         +2.4%     749684        aim9.time.voluntary_context_switches
     13669        -13.8%      11788        proc-vmstat.nr_kernel_stack
     21028         -3.2%      20355        proc-vmstat.nr_slab_reclaimable
     29074         -9.0%      26443        proc-vmstat.nr_slab_unreclaimable
     50357         -1.3%      49699        proc-vmstat.numa_other
  28937047         +3.0%   29790891        proc-vmstat.pgfault
      0.55 ± 5%     +0.1       0.65 ± 7%   perf-profile.calltrace.cycles-pp.next_uptodate_folio.filemap_map_pages.do_read_fault.do_fault.__handle_mm_fault
      1.38 ± 6%     -0.7       0.67 ± 9%   perf-profile.children.cycles-pp.mm_init
      0.87 ± 7%     -0.5       0.38 ±10%   perf-profile.children.cycles-pp.pcpu_alloc
      0.76 ± 8%     -0.3       0.42 ± 8%   perf-profile.children.cycles-pp.alloc_bprm
      0.50 ± 6%     -0.3       0.17 ± 6%   perf-profile.children.cycles-pp.memset_orig
      0.40 ± 5%     -0.2       0.15 ±18%   perf-profile.children.cycles-pp.__percpu_counter_init_many
      0.15 ±20%     -0.1       0.03 ±101%  perf-profile.children.cycles-pp.mm_init_cid
      0.23 ±14%     -0.1       0.12 ±19%   perf-profile.children.cycles-pp._find_next_bit
      0.30 ±10%     -0.1       0.24 ±16%   perf-profile.children.cycles-pp.mas_preallocate
      0.14 ±18%     -0.0       0.09 ±16%   perf-profile.children.cycles-pp.pm_qos_read_value
      0.09 ±15%     -0.0       0.07 ±10%   perf-profile.children.cycles-pp.remove_vma
      0.05 ±47%     +0.1       0.11 ±26%   perf-profile.children.cycles-pp.malloc
      0.20 ±22%     +0.1       0.25 ± 7%   perf-profile.children.cycles-pp.do_brk_flags
      0.44 ± 5%     +0.1       0.53 ± 8%   perf-profile.children.cycles-pp.mod_objcg_state
      0.80 ± 4%     +0.2       0.96 ± 6%   perf-profile.children.cycles-pp.next_uptodate_folio
      0.50 ± 7%     -0.3       0.17 ± 6%   perf-profile.self.cycles-pp.memset_orig
      0.26 ±16%     -0.2       0.04 ±106%  perf-profile.self.cycles-pp.mm_init
      0.14 ±25%     -0.1       0.03 ±100%  perf-profile.self.cycles-pp.mm_init_cid
      0.18 ±22%     -0.1       0.08 ±34%   perf-profile.self.cycles-pp.pcpu_alloc
      0.13 ±16%     -0.0       0.08 ±20%   perf-profile.self.cycles-pp.pm_qos_read_value
      0.37 ± 6%     +0.1       0.45 ±10%   perf-profile.self.cycles-pp.mod_objcg_state
      0.66 ± 5%     +0.1       0.80 ± 6%   perf-profile.self.cycles-pp.next_uptodate_folio
  34087721 ± 2%    +3.6%   35301961        perf-stat.i.branch-misses
      8601 ± 3%    +6.1%       9122        perf-stat.i.context-switches
     72.92 ± 2%    +7.4%      78.30 ± 3%   perf-stat.i.cpu-migrations
      1.55 ± 2%     -0.1       1.42 ± 3%   perf-stat.i.dTLB-load-miss-rate%
      0.51 ± 2%     -0.2       0.32        perf-stat.i.dTLB-store-miss-rate%
   2867856 ± 3%   -36.9%    1810983        perf-stat.i.dTLB-store-misses
 5.561e+08 ± 2%    +3.0%   5.73e+08        perf-stat.i.dTLB-stores
     92019 ± 4%   +10.2%     101371        perf-stat.i.iTLB-loads
    126.43 ±15%   -33.8%      83.76        perf-stat.i.metric.K/sec
     90050 ± 4%    +6.8%      96193        perf-stat.i.minor-faults
     19.22 ± 4%     -1.5      17.77 ± 3%   perf-stat.i.node-store-miss-rate%
     90050 ± 4%    +6.8%      96194        perf-stat.i.page-faults
      1.48 ± 2%     -0.1       1.38 ± 3%   perf-stat.overall.dTLB-load-miss-rate%
      0.51          -0.2       0.32        perf-stat.overall.dTLB-store-miss-rate%
  33982829 ± 2%    +3.5%   35183134        perf-stat.ps.branch-misses
      8573 ± 3%    +6.0%       9090        perf-stat.ps.context-switches
     72.73 ± 2%    +7.4%      78.13 ± 3%   perf-stat.ps.cpu-migrations
   2858954 ± 3%   -36.9%    1805251        perf-stat.ps.dTLB-store-misses
 5.545e+08 ± 2%    +3.0%  5.712e+08        perf-stat.ps.dTLB-stores
     91889 ± 4%   +10.2%     101265        perf-stat.ps.iTLB-loads
     89770 ± 4%    +6.8%      95880        perf-stat.ps.minor-faults
     89771 ± 4%    +6.8%      95880        perf-stat.ps.page-faults

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
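
For readers who want to check the %change column by hand, the sketch below
recomputes it for three rows of the comparison above and compares the
meminfo.Percpu ratio with the ratio of CPU counts from the two dmesg lines.
The helper name pct_change is hypothetical and not part of lkp-tests. The
suggestion that the Percpu drop tracks the lower allowed-CPU count is an
inference from the numbers, not something the report states. Rows whose
change has no % sign (perf-profile and the *-rate% metrics) appear to be
absolute differences, so they are left out here.

```python
def pct_change(parent: float, commit: float) -> float:
    """Relative change from the parent value to the commit value, in percent."""
    return (commit - parent) / parent * 100.0


# (parent, commit) pairs copied from the comparison table in this report.
rows = {
    "aim9.exec_test.ops_per_sec": (1222, 1251),
    "meminfo.Percpu": (78243, 21498),
    "perf-stat.i.dTLB-store-misses": (2867856, 1810983),
}

# Round to one decimal place, matching the precision used in the report.
changes = {name: round(pct_change(p, c), 1) for name, (p, c) in rows.items()}
for name, change in changes.items():
    print(f"{change:+.1f}%  {name}")

# CPU counts from the two smpboot dmesg lines (commit: 48, parent: 168).
cpu_ratio = 48 / 168
# Per-CPU memory after the commit relative to the parent.
percpu_ratio = 21498 / 78243
print(f"CPU ratio {cpu_ratio:.3f}, Percpu ratio {percpu_ratio:.3f}")
```

The two ratios come out close to each other, which is consistent with (but
does not prove) per-CPU allocations shrinking because fewer CPUs are counted
as possible.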