Hello,

kernel test robot noticed a 3.5% improvement of aim9.page_test.ops_per_sec on:


commit: 4249f13c11be8b8b7bf93204185e150c3bdc968d ("maple_tree: do not preallocate nodes for slot stores")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: aim9
test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
parameters:

	testtime: 300s
	test: page_test
	cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240109/202401091651.a189376-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/page_test/aim9/300s

commit:
  e2c27b803b ("mm/filemap: avoid buffered read/write race to read inconsistent data")
  4249f13c11 ("maple_tree: do not preallocate nodes for slot stores")

e2c27b803bb66474 4249f13c11be8b8b7bf93204185
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
    336518            +3.5%     348367        aim9.page_test.ops_per_sec
  95019000            +3.5%   98364469        aim9.time.minor_page_faults
     25318            +2.3%      25903        proc-vmstat.nr_active_anon
     26605            +2.2%      27197        proc-vmstat.nr_shmem
     25318            +2.3%      25903        proc-vmstat.nr_zone_active_anon
 1.087e+08            +3.3%  1.122e+08        proc-vmstat.numa_hit
 1.085e+08            +3.4%  1.121e+08        proc-vmstat.numa_local
 1.079e+08            +3.5%  1.117e+08        proc-vmstat.pgalloc_normal
  95763046            +3.5%   99109694        proc-vmstat.pgfault
 1.078e+08            +3.5%  1.116e+08        proc-vmstat.pgfree
  56340620            +1.4%   57128415        perf-stat.i.cache-references
   3744535            -7.4%    3468589        perf-stat.i.iTLB-load-misses
    923.85            +8.2%     999.87        perf-stat.i.instructions-per-iTLB-miss
    318120            +3.5%     329244        perf-stat.i.minor-faults
    318120            +3.5%     329244        perf-stat.i.page-faults
     12.48             -0.2      12.32        perf-stat.overall.cache-miss-rate%
    911.69            +8.5%     988.95        perf-stat.overall.instructions-per-iTLB-miss
  56153225            +1.4%   56938073        perf-stat.ps.cache-references
   3731915            -7.4%    3456934        perf-stat.ps.iTLB-load-misses
    317046            +3.5%     328134        perf-stat.ps.minor-faults
    317046            +3.5%     328134        perf-stat.ps.page-faults
      1.54 ± 15%       -0.9       0.61 ± 35%  perf-profile.calltrace.cycles-pp.mas_preallocate.do_brk_flags.__do_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.56 ± 16%       -0.9       0.67 ± 18%  perf-profile.children.cycles-pp.mas_preallocate
      0.59 ± 18%       -0.5       0.06 ± 66%  perf-profile.children.cycles-pp.mas_destroy
      0.03 ± 84%       +0.1       0.13 ± 26%  perf-profile.children.cycles-pp.anon_vma_interval_tree_insert
      0.18 ± 27%       +0.2       0.42 ± 15%  perf-profile.children.cycles-pp.vma_adjust_trans_huge
      0.28 ± 12%       +0.3       0.57 ± 14%  perf-profile.children.cycles-pp.vma_complete
      0.20 ± 28%       -0.1       0.13 ± 24%  perf-profile.self.cycles-pp.security_mmap_addr
      0.16 ± 23%       -0.1       0.10 ± 17%  perf-profile.self.cycles-pp.__perf_sw_event
      0.17 ± 18%       +0.1       0.27 ± 30%  perf-profile.self.cycles-pp.get_vma_policy
      0.02 ±118%       +0.1       0.13 ± 26%  perf-profile.self.cycles-pp.anon_vma_interval_tree_insert
      0.08 ± 25%       +0.2       0.24 ± 13%  perf-profile.self.cycles-pp.vma_complete
      0.18 ± 28%       +0.2       0.42 ± 15%  perf-profile.self.cycles-pp.vma_adjust_trans_huge


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki