Including Liam ...

On Mon, Feb 19, 2024 at 01:44:05PM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed a 11.8% improvement of aim9.disk_src.ops_per_sec on:
>
>
> commit: a616bc666748063733c62e15ea417a90772a40e0 ("libfs: Convert simple directory offsets to use a Maple Tree")
> git://git.kernel.org/cgit/linux/kernel/git/cel/linux simple-offset-maple
>
> testcase: aim9
> test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
> parameters:
>
>   testtime: 300s
>   test: disk_src
>   cpufreq_governor: performance
>
>
>
>
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240219/202402191308.8e7ee8c7-oliver.sang@xxxxxxxxx
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
>   gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/disk_src/aim9/300s
>
> commit:
>   f3f24869a1 ("test_maple_tree: testing the cyclic allocation")
>   a616bc6667 ("libfs: Convert simple directory offsets to use a Maple Tree")
>
> f3f24869a1d7cde1 a616bc666748063733c62e15ea4
> ---------------- ---------------------------
>          %stddev     %change           %stddev
>             \           |                 \
>       0.34 ±  4%        -0.1        0.20 ±  4%  mpstat.cpu.all.soft%
>       0.00 ± 28%      +58.3%        0.00 ± 17%  perf-sched.sch_delay.max.ms.ipmi_thread.kthread.ret_from_fork.ret_from_fork_asm
>       1464 ±  2%      +14.0%        1668 ±  4%  vmstat.system.cs
>     164231            +11.8%      183678        aim9.disk_src.ops_per_sec
>       1309 ± 15%    +2643.5%       35915 ± 23%  aim9.time.involuntary_context_switches
>      91.00             +5.5%       96.00        aim9.time.percent_of_cpu_this_job_got
>     212.54             +3.5%      220.06        aim9.time.system_time
>      62.58            +10.2%       68.94        aim9.time.user_time
>      21685             -7.1%       20144        proc-vmstat.nr_slab_reclaimable
>    6611541            -88.6%      750673 ±  7%  proc-vmstat.numa_hit
>    6561447            -89.3%      700947 ±  7%  proc-vmstat.numa_local
>       5747             +3.7%        5960        proc-vmstat.pgactivate
>   26113963            -93.7%     1648373 ± 17%  proc-vmstat.pgalloc_normal
>   26042963            -93.7%     1628178 ± 18%  proc-vmstat.pgfree
>       2.07             -1.2%        2.04        perf-stat.i.MPKI
>  6.738e+08             +3.0%    6.94e+08        perf-stat.i.branch-instructions
>       2.94              -0.2        2.70        perf-stat.i.branch-miss-rate%
>   20408670             -5.1%    19363031        perf-stat.i.branch-misses
>      15.11              +2.7       17.77        perf-stat.i.cache-miss-rate%
>   46824224            -14.7%    39962840        perf-stat.i.cache-references
>       1419 ±  2%      +14.4%        1623 ±  5%  perf-stat.i.context-switches
>       1.88             -1.3%        1.85        perf-stat.i.cpi
>  9.453e+08             +2.2%   9.659e+08        perf-stat.i.dTLB-loads
>       0.22 ±  5%        +0.0        0.25 ±  3%  perf-stat.i.dTLB-store-miss-rate%
>    8.8e+08             -6.8%   8.205e+08        perf-stat.i.dTLB-stores
>    1536484             +7.9%     1657233        perf-stat.i.iTLB-load-misses
>       2279             -6.0%        2142        perf-stat.i.instructions-per-iTLB-miss
>       0.54             +1.3%        0.54        perf-stat.i.ipc
>     786.95             +7.1%      843.12        perf-stat.i.metric.K/sec
>      47.07              +1.1       48.17        perf-stat.i.node-load-miss-rate%
>      87561 ±  4%      +17.2%      102647 ±  6%  perf-stat.i.node-load-misses
>       2.01             -1.2%        1.99        perf-stat.overall.MPKI
>       3.03              -0.2        2.79        perf-stat.overall.branch-miss-rate%
>      15.07              +2.6       17.67        perf-stat.overall.cache-miss-rate%
>       1.84             -1.2%        1.82        perf-stat.overall.cpi
>       0.22 ±  5%        +0.0        0.24 ±  3%  perf-stat.overall.dTLB-store-miss-rate%
>       2283             -6.1%        2144        perf-stat.overall.instructions-per-iTLB-miss
>       0.54             +1.2%        0.55        perf-stat.overall.ipc
>      44.15              +1.8       45.93        perf-stat.overall.node-load-miss-rate%
>  6.715e+08             +3.0%   6.917e+08        perf-stat.ps.branch-instructions
>   20340341             -5.1%    19299968        perf-stat.ps.branch-misses
>   46667379            -14.7%    39829580        perf-stat.ps.cache-references
>       1414 ±  2%      +14.4%        1618 ±  5%  perf-stat.ps.context-switches
>  9.421e+08             +2.2%   9.627e+08        perf-stat.ps.dTLB-loads
>  8.771e+08             -6.8%   8.178e+08        perf-stat.ps.dTLB-stores
>    1531338             +7.9%     1651678        perf-stat.ps.iTLB-load-misses
>      87275 ±  4%      +17.3%      102341 ±  6%  perf-stat.ps.node-load-misses
>       5.62 ± 13%        -1.9        3.69 ± 12%  perf-profile.calltrace.cycles-pp.shmem_mknod.lookup_open.open_last_lookups.path_openat.do_filp_open
>       7.87 ± 13%        -1.9        5.95 ± 11%  perf-profile.calltrace.cycles-pp.lookup_open.open_last_lookups.path_openat.do_filp_open.do_sys_openat2
>       8.47 ± 13%        -1.9        6.59 ± 10%  perf-profile.calltrace.cycles-pp.open_last_lookups.path_openat.do_filp_open.do_sys_openat2.__x64_sys_creat
>       2.97 ± 12%        -1.8        1.16 ± 13%  perf-profile.calltrace.cycles-pp.simple_offset_add.shmem_mknod.lookup_open.open_last_lookups.path_openat
>       0.00              +1.0        0.98 ± 13%  perf-profile.calltrace.cycles-pp.mas_alloc_cyclic.mtree_alloc_cyclic.simple_offset_add.shmem_mknod.lookup_open
>       0.00              +1.0        1.00 ± 40%  perf-profile.calltrace.cycles-pp.rcu_do_batch.rcu_core.__do_softirq.run_ksoftirqd.smpboot_thread_fn
>       0.00              +1.0        1.03 ± 40%  perf-profile.calltrace.cycles-pp.rcu_core.__do_softirq.run_ksoftirqd.smpboot_thread_fn.kthread
>       0.00              +1.1        1.06 ± 40%  perf-profile.calltrace.cycles-pp.__do_softirq.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork
>       0.00              +1.1        1.06 ± 40%  perf-profile.calltrace.cycles-pp.run_ksoftirqd.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
>       0.00              +1.1        1.10 ± 39%  perf-profile.calltrace.cycles-pp.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
>       0.00              +1.1        1.10 ± 14%  perf-profile.calltrace.cycles-pp.mtree_alloc_cyclic.simple_offset_add.shmem_mknod.lookup_open.open_last_lookups
>       0.00              +1.2        1.20 ± 13%  perf-profile.calltrace.cycles-pp.mas_erase.mtree_erase.simple_offset_remove.shmem_unlink.vfs_unlink
>       0.00              +1.3        1.27 ± 38%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm
>       0.00              +1.3        1.27 ± 38%  perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm
>       0.00              +1.3        1.27 ± 38%  perf-profile.calltrace.cycles-pp.ret_from_fork_asm
>       0.00              +1.4        1.35 ± 12%  perf-profile.calltrace.cycles-pp.mtree_erase.simple_offset_remove.shmem_unlink.vfs_unlink.do_unlinkat
>      15.22 ±  8%        -2.8       12.40 ±  8%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
>      14.50 ±  8%        -2.8       11.72 ±  8%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
>       4.73 ± 13%        -2.8        1.97 ± 15%  perf-profile.children.cycles-pp.irq_exit_rcu
>       3.50 ± 12%        -2.1        1.41 ± 12%  perf-profile.children.cycles-pp.kmem_cache_alloc_lru
>       5.63 ± 13%        -1.9        3.70 ± 12%  perf-profile.children.cycles-pp.shmem_mknod
>       7.88 ± 13%        -1.9        5.97 ± 11%  perf-profile.children.cycles-pp.lookup_open
>       8.49 ± 13%        -1.9        6.62 ± 10%  perf-profile.children.cycles-pp.open_last_lookups
>       2.97 ± 12%        -1.8        1.16 ± 13%  perf-profile.children.cycles-pp.simple_offset_add
>       2.90 ± 22%        -1.8        1.15 ± 41%  perf-profile.children.cycles-pp.rcu_do_batch
>       4.47 ± 14%        -1.7        2.76 ± 24%  perf-profile.children.cycles-pp.__do_softirq
>       1.85 ± 15%        -1.7        0.14 ± 28%  perf-profile.children.cycles-pp.___slab_alloc
>       3.00 ± 22%        -1.7        1.34 ± 38%  perf-profile.children.cycles-pp.rcu_core
>       1.66 ± 15%        -1.6        0.05 ± 68%  perf-profile.children.cycles-pp.allocate_slab
>       0.92 ± 18%        -0.6        0.31 ± 19%  perf-profile.children.cycles-pp.__call_rcu_common
>       0.88 ± 27%        -0.6        0.31 ± 43%  perf-profile.children.cycles-pp.__slab_free
>       0.28 ± 15%        -0.2        0.12 ± 25%  perf-profile.children.cycles-pp.xas_load
>       0.20 ± 18%        -0.1        0.08 ± 30%  perf-profile.children.cycles-pp.rcu_segcblist_enqueue
>       0.12 ± 30%        -0.1        0.05 ± 65%  perf-profile.children.cycles-pp.rcu_nocb_try_bypass
>       0.00              +0.1        0.10 ± 27%  perf-profile.children.cycles-pp.mas_wr_end_piv
>       0.00              +0.2        0.17 ± 22%  perf-profile.children.cycles-pp.mas_leaf_max_gap
>       0.00              +0.2        0.18 ± 24%  perf-profile.children.cycles-pp.mtree_range_walk
>       0.00              +0.2        0.24 ± 22%  perf-profile.children.cycles-pp.mas_anode_descend
>       0.00              +0.3        0.29 ± 16%  perf-profile.children.cycles-pp.mas_wr_walk
>       0.00              +0.3        0.31 ± 23%  perf-profile.children.cycles-pp.mas_update_gap
>       0.00              +0.3        0.32 ± 17%  perf-profile.children.cycles-pp.mas_wr_append
>       0.00              +0.4        0.37 ± 15%  perf-profile.children.cycles-pp.mas_empty_area
>       0.00              +0.5        0.47 ± 18%  perf-profile.children.cycles-pp.mas_wr_node_store
>       0.00              +1.0        0.99 ± 13%  perf-profile.children.cycles-pp.mas_alloc_cyclic
>       0.05 ± 82%        +1.0        1.10 ± 39%  perf-profile.children.cycles-pp.smpboot_thread_fn
>       0.01 ±264%        +1.0        1.06 ± 40%  perf-profile.children.cycles-pp.run_ksoftirqd
>       0.22 ± 36%        +1.1        1.28 ± 38%  perf-profile.children.cycles-pp.ret_from_fork
>       0.22 ± 36%        +1.1        1.28 ± 38%  perf-profile.children.cycles-pp.ret_from_fork_asm
>       0.21 ± 38%        +1.1        1.27 ± 38%  perf-profile.children.cycles-pp.kthread
>       0.00              +1.1        1.11 ± 14%  perf-profile.children.cycles-pp.mtree_alloc_cyclic
>       0.00              +1.2        1.21 ± 14%  perf-profile.children.cycles-pp.mas_erase
>       0.00              +1.4        1.35 ± 12%  perf-profile.children.cycles-pp.mtree_erase
>       0.87 ± 27%        -0.6        0.31 ± 42%  perf-profile.self.cycles-pp.__slab_free
>       0.53 ± 19%        -0.4        0.18 ± 23%  perf-profile.self.cycles-pp.__call_rcu_common
>       0.57 ± 10%        -0.3        0.26 ± 21%  perf-profile.self.cycles-pp.kmem_cache_alloc_lru
>       0.89 ± 14%        -0.3        0.59 ± 15%  perf-profile.self.cycles-pp.kmem_cache_free
>       0.19 ± 21%        -0.1        0.06 ± 65%  perf-profile.self.cycles-pp.rcu_segcblist_enqueue
>       0.10 ± 20%        -0.1        0.04 ± 81%  perf-profile.self.cycles-pp.xas_load
>       0.08 ± 19%        -0.0        0.04 ± 61%  perf-profile.self.cycles-pp.asm_sysvec_apic_timer_interrupt
>       0.00              +0.1        0.09 ± 30%  perf-profile.self.cycles-pp.mtree_erase
>       0.00              +0.1        0.10 ± 26%  perf-profile.self.cycles-pp.mtree_alloc_cyclic
>       0.00              +0.1        0.10 ± 27%  perf-profile.self.cycles-pp.mas_wr_end_piv
>       0.00              +0.1        0.12 ± 38%  perf-profile.self.cycles-pp.mas_empty_area
>       0.00              +0.1        0.14 ± 38%  perf-profile.self.cycles-pp.mas_update_gap
>       0.00              +0.1        0.14 ± 20%  perf-profile.self.cycles-pp.mas_wr_append
>       0.00              +0.2        0.16 ± 23%  perf-profile.self.cycles-pp.mas_leaf_max_gap
>       0.00              +0.2        0.18 ± 24%  perf-profile.self.cycles-pp.mtree_range_walk
>       0.00              +0.2        0.18 ± 29%  perf-profile.self.cycles-pp.mas_alloc_cyclic
>       0.00              +0.2        0.22 ± 32%  perf-profile.self.cycles-pp.mas_erase
>       0.00              +0.2        0.24 ± 22%  perf-profile.self.cycles-pp.mas_anode_descend
>       0.00              +0.3        0.27 ± 16%  perf-profile.self.cycles-pp.mas_wr_walk
>       0.00              +0.3        0.34 ± 20%  perf-profile.self.cycles-pp.mas_wr_node_store
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>

--
Chuck Lever