Hi maintainers,

A gentle ping.

Thanks,
Shaokun

On 2021/4/20 14:10, kernel test robot wrote:
>
>
> Greetings,
>
> FYI, we noticed a 19.2% improvement of unixbench.score due to commit:
>
>
> commit: aec499039e7b21224ef29e5a2daba328aec14442 ("[PATCH] fs: Optimized file struct to improve performance")
> url: https://github.com/0day-ci/linux/commits/Shaokun-Zhang/fs-Optimized-file-struct-to-improve-performance/20210409-114859
> base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 5e46d1b78a03d52306f21f77a4e4a144b6d31486
>
> in testcase: unixbench
> on test machine: 96 threads Intel(R) Xeon(R) CPU @ 2.30GHz with 128G memory
> with the following parameters:
>
>         runtime: 300s
>         nr_task: 30%
>         test: syscall
>         cpufreq_governor: performance
>         ucode: 0x4003006
>
> test-description: UnixBench is the original BYTE UNIX benchmark suite, which aims to test the performance of Unix-like systems.
> test-url: https://github.com/kdlucas/byte-unixbench
>
>
>
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> To reproduce:
>
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp split-job --compatible job.yaml
>         bin/lkp run compatible-job.yaml
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase/ucode:
>   gcc-9/performance/x86_64-rhel-8.3/30%/debian-10.4-x86_64-20200603.cgz/300s/lkp-csl-2sp4/syscall/unixbench/0x4003006
>
> commit:
>   5e46d1b78a ("reiserfs: update reiserfs_xattrs_initialized() condition")
>   aec499039e ("fs: Optimized file struct to improve performance")
>
> 5e46d1b78a03d523 aec499039e7b21224ef29e5a2da
> ---------------- ---------------------------
>          %stddev    %change          %stddev
>              \         |                 \
>       2768           +19.2%       3298        unixbench.score
>     176.43           +19.8%     211.43        unixbench.time.user_time
>  1.622e+09           +19.2%  1.933e+09        unixbench.workload
>     348.17 ± 48%     -25.2%     260.57 ± 68%  proc-vmstat.nr_mlock
>    4081405 ±133%     -99.2%      33639 ± 15%  turbostat.C1
>  1.348e+10 ± 89%     -76.6%  3.151e+09 ±190%  cpuidle.C6.time
>    1360129 ±137%     -86.4%     184629 ±  2%  cpuidle.POLL.time
>       1.00 ± 10%       -0.2       0.81 ±  3%  mpstat.cpu.all.irq%
>       0.49             +0.1       0.59        mpstat.cpu.all.usr%
>       0.01 ± 23%     -36.4%       0.00 ± 13%  perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_kthread.kthread.ret_from_fork
>       0.06 ± 43%     -48.4%       0.03 ± 42%  perf-sched.sch_delay.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
>       0.05 ± 49%     -55.1%       0.02 ± 47%  perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
>     765.20 ± 20%     -34.3%     502.83 ± 29%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.io_schedule_timeout.wait_for_completion_io.blk_execute_rq
>       1930 ± 13%     -31.8%       1316 ± 30%  perf-sched.wait_and_delay.max.ms.schedule_timeout.io_schedule_timeout.wait_for_completion_io.blk_execute_rq
>     765.19 ± 20%     -34.3%     502.82 ± 29%  perf-sched.wait_time.avg.ms.schedule_timeout.io_schedule_timeout.wait_for_completion_io.blk_execute_rq
>       1930 ± 13%     -31.8%       1316 ± 30%  perf-sched.wait_time.max.ms.schedule_timeout.io_schedule_timeout.wait_for_completion_io.blk_execute_rq
>       2787 ±215%    -100.0%       0.71 ±162%  interrupts.124:PCI-MSI.31981657-edge.i40e-eth0-TxRx-88
>     385.17 ±128%     -99.9%       0.29 ±158%  interrupts.61:PCI-MSI.31981594-edge.i40e-eth0-TxRx-25
>       4052 ± 49%     -57.3%       1732 ±102%  interrupts.CPU27.NMI:Non-maskable_interrupts
>       4052 ± 49%     -57.3%       1732 ±102%  interrupts.CPU27.PMI:Performance_monitoring_interrupts
>     438.67 ±122%    +697.3%       3497 ± 37%  interrupts.CPU3.NMI:Non-maskable_interrupts
>     438.67 ±122%    +697.3%       3497 ± 37%  interrupts.CPU3.PMI:Performance_monitoring_interrupts
>     289.00 ± 84%   +1542.3%       4746 ± 24%  interrupts.CPU51.NMI:Non-maskable_interrupts
>     289.00 ± 84%   +1542.3%       4746 ± 24%  interrupts.CPU51.PMI:Performance_monitoring_interrupts
>     135.17 ± 18%     -29.9%      94.71 ± 26%  interrupts.CPU59.RES:Rescheduling_interrupts
>       4872 ± 27%     -48.9%       2490 ± 90%  interrupts.CPU74.NMI:Non-maskable_interrupts
>       4872 ± 27%     -48.9%       2490 ± 90%  interrupts.CPU74.PMI:Performance_monitoring_interrupts
>       2786 ±215%    -100.0%       0.43 ±169%  interrupts.CPU88.124:PCI-MSI.31981657-edge.i40e-eth0-TxRx-88
>      13.38 ±  7%      -13.4       0.00        perf-profile.calltrace.cycles-pp.dnotify_flush.filp_close.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      30.66 ±  9%       -6.4      24.27 ± 10%  perf-profile.calltrace.cycles-pp.filp_close.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      30.82 ±  9%       -6.4      24.46 ± 10%  perf-profile.calltrace.cycles-pp.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       7.10 ±  8%       -1.3       5.85 ± 11%  perf-profile.calltrace.cycles-pp.filp_close.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
>       7.14 ±  8%       -1.2       5.89 ± 11%  perf-profile.calltrace.cycles-pp.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
>       7.18 ±  8%       -1.2       5.93 ± 11%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__close
>       7.15 ±  8%       -1.2       5.91 ± 11%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
>       7.27 ±  8%       -1.2       6.04 ± 11%  perf-profile.calltrace.cycles-pp.__close
>       5.29 ±  8%       +5.4      10.68 ± 10%  perf-profile.calltrace.cycles-pp.fput_many.filp_close.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      13.39 ±  7%      -13.3       0.07 ± 12%  perf-profile.children.cycles-pp.dnotify_flush
>      37.79 ±  8%       -7.6      30.16 ± 10%  perf-profile.children.cycles-pp.filp_close
>      37.97 ±  8%       -7.6      30.36 ± 10%  perf-profile.children.cycles-pp.__x64_sys_close
>       7.30 ±  8%       -1.2       6.07 ± 11%  perf-profile.children.cycles-pp.__close
>       0.70 ± 10%       -0.1       0.56 ± 10%  perf-profile.children.cycles-pp.hrtimer_interrupt
>       0.71 ± 11%       -0.1       0.57 ± 10%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
>       0.39 ± 16%       -0.1       0.29 ±  9%  perf-profile.children.cycles-pp.__hrtimer_run_queues
>       0.27 ± 13%       -0.1       0.22 ± 10%  perf-profile.children.cycles-pp.tick_sched_timer
>       5.29 ±  8%       +5.4      10.69 ± 10%  perf-profile.children.cycles-pp.fput_many
>      13.33 ±  7%      -13.3       0.06 ± 11%  perf-profile.self.cycles-pp.dnotify_flush
>       5.27 ±  8%       +5.4      10.64 ± 10%  perf-profile.self.cycles-pp.fput_many
>      17.97 ± 46%     -58.2%       7.51 ± 16%  perf-stat.i.MPKI
>  1.073e+09           +16.2%  1.247e+09        perf-stat.i.branch-instructions
>       2.60 ± 34%       -1.0       1.62 ±  2%  perf-stat.i.branch-miss-rate%
>   56435130 ± 26%     -31.6%   38588360 ± 21%  perf-stat.i.cache-references
>      12.06 ±  3%     -15.8%      10.16        perf-stat.i.cpi
>       0.10 ±100%       -0.1       0.02 ±202%  perf-stat.i.dTLB-load-miss-rate%
>  1.682e+09           +16.9%  1.965e+09        perf-stat.i.dTLB-loads
>       0.03 ± 93%       -0.0       0.01 ±142%  perf-stat.i.dTLB-store-miss-rate%
>   1.11e+09           +17.5%  1.304e+09        perf-stat.i.dTLB-stores
>  5.314e+09           +16.2%  6.176e+09        perf-stat.i.instructions
>       0.10 ± 11%     +18.1%       0.12 ±  2%  perf-stat.i.ipc
>      40.93           +16.1%      47.51        perf-stat.i.metric.M/sec
>      89.63 ±  2%       +6.5      96.16        perf-stat.i.node-load-miss-rate%
>    3653512 ±  3%     -57.3%    1561506        perf-stat.i.node-load-misses
>     371566 ± 19%     -90.8%      34031 ±  8%  perf-stat.i.node-loads
>      10.59 ± 25%     -40.8%       6.27 ± 23%  perf-stat.overall.MPKI
>       1.92 ±  8%       -0.3       1.61        perf-stat.overall.branch-miss-rate%
>      13.04           -14.2%      11.19        perf-stat.overall.cpi
>       0.02 ± 89%       -0.0       0.00 ±148%  perf-stat.overall.dTLB-load-miss-rate%
>       0.00 ± 72%       -0.0       0.00 ± 69%  perf-stat.overall.dTLB-store-miss-rate%
>     318.50           +13.2%     360.58        perf-stat.overall.instructions-per-iTLB-miss
>       0.08           +16.6%       0.09        perf-stat.overall.ipc
>      90.76 ±  2%       +7.1      97.87        perf-stat.overall.node-load-miss-rate%
>       1286            -2.7%       1251        perf-stat.overall.path-length
>  1.072e+09           +16.2%  1.246e+09        perf-stat.ps.branch-instructions
>   1.68e+09           +16.9%  1.964e+09        perf-stat.ps.dTLB-loads
>  1.109e+09           +17.6%  1.303e+09        perf-stat.ps.dTLB-stores
>  5.307e+09           +16.3%  6.171e+09        perf-stat.ps.instructions
>    3649615 ±  3%     -57.2%    1560409        perf-stat.ps.node-load-misses
>     371135 ± 19%     -90.9%      33946 ±  8%  perf-stat.ps.node-loads
>  2.086e+12           +16.0%  2.419e+12        perf-stat.total.instructions
>      10629 ± 12%     -17.7%       8746 ±  8%  softirqs.CPU10.RCU
>       9891 ±  7%     -14.6%       8447 ±  9%  softirqs.CPU13.RCU
>      43153 ±  3%      -7.4%      39975 ±  4%  softirqs.CPU30.SCHED
>       9938 ±  6%     -12.9%       8660 ±  2%  softirqs.CPU33.RCU
>       9900 ±  9%     -14.1%       8500 ±  5%  softirqs.CPU38.RCU
>       9730 ±  6%     -10.3%       8731 ±  7%  softirqs.CPU40.RCU
>      10238 ±  8%     -15.0%       8703 ±  9%  softirqs.CPU44.RCU
>      10045 ± 10%     -15.7%       8471 ±  6%  softirqs.CPU45.RCU
>      10074 ±  7%     -15.4%       8524 ±  6%  softirqs.CPU46.RCU
>       9793 ±  6%     -12.0%       8617 ±  8%  softirqs.CPU49.RCU
>      10809 ± 18%     -19.0%       8750 ±  8%  softirqs.CPU50.RCU
>      10484 ±  7%     -13.3%       9088 ± 10%  softirqs.CPU53.RCU
>      10059 ±  7%     -13.2%       8732 ±  7%  softirqs.CPU54.RCU
>      10298 ±  4%     -13.5%       8912 ±  7%  softirqs.CPU55.RCU
>       9932 ±  8%     -12.4%       8699 ±  5%  softirqs.CPU60.RCU
>      10268 ±  9%     -17.1%       8514 ±  7%  softirqs.CPU61.RCU
>       9895 ±  5%      -9.0%       9008 ±  5%  softirqs.CPU67.RCU
>      10294 ±  8%     -12.0%       9060 ±  5%  softirqs.CPU68.RCU
>      11048 ± 14%     -17.2%       9152 ±  6%  softirqs.CPU69.RCU
>       9586 ±  7%      -9.1%       8715 ±  5%  softirqs.CPU74.RCU
>       9555 ±  7%     -10.1%       8587 ±  5%  softirqs.CPU76.RCU
>       9892 ± 10%     -14.8%       8425 ±  5%  softirqs.CPU80.RCU
>       9722 ±  6%     -13.5%       8407 ±  6%  softirqs.CPU82.RCU
>       9883 ±  6%     -12.7%       8624 ±  4%  softirqs.CPU83.RCU
>       9507 ±  5%      -9.9%       8567 ±  4%  softirqs.CPU84.RCU
>       9878 ±  8%     -14.1%       8485 ±  3%  softirqs.CPU85.RCU
>      37959 ±  4%     -12.9%      33055 ±  6%  softirqs.CPU85.SCHED
>      10338 ± 12%     -16.6%       8623 ±  4%  softirqs.CPU86.RCU
>       9885 ±  8%     -14.8%       8423 ±  4%  softirqs.CPU87.RCU
>       9934 ±  7%     -12.9%       8649 ±  5%  softirqs.CPU88.RCU
>      10119 ±  8%     -16.0%       8502 ±  5%  softirqs.CPU89.RCU
>       9958 ±  7%     -13.5%       8612 ±  4%  softirqs.CPU92.RCU
>       9917 ±  8%     -14.3%       8498 ±  5%  softirqs.CPU93.RCU
>      10070 ±  8%     -14.3%       8625 ±  6%  softirqs.CPU94.RCU
>      10157 ± 11%     -11.7%       8967 ±  7%  softirqs.CPU95.RCU
>      19377 ± 60%     -69.7%       5871 ± 82%  softirqs.NET_RX
>     944995 ±  4%     -10.5%     845954 ±  6%  softirqs.RCU
>
>
>
> [ASCII plot: unixbench.score per sample, y-axis 2400 to 3400. The samples marked O lie at roughly 3200 to 3300; the other samples lie at roughly 2500 to 2800.]
>
> [ASCII plot: unixbench.workload per sample, y-axis 1.4e+09 to 2e+09. The samples marked O lie at roughly 1.9e+09 to 2e+09; the other samples lie at roughly 1.5e+09 to 1.7e+09.]
>
>
> [*] bisect-good sample
> [O] bisect-bad sample
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> ---
> 0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
> https://lists.01.org/hyperkitty/list/lkp@xxxxxxxxxxxx       Intel Corporation
>
> Thanks,
> Oliver Sang
>
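
A note for anyone reading the comparison table in the quoted report: the %change column looks like the relative difference between the two per-commit means. That is an assumption inferred from the numbers, not something the report states. The minimal Python sketch below recomputes it for a few rows, using values copied from the table; the helper name pct_change is hypothetical. Because the table prints rounded means, a recomputed figure can differ from the reported one by about a tenth of a percent (unixbench.score is one such row).

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# (base commit 5e46d1b78a, patched commit aec499039e), copied from the report.
rows = {
    "unixbench.score": (2768, 3298),
    "unixbench.time.user_time": (176.43, 211.43),
    "unixbench.workload": (1.622e9, 1.933e9),
    "perf-stat.overall.cpi": (13.04, 11.19),
}

# Recompute each change, rounded to one decimal like the report.
changes = {name: round(pct_change(b, n), 1) for name, (b, n) in rows.items()}

for name, change in changes.items():
    print(f"{change:+.1f}%  {name}")
```

Rows whose change is printed without a percent sign (for example the perf-profile and miss-rate rows) appear to be plain differences in percentage points, so this helper does not apply to them.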