On Mon, 2024-09-09 at 16:24 +0800, kernel test robot wrote:
> 
> Hello,
> 
> kernel test robot noticed a -5.5% regression of will-it-scale.per_thread_ops on:
> 
> 
> commit: a037d5e7f81bae8ff69eb670b2ec3f25ad4d2cc2 ("fs: add infrastructure for multigrain timestamps")
> https://git.kernel.org/cgit/linux/kernel/git/vfs/vfs.git vfs.mgtime
> 
> testcase: will-it-scale
> test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> parameters:
> 
> 	nr_task: 100%
> 	mode: thread
> 	test: pipe1
> 	cpufreq_governor: performance
> 
> 
> In addition to that, the commit also has significant impact on the following tests:
> 
> +------------------+----------------------------------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_thread_ops -2.4% regression                                       |
> | test machine     | 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory          |
> | test parameters  | cpufreq_governor=performance                                                                       |
> |                  | mode=thread                                                                                        |
> |                  | nr_task=100%                                                                                       |
> |                  | test=writeseek1                                                                                    |
> +------------------+----------------------------------------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_thread_ops -5.5% regression                                       |
> | test machine     | 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory |
> | test parameters  | cpufreq_governor=performance                                                                       |
> |                  | mode=thread                                                                                        |
> |                  | nr_task=100%                                                                                       |
> |                  | test=pipe1                                                                                         |
> +------------------+----------------------------------------------------------------------------------------------------+
> 
> 
> If you fix the issue in a separate patch/commit (i.e.
not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> | Closes: https://lore.kernel.org/oe-lkp/202409091303.31b2b713-oliver.sang@xxxxxxxxx
> 

It's not too surprising that some of these microbenchmarks might
regress, given that we're adding some atomic ops in the write codepath.
That said, if this is the commit it landed on, then this is before any
filesystems have enabled multigrain timestamps. Most of the new stuff
we're adding here is dead code at this point.

The main difference is that we're fetching the ctime_floor value when
calling current_time (and doing the ktime_t comparison).

static ktime_t coarse_ctime(ktime_t floor)
{
	ktime_t coarse = ktime_get_coarse();

	/* If coarse time is already newer, return that */
	if (!ktime_after(floor, coarse))
		return ktime_get_coarse_real();
	return ktime_mono_to_real(floor);
}

I forget who it was that suggested changing it, but originally this
patch just did this when the coarse time was after the floor:

	return ktime_mono_to_real(coarse);

I wonder if that might shave off a few cycles?

The other possibility is the is_mgtime check. That has to do some
pointer chasing to get at the fs_flags field. If that's too costly, we
could look at adding a flag to the inode that mirrors that value.

I'll see if I can reproduce this.

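To make the difference between the two coarse_ctime() variants concrete, here is a minimal userspace sketch. It is not kernel code: the model_* helpers, the call counters and the fixed clock values are all made up for illustration, and only stand in for ktime_get_coarse(), ktime_get_coarse_real(), ktime_mono_to_real() and ktime_after(). All it does is count how many clock reads and conversions each variant performs; it says nothing about real cycle counts, which would still need measuring.

```c
#include <stdint.h>

typedef int64_t ktime_t;

/* Simulated timekeeper state (hypothetical values, in ns). */
ktime_t mono_coarse = 1000;	/* coarse monotonic clock */
ktime_t offs_real = 500000;	/* monotonic -> realtime offset */

/* Call counters, so the two variants can be compared. */
int coarse_reads;
int conversions;

/* Stand-in for ktime_get_coarse(): one coarse clock read. */
ktime_t model_get_coarse(void)
{
	coarse_reads++;
	return mono_coarse;
}

/* Stand-in for ktime_get_coarse_real(): another coarse clock read. */
ktime_t model_get_coarse_real(void)
{
	coarse_reads++;
	return mono_coarse + offs_real;
}

/* Stand-in for ktime_mono_to_real(): apply the offset to a value in hand. */
ktime_t model_mono_to_real(ktime_t mono)
{
	conversions++;
	return mono + offs_real;
}

/* Stand-in for ktime_after(): true if a is later than b. */
int model_ktime_after(ktime_t a, ktime_t b)
{
	return a > b;
}

/* Variant as posted: re-read the clock when coarse is not behind the floor. */
ktime_t coarse_ctime_reread(ktime_t floor)
{
	ktime_t coarse = model_get_coarse();

	if (!model_ktime_after(floor, coarse))
		return model_get_coarse_real();
	return model_mono_to_real(floor);
}

/* Earlier variant: convert the coarse value that was already fetched. */
ktime_t coarse_ctime_convert(ktime_t floor)
{
	ktime_t coarse = model_get_coarse();

	if (!model_ktime_after(floor, coarse))
		return model_mono_to_real(coarse);
	return model_mono_to_real(floor);
}
```

In this model, with the floor behind the coarse clock, coarse_ctime_reread() does two coarse reads and no conversion, while coarse_ctime_convert() does one coarse read and one conversion. Both return the same value as long as the simulated clock does not tick between the two reads.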
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240909/202409091303.31b2b713-oliver.sang@xxxxxxxxx
> 
> =========================================================================================
> compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
>   gcc-12/performance/x86_64-rhel-8.3/thread/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pipe1/will-it-scale
> 
> commit: 
>   v6.11-rc1
>   a037d5e7f8 ("fs: add infrastructure for multigrain timestamps")
> 
>        v6.11-rc1 a037d5e7f81bae8ff69eb670b2e
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>      98752                +1.3%      100032       proc-vmstat.nr_active_anon
>     103972                +1.2%      105220       proc-vmstat.nr_shmem
>      98752                +1.3%      100032       proc-vmstat.nr_zone_active_anon
>   84330110                -5.5%    79683588       will-it-scale.64.threads
>    1317657                -5.5%     1245055       will-it-scale.per_thread_ops
>   84330110                -5.5%    79683588       will-it-scale.workload
>  4.678e+10                +1.2%   4.733e+10       perf-stat.i.branch-instructions
>       0.03 ± 7%           -0.0         0.02 ± 6%  perf-stat.i.branch-miss-rate%
>   12781080 ± 7%          -19.2%    10321748 ± 6%  perf-stat.i.branch-misses
>       1.01                -2.3%        0.99       perf-stat.i.cpi
>  1.946e+11                +2.4%   1.993e+11       perf-stat.i.instructions
>       0.99                +2.4%        1.01       perf-stat.i.ipc
>       0.03 ± 7%           -0.0         0.02 ± 6%  perf-stat.overall.branch-miss-rate%
>       1.01                -2.4%        0.99       perf-stat.overall.cpi
>       0.99                +2.4%        1.01       perf-stat.overall.ipc
>     695016                +8.4%      753440       perf-stat.overall.path-length
>  4.661e+10                +1.2%   4.717e+10       perf-stat.ps.branch-instructions
>   12713767 ± 7%          -19.3%    10263468 ± 6%  perf-stat.ps.branch-misses
>  1.939e+11                +2.4%   1.986e+11       perf-stat.ps.instructions
>  5.861e+13                +2.4%   6.004e+13       perf-stat.total.instructions
>       7.03                -0.4         6.60       perf-profile.calltrace.cycles-pp.clear_bhb_loop.write
>       6.89                -0.4         6.48       perf-profile.calltrace.cycles-pp.clear_bhb_loop.read
>       6.72                -0.4         6.35
perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.vfs_write.ksys_write.do_syscall_64 > 5.69 -0.3 5.37 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read > 5.46 -0.3 5.17 perf-profile.calltrace.cycles-pp._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write.ksys_write > 5.68 -0.3 5.40 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.write > 5.30 -0.3 5.03 perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.vfs_read.ksys_read.do_syscall_64 > 49.56 -0.2 49.34 perf-profile.calltrace.cycles-pp.read > 4.39 -0.2 4.17 perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read.ksys_read > 0.56 -0.1 0.43 ± 50% perf-profile.calltrace.cycles-pp.aa_file_perm.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_write > 2.66 -0.1 2.52 perf-profile.calltrace.cycles-pp.__wake_up_sync_key.pipe_write.vfs_write.ksys_write.do_syscall_64 > 1.71 -0.1 1.61 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__wake_up_sync_key.pipe_write.vfs_write.ksys_write > 1.81 -0.1 1.72 perf-profile.calltrace.cycles-pp.mutex_lock.pipe_read.vfs_read.ksys_read.do_syscall_64 > 1.77 -0.1 1.69 perf-profile.calltrace.cycles-pp.mutex_lock.pipe_write.vfs_write.ksys_write.do_syscall_64 > 1.04 -0.1 0.97 perf-profile.calltrace.cycles-pp.fput.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > 0.86 -0.1 0.80 perf-profile.calltrace.cycles-pp.testcase > 1.11 -0.1 1.05 perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_read.vfs_read.ksys_read.do_syscall_64 > 1.12 -0.1 1.06 perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_write.vfs_write.ksys_write.do_syscall_64 > 0.88 -0.0 0.84 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.read > 1.04 -0.0 1.00 perf-profile.calltrace.cycles-pp.fput.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > 0.72 -0.0 0.68 perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > 0.65 -0.0 0.61 
perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > 0.55 -0.0 0.52 ± 2% perf-profile.calltrace.cycles-pp.aa_file_perm.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read > 0.58 -0.0 0.55 ± 2% perf-profile.calltrace.cycles-pp.__cond_resched.mutex_lock.pipe_write.vfs_write.ksys_write > 53.26 +0.1 53.34 perf-profile.calltrace.cycles-pp.write > 0.00 +0.6 0.56 ± 5% perf-profile.calltrace.cycles-pp.ktime_get_coarse_ts64.coarse_ctime.current_time.atime_needs_update.touch_atime > 0.00 +0.6 0.64 perf-profile.calltrace.cycles-pp.timestamp_truncate.current_time.atime_needs_update.touch_atime.pipe_read > 0.00 +0.6 0.64 perf-profile.calltrace.cycles-pp.timestamp_truncate.current_time.inode_needs_update_time.file_update_time.pipe_write > 32.72 +0.8 33.51 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read > 31.75 +0.8 32.58 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > 36.32 +1.0 37.27 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write > 28.50 +1.0 29.49 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > 35.33 +1.0 36.34 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > 0.00 +1.1 1.08 ± 2% perf-profile.calltrace.cycles-pp.coarse_ctime.current_time.inode_needs_update_time.file_update_time.pipe_write > 32.09 +1.1 33.20 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > 0.00 +1.2 1.23 perf-profile.calltrace.cycles-pp.coarse_ctime.current_time.atime_needs_update.touch_atime.pipe_read > 23.02 +1.3 24.30 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > 26.62 +1.4 27.97 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > 19.03 +1.6 20.60 
perf-profile.calltrace.cycles-pp.pipe_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe > 15.11 +1.6 16.71 perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe > 1.48 +2.1 3.56 perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.pipe_read.vfs_read > 3.51 +2.2 5.72 perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.vfs_read.ksys_read.do_syscall_64 > 3.05 +2.2 5.28 perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.pipe_read.vfs_read.ksys_read > 1.81 ± 2% +2.4 4.20 perf-profile.calltrace.cycles-pp.inode_needs_update_time.file_update_time.pipe_write.vfs_write.ksys_write > 2.23 ± 2% +2.5 4.71 perf-profile.calltrace.cycles-pp.file_update_time.pipe_write.vfs_write.ksys_write.do_syscall_64 > 0.00 +3.3 3.26 perf-profile.calltrace.cycles-pp.current_time.inode_needs_update_time.file_update_time.pipe_write.vfs_write > 14.06 -0.8 13.22 perf-profile.children.cycles-pp.clear_bhb_loop > 7.01 -0.4 6.62 perf-profile.children.cycles-pp.copy_page_from_iter > 6.63 -0.4 6.28 perf-profile.children.cycles-pp.entry_SYSCALL_64 > 5.61 -0.3 5.32 perf-profile.children.cycles-pp._copy_from_iter > 5.46 -0.3 5.18 perf-profile.children.cycles-pp.copy_page_to_iter > 5.00 -0.3 4.73 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack > 50.00 -0.2 49.78 perf-profile.children.cycles-pp.read > 4.46 -0.2 4.24 perf-profile.children.cycles-pp._copy_to_iter > 3.88 -0.2 3.69 perf-profile.children.cycles-pp.mutex_lock > 2.87 -0.1 2.73 perf-profile.children.cycles-pp.__wake_up_sync_key > 2.38 -0.1 2.24 perf-profile.children.cycles-pp.mutex_unlock > 2.23 -0.1 2.10 perf-profile.children.cycles-pp.fput > 1.81 -0.1 1.71 perf-profile.children.cycles-pp._raw_spin_lock_irqsave > 1.53 -0.1 1.44 perf-profile.children.cycles-pp.x64_sys_call > 1.10 -0.1 1.01 perf-profile.children.cycles-pp.testcase > 1.26 -0.1 1.19 ± 2% perf-profile.children.cycles-pp.aa_file_perm > 1.48 -0.1 1.41 
perf-profile.children.cycles-pp.__cond_resched > 2.14 -0.1 2.09 perf-profile.children.cycles-pp.syscall_exit_to_user_mode > 0.35 -0.0 0.32 ± 2% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack > 0.57 -0.0 0.54 perf-profile.children.cycles-pp.__wake_up_common > 0.51 -0.0 0.48 perf-profile.children.cycles-pp.kill_fasync > 0.26 -0.0 0.23 ± 2% perf-profile.children.cycles-pp.__x64_sys_read > 0.49 -0.0 0.47 perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore > 0.25 -0.0 0.23 perf-profile.children.cycles-pp.__x64_sys_write > 0.30 -0.0 0.28 perf-profile.children.cycles-pp.make_vfsuid > 0.23 -0.0 0.22 ± 2% perf-profile.children.cycles-pp.write@plt > 0.00 +0.7 0.65 perf-profile.children.cycles-pp.set_normalized_timespec64 > 0.71 +0.7 1.43 perf-profile.children.cycles-pp.timestamp_truncate > 0.00 +0.9 0.93 perf-profile.children.cycles-pp.ns_to_timespec64 > 28.79 +1.0 29.76 perf-profile.children.cycles-pp.ksys_read > 0.00 +1.0 1.04 perf-profile.children.cycles-pp.ktime_get_coarse_with_offset > 32.43 +1.1 33.54 perf-profile.children.cycles-pp.ksys_write > 0.00 +1.2 1.15 ± 3% perf-profile.children.cycles-pp.ktime_get_coarse_ts64 > 23.24 +1.3 24.52 perf-profile.children.cycles-pp.vfs_read > 26.88 +1.3 28.22 perf-profile.children.cycles-pp.vfs_write > 19.69 +1.5 21.23 perf-profile.children.cycles-pp.pipe_write > 15.78 +1.6 17.35 perf-profile.children.cycles-pp.pipe_read > 69.41 +1.7 71.14 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe > 67.61 +1.8 69.42 perf-profile.children.cycles-pp.do_syscall_64 > 3.65 +2.2 5.86 perf-profile.children.cycles-pp.touch_atime > 3.34 +2.2 5.56 perf-profile.children.cycles-pp.atime_needs_update > 2.08 ± 2% +2.3 4.35 perf-profile.children.cycles-pp.inode_needs_update_time > 2.45 ± 2% +2.4 4.85 perf-profile.children.cycles-pp.file_update_time > 0.00 +2.7 2.72 perf-profile.children.cycles-pp.coarse_ctime > 1.63 +5.9 7.55 perf-profile.children.cycles-pp.current_time > 13.92 -0.8 13.07 
perf-profile.self.cycles-pp.clear_bhb_loop > 1.04 ± 3% -0.3 0.73 perf-profile.self.cycles-pp.inode_needs_update_time > 5.38 -0.3 5.09 perf-profile.self.cycles-pp._copy_from_iter > 4.83 -0.3 4.56 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack > 4.55 -0.2 4.33 perf-profile.self.cycles-pp.vfs_read > 4.37 -0.2 4.15 perf-profile.self.cycles-pp._copy_to_iter > 3.34 -0.2 3.14 perf-profile.self.cycles-pp.write > 3.21 -0.2 3.02 perf-profile.self.cycles-pp.pipe_read > 3.34 -0.2 3.15 perf-profile.self.cycles-pp.read > 2.06 -0.1 1.93 perf-profile.self.cycles-pp.fput > 4.43 -0.1 4.30 perf-profile.self.cycles-pp.vfs_write > 2.21 -0.1 2.09 perf-profile.self.cycles-pp.mutex_unlock > 2.54 -0.1 2.42 perf-profile.self.cycles-pp.do_syscall_64 > 2.37 -0.1 2.27 perf-profile.self.cycles-pp.mutex_lock > 1.88 -0.1 1.78 perf-profile.self.cycles-pp.entry_SYSCALL_64 > 1.73 -0.1 1.64 perf-profile.self.cycles-pp._raw_spin_lock_irqsave > 1.46 -0.1 1.37 perf-profile.self.cycles-pp.copy_page_from_iter > 1.79 -0.1 1.71 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > 1.36 -0.1 1.28 perf-profile.self.cycles-pp.x64_sys_call > 1.33 -0.1 1.26 perf-profile.self.cycles-pp.security_file_permission > 0.87 -0.1 0.80 perf-profile.self.cycles-pp.testcase > 1.05 -0.1 0.99 perf-profile.self.cycles-pp.rw_verify_area > 0.99 -0.1 0.93 perf-profile.self.cycles-pp.copy_page_to_iter > 1.05 -0.1 0.99 perf-profile.self.cycles-pp.aa_file_perm > 1.27 -0.1 1.21 perf-profile.self.cycles-pp.atime_needs_update > 0.83 -0.1 0.78 perf-profile.self.cycles-pp.__cond_resched > 1.38 -0.0 1.34 perf-profile.self.cycles-pp.syscall_exit_to_user_mode > 0.34 ± 2% -0.0 0.31 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack > 0.50 -0.0 0.48 perf-profile.self.cycles-pp.__wake_up_sync_key > 0.18 -0.0 0.16 ± 2% perf-profile.self.cycles-pp.__x64_sys_read > 0.17 -0.0 0.16 perf-profile.self.cycles-pp.__x64_sys_write > 0.43 +0.1 0.49 perf-profile.self.cycles-pp.file_update_time > 0.00 +0.5 0.51 
perf-profile.self.cycles-pp.set_normalized_timespec64 > 0.58 +0.6 1.15 perf-profile.self.cycles-pp.timestamp_truncate > 0.00 +0.8 0.78 perf-profile.self.cycles-pp.ns_to_timespec64 > 0.00 +0.9 0.88 ± 4% perf-profile.self.cycles-pp.ktime_get_coarse_ts64 > 0.00 +0.9 0.89 ± 2% perf-profile.self.cycles-pp.ktime_get_coarse_with_offset > 1.07 +0.9 1.98 ± 2% perf-profile.self.cycles-pp.current_time > 0.00 +1.2 1.17 perf-profile.self.cycles-pp.coarse_ctime > > > *************************************************************************************************** > lkp-icl-2sp7: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory > ========================================================================================= > compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: > gcc-12/performance/x86_64-rhel-8.3/thread/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/writeseek1/will-it-scale > > commit: > v6.11-rc1 > a037d5e7f8 ("fs: add infrastructure for multigrain timestamps") > > v6.11-rc1 a037d5e7f81bae8ff69eb670b2e > ---------------- --------------------------- > %stddev %change %stddev > \ | \ > 79.00 ± 9% +38.8% 109.67 ± 20% perf-c2c.HITM.remote > 75816166 -2.4% 73999365 will-it-scale.64.threads > 1184627 -2.4% 1156239 will-it-scale.per_thread_ops > 75816166 -2.4% 73999365 will-it-scale.workload > 1.29 ± 15% +32.0% 1.70 ± 15% perf-sched.sch_delay.avg.ms.__cond_resched.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe > 5.46 ±221% -99.7% 0.02 ± 15% perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread > 2.57 ± 15% +32.0% 3.40 ± 15% perf-sched.wait_and_delay.avg.ms.__cond_resched.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe > 846.14 ± 37% -64.5% 300.69 ±117% perf-sched.wait_and_delay.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64 > 1.29 ± 15% +32.0% 1.70 ± 15% 
perf-sched.wait_time.avg.ms.__cond_resched.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe > 839.24 ± 37% -64.4% 298.49 ±116% perf-sched.wait_time.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64 > 8283953 +18.7% 9832101 ± 6% perf-stat.i.branch-misses > 1.13 -1.7% 1.11 perf-stat.i.cpi > 1.746e+11 +1.7% 1.775e+11 perf-stat.i.instructions > 0.89 +1.7% 0.90 perf-stat.i.ipc > 0.02 +0.0 0.02 ± 6% perf-stat.overall.branch-miss-rate% > 1.13 -1.6% 1.11 perf-stat.overall.cpi > 0.89 +1.6% 0.90 perf-stat.overall.ipc > 696240 +4.1% 725044 perf-stat.overall.path-length > 8247837 +18.6% 9784317 ± 6% perf-stat.ps.branch-misses > 1.74e+11 +1.7% 1.769e+11 perf-stat.ps.instructions > 5.279e+13 +1.6% 5.365e+13 perf-stat.total.instructions > 26.93 -0.8 26.10 perf-profile.calltrace.cycles-pp.llseek > 31.20 -0.6 30.58 perf-profile.calltrace.cycles-pp.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64 > 12.61 -0.5 12.12 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.llseek > 11.70 ± 2% -0.5 11.23 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek > 8.91 ± 2% -0.4 8.50 ± 3% perf-profile.calltrace.cycles-pp.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek > 13.72 -0.3 13.38 perf-profile.calltrace.cycles-pp.copy_page_from_iter_atomic.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write > 6.36 -0.2 6.16 perf-profile.calltrace.cycles-pp.clear_bhb_loop.llseek > 6.60 -0.2 6.42 perf-profile.calltrace.cycles-pp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write > 5.69 -0.2 5.52 perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter.vfs_write > 5.85 -0.1 5.72 perf-profile.calltrace.cycles-pp.clear_bhb_loop.write > 5.05 -0.1 4.93 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.write > 4.86 -0.1 4.75 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.llseek > 3.23 -0.1 3.14 
perf-profile.calltrace.cycles-pp.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write > 2.23 -0.1 2.17 perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter > 1.55 -0.0 1.51 perf-profile.calltrace.cycles-pp.down_write.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64 > 1.52 -0.0 1.49 perf-profile.calltrace.cycles-pp.mutex_lock.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.98 -0.0 0.95 perf-profile.calltrace.cycles-pp.fput.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek > 1.56 -0.0 1.52 perf-profile.calltrace.cycles-pp.mutex_lock.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe > 0.95 -0.0 0.92 perf-profile.calltrace.cycles-pp.folio_unlock.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write > 1.06 -0.0 1.03 perf-profile.calltrace.cycles-pp.mutex_unlock.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek > 0.84 -0.0 0.82 perf-profile.calltrace.cycles-pp.folio_mark_accessed.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter > 0.64 -0.0 0.62 perf-profile.calltrace.cycles-pp.folio_mark_dirty.shmem_write_end.generic_perform_write.shmem_file_write_iter.vfs_write > 0.76 -0.0 0.74 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek > 0.00 +0.6 0.57 perf-profile.calltrace.cycles-pp.timestamp_truncate.current_time.inode_needs_update_time.file_update_time.shmem_file_write_iter > 75.78 +0.8 76.57 perf-profile.calltrace.cycles-pp.write > 61.25 +1.0 62.22 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write > 60.37 +1.0 61.35 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > 57.56 +1.0 58.61 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > 0.00 +1.1 1.11 ± 3% 
perf-profile.calltrace.cycles-pp.coarse_ctime.current_time.inode_needs_update_time.file_update_time.shmem_file_write_iter > 49.83 +1.2 51.07 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > 39.42 +1.4 40.86 perf-profile.calltrace.cycles-pp.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe > 1.95 +2.1 4.04 perf-profile.calltrace.cycles-pp.inode_needs_update_time.file_update_time.shmem_file_write_iter.vfs_write.ksys_write > 2.33 +2.2 4.54 perf-profile.calltrace.cycles-pp.file_update_time.shmem_file_write_iter.vfs_write.ksys_write.do_syscall_64 > 0.00 +3.1 3.14 perf-profile.calltrace.cycles-pp.current_time.inode_needs_update_time.file_update_time.shmem_file_write_iter.vfs_write > 27.11 -0.9 26.24 perf-profile.children.cycles-pp.llseek > 31.79 -0.6 31.15 perf-profile.children.cycles-pp.generic_perform_write > 9.30 ± 2% -0.4 8.87 ± 2% perf-profile.children.cycles-pp.ksys_lseek > 13.81 -0.3 13.47 perf-profile.children.cycles-pp.copy_page_from_iter_atomic > 12.33 -0.3 11.99 perf-profile.children.cycles-pp.clear_bhb_loop > 6.76 -0.2 6.56 perf-profile.children.cycles-pp.shmem_write_begin > 5.95 -0.2 5.77 perf-profile.children.cycles-pp.shmem_get_folio_gfp > 5.77 -0.1 5.64 perf-profile.children.cycles-pp.entry_SYSCALL_64 > 3.43 -0.1 3.34 perf-profile.children.cycles-pp.shmem_write_end > 3.51 -0.1 3.42 perf-profile.children.cycles-pp.__cond_resched > 3.34 -0.1 3.26 perf-profile.children.cycles-pp.mutex_lock > 2.37 -0.1 2.30 perf-profile.children.cycles-pp.filemap_get_entry > 2.01 -0.1 1.96 perf-profile.children.cycles-pp.fput > 1.57 -0.1 1.52 perf-profile.children.cycles-pp.rcu_all_qs > 1.68 -0.0 1.63 perf-profile.children.cycles-pp.down_write > 2.15 -0.0 2.10 perf-profile.children.cycles-pp.mutex_unlock > 0.11 ± 4% -0.0 0.07 ± 11% perf-profile.children.cycles-pp.ktime_get_update_offsets_now > 1.76 -0.0 1.72 perf-profile.children.cycles-pp.syscall_exit_to_user_mode > 1.02 -0.0 0.98 
perf-profile.children.cycles-pp.folio_unlock > 0.89 -0.0 0.87 perf-profile.children.cycles-pp.folio_mark_accessed > 0.55 -0.0 0.52 perf-profile.children.cycles-pp.shmem_file_llseek > 0.77 -0.0 0.75 perf-profile.children.cycles-pp.folio_mark_dirty > 0.26 -0.0 0.25 perf-profile.children.cycles-pp.__f_unlock_pos > 0.22 +0.0 0.24 perf-profile.children.cycles-pp.inode_to_bdi > 0.00 +0.3 0.30 perf-profile.children.cycles-pp.set_normalized_timespec64 > 0.00 +0.4 0.42 ± 3% perf-profile.children.cycles-pp.ns_to_timespec64 > 74.20 +0.5 74.66 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe > 0.00 +0.5 0.47 perf-profile.children.cycles-pp.ktime_get_coarse_with_offset > 72.54 +0.5 73.04 perf-profile.children.cycles-pp.do_syscall_64 > 0.00 +0.6 0.56 ± 6% perf-profile.children.cycles-pp.ktime_get_coarse_ts64 > 76.24 +0.8 77.00 perf-profile.children.cycles-pp.write > 57.97 +1.0 59.02 perf-profile.children.cycles-pp.ksys_write > 50.19 +1.2 51.42 perf-profile.children.cycles-pp.vfs_write > 0.00 +1.3 1.29 ± 2% perf-profile.children.cycles-pp.coarse_ctime > 39.96 +1.4 41.40 perf-profile.children.cycles-pp.shmem_file_write_iter > 2.18 +2.0 4.17 perf-profile.children.cycles-pp.inode_needs_update_time > 2.52 +2.1 4.67 perf-profile.children.cycles-pp.file_update_time > 0.00 +3.5 3.46 perf-profile.children.cycles-pp.current_time > 1.13 -0.4 0.72 perf-profile.self.cycles-pp.inode_needs_update_time > 13.62 -0.3 13.28 perf-profile.self.cycles-pp.copy_page_from_iter_atomic > 12.19 -0.3 11.86 perf-profile.self.cycles-pp.clear_bhb_loop > 2.31 -0.1 2.24 perf-profile.self.cycles-pp.llseek > 4.25 -0.1 4.19 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack > 2.18 -0.1 2.12 perf-profile.self.cycles-pp.shmem_get_folio_gfp > 2.18 -0.1 2.12 perf-profile.self.cycles-pp.do_syscall_64 > 1.72 -0.1 1.67 perf-profile.self.cycles-pp.filemap_get_entry > 1.88 -0.0 1.83 perf-profile.self.cycles-pp.fput > 2.14 -0.0 2.09 perf-profile.self.cycles-pp.mutex_lock > 2.01 -0.0 1.96 
perf-profile.self.cycles-pp.mutex_unlock > 1.66 -0.0 1.61 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > 1.18 -0.0 1.14 perf-profile.self.cycles-pp.rcu_all_qs > 0.54 ± 2% -0.0 0.50 perf-profile.self.cycles-pp.timestamp_truncate > 1.94 -0.0 1.90 perf-profile.self.cycles-pp.__cond_resched > 1.09 -0.0 1.05 perf-profile.self.cycles-pp.down_write > 1.56 -0.0 1.52 perf-profile.self.cycles-pp.shmem_write_end > 0.10 ± 4% -0.0 0.07 ± 11% perf-profile.self.cycles-pp.ktime_get_update_offsets_now > 1.62 -0.0 1.59 perf-profile.self.cycles-pp.entry_SYSCALL_64 > 2.68 -0.0 2.65 perf-profile.self.cycles-pp.__fsnotify_parent > 1.13 -0.0 1.10 perf-profile.self.cycles-pp.syscall_exit_to_user_mode > 0.95 -0.0 0.92 perf-profile.self.cycles-pp.folio_unlock > 0.83 -0.0 0.81 perf-profile.self.cycles-pp.folio_mark_accessed > 0.42 -0.0 0.41 perf-profile.self.cycles-pp.shmem_file_llseek > 0.49 -0.0 0.47 perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited_flags > 0.34 -0.0 0.33 perf-profile.self.cycles-pp.folio_mapping > 0.15 ± 3% +0.0 0.17 ± 2% perf-profile.self.cycles-pp.inode_to_bdi > 0.39 +0.1 0.48 perf-profile.self.cycles-pp.file_update_time > 0.00 +0.2 0.24 ± 3% perf-profile.self.cycles-pp.set_normalized_timespec64 > 0.00 +0.4 0.35 perf-profile.self.cycles-pp.ns_to_timespec64 > 0.00 +0.4 0.41 ± 2% perf-profile.self.cycles-pp.ktime_get_coarse_with_offset > 0.00 +0.4 0.44 ± 8% perf-profile.self.cycles-pp.ktime_get_coarse_ts64 > 0.00 +0.6 0.55 perf-profile.self.cycles-pp.coarse_ctime > 0.00 +0.9 0.89 perf-profile.self.cycles-pp.current_time > > > > *************************************************************************************************** > lkp-cpl-4sp2: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory > ========================================================================================= > compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: > 
gcc-12/performance/x86_64-rhel-8.3/thread/100%/debian-12-x86_64-20240206.cgz/lkp-cpl-4sp2/pipe1/will-it-scale > > commit: > v6.11-rc1 > a037d5e7f8 ("fs: add infrastructure for multigrain timestamps") > > v6.11-rc1 a037d5e7f81bae8ff69eb670b2e > ---------------- --------------------------- > %stddev %change %stddev > \ | \ > 101460 ± 14% -40.5% 60365 ± 54% numa-numastat.node2.other_node > 101460 ± 14% -40.5% 60365 ± 54% numa-vmstat.node2.numa_other > 214956 +1.3% 217796 proc-vmstat.nr_shmem > 2.864e+08 -5.5% 2.706e+08 will-it-scale.224.threads > 1278568 -5.5% 1207975 will-it-scale.per_thread_ops > 2.864e+08 -5.5% 2.706e+08 will-it-scale.workload > 0.29 ±220% +486.6% 1.70 ± 71% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait > 0.14 ±104% +238.3% 0.49 ± 26% perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open > 4.33 ±196% +426.6% 22.79 ± 55% perf-sched.sch_delay.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64 > 15.03 ±101% +153.2% 38.05 ± 19% perf-sched.total_sch_delay.max.ms > 0.07 ±141% +201.5% 0.20 perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.usleep_range_state.ipmi_thread.kthread > 0.70 ±100% +137.6% 1.66 ± 20% perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone > 35.83 ±143% +476.3% 206.50 ± 56% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.usleep_range_state.ipmi_thread.kthread > 1.40 ±100% +116.9% 3.03 ± 6% perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm > 2.43 ±143% +472.3% 13.90 ± 56% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.do_pselect.constprop > 0.64 ±100% +137.0% 1.51 ± 20% perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone > 0.14 ±104% +229.2% 0.47 ± 25% perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open > 2.33 ±101% +114.3% 
4.99 perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm > 4.45 ±143% +600.8% 31.18 ± 64% perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.do_pselect.constprop > 1.612e+11 +1.6% 1.638e+11 perf-stat.i.branch-instructions > 0.42 ± 2% -0.1 0.37 perf-stat.i.branch-miss-rate% > 6.629e+08 -9.0% 6.034e+08 perf-stat.i.branch-misses > 13.77 ± 3% +1.3 15.04 ± 6% perf-stat.i.cache-miss-rate% > 1.12 -2.6% 1.09 perf-stat.i.cpi > 6.635e+11 +2.9% 6.827e+11 perf-stat.i.instructions > 0.89 +2.7% 0.91 perf-stat.i.ipc > 0.41 -0.0 0.37 perf-stat.overall.branch-miss-rate% > 1.12 -2.6% 1.09 perf-stat.overall.cpi > 0.89 +2.6% 0.91 perf-stat.overall.ipc > 702093 +8.3% 760166 perf-stat.overall.path-length > 1.604e+11 +1.7% 1.631e+11 perf-stat.ps.branch-instructions > 6.591e+08 -8.9% 6.005e+08 perf-stat.ps.branch-misses > 6.6e+11 +2.9% 6.795e+11 perf-stat.ps.instructions > 2.011e+14 +2.3% 2.057e+14 perf-stat.total.instructions > 5.47 -0.4 5.04 perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.vfs_write.ksys_write.do_syscall_64 > 6.78 -0.4 6.37 perf-profile.calltrace.cycles-pp.clear_bhb_loop.write > 6.84 -0.3 6.51 perf-profile.calltrace.cycles-pp.clear_bhb_loop.read > 2.72 ± 2% -0.3 2.41 ± 8% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read > 5.38 -0.3 5.08 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read > 5.36 -0.3 5.08 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.write > 3.98 -0.3 3.70 perf-profile.calltrace.cycles-pp._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write.ksys_write > 4.16 -0.2 3.96 perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.vfs_read.ksys_read.do_syscall_64 > 52.78 -0.2 52.59 perf-profile.calltrace.cycles-pp.write > 2.09 -0.1 1.95 perf-profile.calltrace.cycles-pp.__wake_up_sync_key.pipe_write.vfs_write.ksys_write.do_syscall_64 > 1.04 -0.1 0.90 
perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_write.vfs_write.ksys_write.do_syscall_64 > 3.17 -0.1 3.06 perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read.ksys_read > 1.56 -0.1 1.46 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__wake_up_sync_key.pipe_write.vfs_write.ksys_write > 0.99 -0.1 0.90 perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > 1.67 -0.1 1.59 perf-profile.calltrace.cycles-pp.mutex_lock.pipe_write.vfs_write.ksys_write.do_syscall_64 > 0.64 -0.1 0.56 perf-profile.calltrace.cycles-pp.testcase > 0.96 -0.1 0.89 perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > 1.24 -0.1 1.17 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > 1.24 -0.1 1.17 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > 0.98 -0.1 0.92 perf-profile.calltrace.cycles-pp.fput.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > 1.15 -0.1 1.10 perf-profile.calltrace.cycles-pp.fput.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > 0.94 -0.1 0.88 perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_read.vfs_read.ksys_read.do_syscall_64 > 50.06 +0.1 50.13 perf-profile.calltrace.cycles-pp.read > 1.64 +0.1 1.75 perf-profile.calltrace.cycles-pp.mutex_lock.pipe_read.vfs_read.ksys_read.do_syscall_64 > 0.59 ± 6% +0.3 0.86 ± 2% perf-profile.calltrace.cycles-pp.anon_pipe_buf_release.pipe_read.vfs_read.ksys_read.do_syscall_64 > 0.25 ±100% +0.4 0.68 ± 2% perf-profile.calltrace.cycles-pp.__cond_resched.mutex_lock.pipe_read.vfs_read.ksys_read > 0.00 +0.5 0.54 ± 2% perf-profile.calltrace.cycles-pp.timestamp_truncate.current_time.inode_needs_update_time.file_update_time.pipe_write > 0.00 +0.6 0.58 ± 3% perf-profile.calltrace.cycles-pp.ktime_get_coarse_ts64.coarse_ctime.current_time.atime_needs_update.touch_atime > 35.68 +0.8 
36.45 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
> 34.78 +0.8 35.61 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
> 0.00 +1.0 1.03 ± 14% perf-profile.calltrace.cycles-pp.coarse_ctime.current_time.inode_needs_update_time.file_update_time.pipe_write
> 32.94 +1.0 33.97 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
> 32.05 +1.1 33.13 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 30.48 +1.1 31.57 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
> 25.24 +1.2 26.49 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
> 27.80 +1.4 29.19 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 0.00 +1.4 1.42 ± 4% perf-profile.calltrace.cycles-pp.coarse_ctime.current_time.atime_needs_update.touch_atime.pipe_read
> 22.79 +1.5 24.30 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
> 17.46 +1.6 19.04 perf-profile.calltrace.cycles-pp.pipe_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 14.85 +2.2 17.02 perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 4.02 +2.3 6.29 perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.vfs_read.ksys_read.do_syscall_64
> 1.72 +2.3 4.02 perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.pipe_read.vfs_read
> 3.63 +2.3 5.94 perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.pipe_read.vfs_read.ksys_read
> 1.78 ± 8% +2.3 4.12 ± 4% perf-profile.calltrace.cycles-pp.inode_needs_update_time.file_update_time.pipe_write.vfs_write.ksys_write
> 2.06 ± 8% +2.3 4.41 ± 3% perf-profile.calltrace.cycles-pp.file_update_time.pipe_write.vfs_write.ksys_write.do_syscall_64
> 0.00 +3.3 3.34 ± 5%
perf-profile.calltrace.cycles-pp.current_time.inode_needs_update_time.file_update_time.pipe_write.vfs_write
> 13.71 -0.7 12.97 perf-profile.children.cycles-pp.clear_bhb_loop
> 5.60 -0.4 5.15 perf-profile.children.cycles-pp.copy_page_from_iter
> 6.87 -0.4 6.48 perf-profile.children.cycles-pp.entry_SYSCALL_64
> 4.28 -0.3 3.98 perf-profile.children.cycles-pp._copy_from_iter
> 4.06 -0.2 3.83 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> 4.23 -0.2 4.03 perf-profile.children.cycles-pp.copy_page_to_iter
> 2.06 -0.2 1.86 perf-profile.children.cycles-pp.mutex_unlock
> 52.98 -0.2 52.80 perf-profile.children.cycles-pp.write
> 2.14 -0.2 1.96 perf-profile.children.cycles-pp.x64_sys_call
> 2.18 -0.2 2.03 perf-profile.children.cycles-pp.__wake_up_sync_key
> 2.60 -0.1 2.45 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
> 3.43 -0.1 3.31 perf-profile.children.cycles-pp._copy_to_iter
> 2.14 -0.1 2.02 perf-profile.children.cycles-pp.fput
> 1.58 -0.1 1.48 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> 0.74 ± 2% -0.1 0.65 perf-profile.children.cycles-pp.testcase
> 0.77 -0.0 0.74 perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
> 0.36 -0.0 0.34 perf-profile.children.cycles-pp.__x64_sys_read
> 0.39 -0.0 0.37 perf-profile.children.cycles-pp.__x64_sys_write
> 0.14 ± 3% -0.0 0.12 ± 6% perf-profile.children.cycles-pp.make_vfsuid
> 0.26 -0.0 0.24 ± 2% perf-profile.children.cycles-pp.__wake_up_common
> 0.30 -0.0 0.28 perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
> 0.42 ± 3% +0.1 0.48 ± 7% perf-profile.children.cycles-pp.rcu_all_qs
> 50.24 +0.1 50.31 perf-profile.children.cycles-pp.read
> 1.36 ± 3% +0.1 1.44 ± 2% perf-profile.children.cycles-pp.rep_movs_alternative
> 1.27 +0.1 1.41 perf-profile.children.cycles-pp.__cond_resched
> 0.60 ± 6% +0.3 0.86 ± 2% perf-profile.children.cycles-pp.anon_pipe_buf_release
> 0.00 +0.5 0.54 ± 8% perf-profile.children.cycles-pp.set_normalized_timespec64
> 0.46 ± 10% +0.7 1.11 ± 3%
perf-profile.children.cycles-pp.timestamp_truncate
> 0.00 +0.9 0.94 ± 2% perf-profile.children.cycles-pp.ktime_get_coarse_with_offset
> 0.00 +1.0 0.98 ± 7% perf-profile.children.cycles-pp.ktime_get_coarse_ts64
> 0.00 +1.0 1.04 perf-profile.children.cycles-pp.ns_to_timespec64
> 30.73 +1.1 31.81 perf-profile.children.cycles-pp.ksys_write
> 25.42 +1.2 26.66 perf-profile.children.cycles-pp.vfs_write
> 27.96 +1.4 29.32 perf-profile.children.cycles-pp.ksys_read
> 22.87 +1.5 24.37 perf-profile.children.cycles-pp.vfs_read
> 17.63 +1.6 19.19 perf-profile.children.cycles-pp.pipe_write
> 68.90 +1.8 70.69 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> 67.06 +1.9 68.95 perf-profile.children.cycles-pp.do_syscall_64
> 15.15 +2.1 17.30 perf-profile.children.cycles-pp.pipe_read
> 4.10 +2.3 6.37 perf-profile.children.cycles-pp.touch_atime
> 3.73 +2.3 6.02 perf-profile.children.cycles-pp.atime_needs_update
> 1.90 ± 8% +2.3 4.20 ± 4% perf-profile.children.cycles-pp.inode_needs_update_time
> 2.13 ± 7% +2.3 4.48 ± 3% perf-profile.children.cycles-pp.file_update_time
> 0.00 +2.5 2.52 ± 5% perf-profile.children.cycles-pp.coarse_ctime
> 1.76 +6.0 7.77 ± 2% perf-profile.children.cycles-pp.current_time
> 13.63 -0.7 12.89 perf-profile.self.cycles-pp.clear_bhb_loop
> 1.07 ± 5% -0.4 0.67 perf-profile.self.cycles-pp.inode_needs_update_time
> 3.81 -0.3 3.46 perf-profile.self.cycles-pp._copy_from_iter
> 4.10 ± 2% -0.3 3.80 ± 2% perf-profile.self.cycles-pp.vfs_read
> 3.58 -0.2 3.34 perf-profile.self.cycles-pp.pipe_read
> 3.35 -0.2 3.11 perf-profile.self.cycles-pp.read
> 3.92 -0.2 3.70 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> 2.90 -0.2 2.70 perf-profile.self.cycles-pp.do_syscall_64
> 1.97 -0.2 1.78 perf-profile.self.cycles-pp.mutex_unlock
> 3.00 -0.2 2.81 perf-profile.self.cycles-pp.entry_SYSCALL_64
> 2.90 -0.2 2.74 perf-profile.self.cycles-pp._copy_to_iter
> 1.71 -0.2 1.54 perf-profile.self.cycles-pp.atime_needs_update
> 2.01 -0.2 1.85
perf-profile.self.cycles-pp.x64_sys_call
> 3.39 -0.1 3.25 perf-profile.self.cycles-pp.write
> 1.32 ± 2% -0.1 1.18 perf-profile.self.cycles-pp.copy_page_from_iter
> 1.87 -0.1 1.76 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> 2.15 -0.1 2.05 perf-profile.self.cycles-pp.mutex_lock
> 1.66 -0.1 1.56 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
> 1.89 -0.1 1.79 perf-profile.self.cycles-pp.fput
> 0.61 -0.1 0.51 perf-profile.self.cycles-pp.testcase
> 1.52 -0.1 1.43 perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> 0.81 ± 4% -0.1 0.72 ± 2% perf-profile.self.cycles-pp.copy_page_to_iter
> 1.16 -0.1 1.11 perf-profile.self.cycles-pp.ksys_write
> 1.06 -0.0 1.02 perf-profile.self.cycles-pp.ksys_read
> 0.36 -0.0 0.32 perf-profile.self.cycles-pp.__wake_up_sync_key
> 0.37 -0.0 0.34 ± 2% perf-profile.self.cycles-pp.touch_atime
> 0.76 -0.0 0.73 perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
> 0.26 -0.0 0.24 perf-profile.self.cycles-pp.__wake_up_common
> 0.27 -0.0 0.26 perf-profile.self.cycles-pp.__x64_sys_read
> 0.27 -0.0 0.26 perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
> 0.12 ± 3% -0.0 0.11 ± 3% perf-profile.self.cycles-pp.make_vfsuid
> 0.29 -0.0 0.28 perf-profile.self.cycles-pp.__x64_sys_write
> 0.26 +0.0 0.30 perf-profile.self.cycles-pp.file_update_time
> 0.30 ± 4% +0.1 0.37 ± 9% perf-profile.self.cycles-pp.rcu_all_qs
> 0.85 +0.1 0.92 perf-profile.self.cycles-pp.__cond_resched
> 0.89 ± 5% +0.1 0.99 ± 4% perf-profile.self.cycles-pp.rep_movs_alternative
> 0.56 ± 6% +0.3 0.82 ± 2% perf-profile.self.cycles-pp.anon_pipe_buf_release
> 0.00 +0.5 0.52 ± 7% perf-profile.self.cycles-pp.set_normalized_timespec64
> 0.43 ± 10% +0.6 1.03 ± 3% perf-profile.self.cycles-pp.timestamp_truncate
> 0.00 +0.9 0.86 perf-profile.self.cycles-pp.ns_to_timespec64
> 0.00 +0.9 0.90 ± 2% perf-profile.self.cycles-pp.ktime_get_coarse_with_offset
> 0.00 +0.9 0.94 ± 8% perf-profile.self.cycles-pp.ktime_get_coarse_ts64
> 0.00 +1.0 1.02 ± 4%
perf-profile.self.cycles-pp.coarse_ctime
> 1.06 ± 4% +1.3 2.38 ± 3% perf-profile.self.cycles-pp.current_time
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>

-- 
Jeff Layton <jlayton@xxxxxxxxxx>