Hello,

kernel test robot noticed a -34.9% regression of fio.write_iops on:


commit: 4edee232ed5d0abb9f24af7af55e3a9aa271f993 ("xfs: switch to multigrain timestamps")
https://git.kernel.org/cgit/linux/kernel/git/jlayton/linux.git mgtime

testcase: fio-basic
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	runtime: 300s
	disk: 1HDD
	fs: xfs
	nr_task: 1
	test_size: 128G
	rw: write
	bs: 4k
	ioengine: falloc
	cpufreq_governor: performance


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202406141453.7a44f956-oliver.sang@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240614/202406141453.7a44f956-oliver.sang@xxxxxxxxx

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
  4k/gcc-13/performance/1HDD/xfs/falloc/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/write/lkp-icl-2sp9/128G/fio-basic

commit:
  61651220e0 ("fs: have setattr_copy handle multigrain timestamps appropriately")
  4edee232ed ("xfs: switch to multigrain timestamps")

61651220e0b91087 4edee232ed5d0abb9f24af7af55
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      0.97 ±  3%     -30.7%       0.67 ±  2%  iostat.cpu.user
 2.996e+09           +51.5%   4.54e+09        cpuidle..time
    222280 ±  4%     +44.7%     321595 ±  4%  cpuidle..usage
      0.01 ±  5%      -0.0        0.01 ±  6%  mpstat.cpu.all.irq%
      0.97 ±  3%      -0.3        0.66 ±  2%  mpstat.cpu.all.usr%
     88.86           +27.3%     113.13        uptime.boot
      5387           +28.4%       6916        uptime.idle
      2.98 ±  3%     -10.9%       2.65 ±  2%  vmstat.procs.r
      3475 ± 10%     -18.6%       2830 ±  6%  vmstat.system.cs
      4.65 ± 43%      -2.7        1.97 ±143%  perf-profile.calltrace.cycles-pp._free_event.perf_event_release_kernel.perf_release.__fput.task_work_run
      4.65 ± 43%      -2.7        1.97 ±143%  perf-profile.children.cycles-pp._free_event
      3.33 ± 76%      -2.4        0.90 ±141%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      3.33 ± 76%      -2.4        0.90 ±141%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
    769.93            +9.4%     842.10        proc-vmstat.nr_active_anon
      3936            +2.1%       4020        proc-vmstat.nr_shmem
    769.93            +9.4%     842.10        proc-vmstat.nr_zone_active_anon
    269328           +20.8%     325325 ± 11%  proc-vmstat.numa_hit
    203054 ±  2%     +27.6%     259008 ± 14%  proc-vmstat.numa_local
    297923           +16.3%     346459        proc-vmstat.pgalloc_normal
    181868 ±  2%     +30.2%     236868        proc-vmstat.pgfault
    173268 ±  3%     +27.2%     220312        proc-vmstat.pgfree
      9141 ±  7%     +23.5%      11288 ±  4%  proc-vmstat.pgreuse
      0.02 ± 26%      +0.1        0.10 ±  6%  fio.latency_10us%
     99.87            -8.4       91.43        fio.latency_2us%
      0.11 ± 20%      +8.4        8.47        fio.latency_4us%
     46.16           +53.3%      70.78        fio.time.elapsed_time
     46.16           +53.3%      70.78        fio.time.elapsed_time.max
     35.68           +66.7%      59.50        fio.time.system_time
      4940           +52.6%       7538        fio.time.voluntary_context_switches
      2857           -34.9%       1859        fio.write_bw_MBps
      1176           +64.4%       1933        fio.write_clat_90%_ns
      1200           +83.1%       2197        fio.write_clat_95%_ns
      1528           +46.6%       2240        fio.write_clat_99%_ns
      1167           +62.2%       1893        fio.write_clat_mean_ns
    731537           -34.9%     476002        fio.write_iops
      0.06 ±  6%     -25.5%       0.04 ±  5%  perf-stat.i.MPKI
      0.91 ±  3%      -0.2        0.67 ±  3%  perf-stat.i.branch-miss-rate%
  27659069 ±  3%     -28.0%   19920836 ±  4%  perf-stat.i.branch-misses
    822504 ±  5%     -25.2%     615111 ±  6%  perf-stat.i.cache-misses
   7527159 ±  6%     -26.9%    5499750 ±  3%  perf-stat.i.cache-references
      3394 ± 11%     -18.8%       2756 ±  7%  perf-stat.i.context-switches
      0.46 ±  2%     -13.0%       0.40        perf-stat.i.cpi
 5.727e+09 ±  2%     -12.3%   5.02e+09        perf-stat.i.cpu-cycles
     74.31            -3.0%      72.05        perf-stat.i.cpu-migrations
      2.31           +13.1%       2.61        perf-stat.i.ipc
      2905 ±  2%      -7.2%       2695 ±  2%  perf-stat.i.minor-faults
      2905 ±  2%      -7.2%       2695 ±  2%  perf-stat.i.page-faults
      0.07 ±  6%     -25.7%       0.05 ±  5%  perf-stat.overall.MPKI
      1.18 ±  3%      -0.3        0.87 ±  2%  perf-stat.overall.branch-miss-rate%
      0.48 ±  2%     -12.9%       0.42        perf-stat.overall.cpi
      6992 ±  6%     +17.1%       8190 ±  5%  perf-stat.overall.cycles-between-cache-misses
      2.09 ±  2%     +14.7%       2.40        perf-stat.overall.ipc
     16640           +53.3%      25504        perf-stat.overall.path-length
  27090197 ±  3%     -27.4%   19666246 ±  4%  perf-stat.ps.branch-misses
    805963 ±  5%     -24.6%     607413 ±  6%  perf-stat.ps.cache-misses
   7402971 ±  6%     -26.4%    5446622 ±  3%  perf-stat.ps.cache-references
      3329 ± 11%     -18.2%       2723 ±  7%  perf-stat.ps.context-switches
 5.616e+09 ±  2%     -11.7%  4.956e+09        perf-stat.ps.cpu-cycles
      2843 ±  2%      -6.5%       2657 ±  2%  perf-stat.ps.minor-faults
      2843 ±  2%      -6.5%       2657 ±  2%  perf-stat.ps.page-faults
 5.584e+11           +53.3%  8.558e+11        perf-stat.total.instructions




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
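
The %change column for the headline metrics can be cross-checked from the two mean values in each row. The sketch below is not part of the LKP tooling; the helper names `pct_change` and `summarize` are hypothetical, and it assumes %change is computed as (patched - base) / base. The input numbers are copied from the comparison table in this report.

```python
# Hypothetical helper, not part of lkp-tests: recompute the %change column
# from the (base, patched) means reported in the comparison table.

def pct_change(base: float, patched: float) -> float:
    """Relative change of `patched` versus `base`, in percent."""
    return (patched - base) / base * 100.0


# (base commit 61651220e0, patched commit 4edee232ed) values from the report.
ROWS = {
    "fio.write_iops": (731537, 476002),
    "fio.write_bw_MBps": (2857, 1859),
    "fio.time.elapsed_time": (46.16, 70.78),
    "perf-stat.overall.path-length": (16640, 25504),
}


def summarize(rows: dict) -> dict:
    """Map each metric name to its percent change, rounded to one decimal."""
    return {name: round(pct_change(base, patched), 1)
            for name, (base, patched) in rows.items()}


if __name__ == "__main__":
    for name, change in summarize(ROWS).items():
        print(f"{change:+7.1f}%  {name}")
```

With these inputs the recomputed changes agree with the table to one decimal place for all four metrics.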