On Fri, 2024-06-14 at 14:24 +0800, kernel test robot wrote:
> Hello,
> 
> kernel test robot noticed a -34.9% regression of fio.write_iops on:
> 
> commit: 4edee232ed5d0abb9f24af7af55e3a9aa271f993 ("xfs: switch to multigrain timestamps")
> https://git.kernel.org/cgit/linux/kernel/git/jlayton/linux.git mgtime
> 
> testcase: fio-basic
> test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> parameters:
> 
> 	runtime: 300s
> 	disk: 1HDD
> 	fs: xfs
> 	nr_task: 1
> 	test_size: 128G
> 	rw: write
> 	bs: 4k
> 	ioengine: falloc
> 	cpufreq_governor: performance
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> 
> Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> Closes: https://lore.kernel.org/oe-lkp/202406141453.7a44f956-oliver.sang@xxxxxxxxx
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240614/202406141453.7a44f956-oliver.sang@xxxxxxxxx
> 
> =========================================================================================
> bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
>   4k/gcc-13/performance/1HDD/xfs/falloc/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/write/lkp-icl-2sp9/128G/fio-basic
> 
> commit:
>   61651220e0 ("fs: have setattr_copy handle multigrain timestamps appropriately")
>   4edee232ed ("xfs: switch to multigrain timestamps")
> 
> 61651220e0b91087 4edee232ed5d0abb9f24af7af55
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>       0.97 ±  3%     -30.7%       0.67 ±  2%  iostat.cpu.user
>  2.996e+09           +51.5%   4.54e+09        cpuidle..time
>     222280 ±  4%     +44.7%     321595 ±  4%  cpuidle..usage
>       0.01 ±  5%      -0.0        0.01 ±  6%  mpstat.cpu.all.irq%
>       0.97 ±  3%      -0.3        0.66 ±  2%  mpstat.cpu.all.usr%
>      88.86           +27.3%     113.13        uptime.boot
>       5387           +28.4%       6916        uptime.idle
>       2.98 ±  3%     -10.9%       2.65 ±  2%  vmstat.procs.r
>       3475 ± 10%     -18.6%       2830 ±  6%  vmstat.system.cs
>       4.65 ± 43%      -2.7        1.97 ±143%  perf-profile.calltrace.cycles-pp._free_event.perf_event_release_kernel.perf_release.__fput.task_work_run
>       4.65 ± 43%      -2.7        1.97 ±143%  perf-profile.children.cycles-pp._free_event
>       3.33 ± 76%      -2.4        0.90 ±141%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
>       3.33 ± 76%      -2.4        0.90 ±141%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
>     769.93            +9.4%     842.10        proc-vmstat.nr_active_anon
>       3936            +2.1%       4020        proc-vmstat.nr_shmem
>     769.93            +9.4%     842.10        proc-vmstat.nr_zone_active_anon
>     269328           +20.8%     325325 ± 11%  proc-vmstat.numa_hit
>     203054 ±  2%     +27.6%     259008 ± 14%  proc-vmstat.numa_local
>     297923           +16.3%     346459        proc-vmstat.pgalloc_normal
>     181868 ±  2%     +30.2%     236868        proc-vmstat.pgfault
>     173268 ±  3%     +27.2%     220312        proc-vmstat.pgfree
>       9141 ±  7%     +23.5%      11288 ±  4%  proc-vmstat.pgreuse
>       0.02 ± 26%      +0.1        0.10 ±  6%  fio.latency_10us%
>      99.87            -8.4       91.43        fio.latency_2us%
>       0.11 ± 20%      +8.4        8.47        fio.latency_4us%
>      46.16           +53.3%      70.78        fio.time.elapsed_time
>      46.16           +53.3%      70.78        fio.time.elapsed_time.max
>      35.68           +66.7%      59.50        fio.time.system_time
>       4940           +52.6%       7538        fio.time.voluntary_context_switches
>       2857           -34.9%       1859        fio.write_bw_MBps
>       1176           +64.4%       1933        fio.write_clat_90%_ns
>       1200           +83.1%       2197        fio.write_clat_95%_ns
>       1528           +46.6%       2240        fio.write_clat_99%_ns
>       1167           +62.2%       1893        fio.write_clat_mean_ns
>     731537           -34.9%     476002        fio.write_iops

I've been trying for several days to reproduce this, but have been unable
to so far. Is this the same value as "write.iops" in the fio json output?
That's been my assumption, but I wanted to check that first.

That said, I'm only getting ~500k iops at best in this test with the rig
I have, so it's possible I need something faster to show it.
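For reference, this is roughly how I've been pulling the number out of
fio's --output-format=json output. It's a minimal sketch assuming the
usual jobs[0].write.iops layout, and the sample document below is a
trimmed, made-up illustration, not actual results:

```python
import json

# Trimmed, made-up sample of fio --output-format=json output.
# Real output has many more fields; values here are illustrative only.
sample = """
{
  "jobs": [
    {
      "jobname": "write-test",
      "write": {
        "iops": 731537.0,
        "bw": 2926148
      }
    }
  ]
}
"""

def write_iops(fio_json: str) -> float:
    """Return the write iops reported by the first job in fio json output."""
    data = json.loads(fio_json)
    return data["jobs"][0]["write"]["iops"]

print(write_iops(sample))  # -> 731537.0
```

With nr_task=1 there should only be a single job entry, so jobs[0] ought
to be the whole story, but let me know if fio.write_iops is derived some
other way.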
>       0.06 ±  6%     -25.5%       0.04 ±  5%  perf-stat.i.MPKI
>       0.91 ±  3%      -0.2        0.67 ±  3%  perf-stat.i.branch-miss-rate%
>   27659069 ±  3%     -28.0%   19920836 ±  4%  perf-stat.i.branch-misses
>     822504 ±  5%     -25.2%     615111 ±  6%  perf-stat.i.cache-misses
>    7527159 ±  6%     -26.9%    5499750 ±  3%  perf-stat.i.cache-references
>       3394 ± 11%     -18.8%       2756 ±  7%  perf-stat.i.context-switches
>       0.46 ±  2%     -13.0%       0.40        perf-stat.i.cpi
>  5.727e+09 ±  2%     -12.3%   5.02e+09        perf-stat.i.cpu-cycles
>      74.31            -3.0%      72.05        perf-stat.i.cpu-migrations
>       2.31           +13.1%       2.61        perf-stat.i.ipc
>       2905 ±  2%      -7.2%       2695 ±  2%  perf-stat.i.minor-faults
>       2905 ±  2%      -7.2%       2695 ±  2%  perf-stat.i.page-faults
>       0.07 ±  6%     -25.7%       0.05 ±  5%  perf-stat.overall.MPKI
>       1.18 ±  3%      -0.3        0.87 ±  2%  perf-stat.overall.branch-miss-rate%
>       0.48 ±  2%     -12.9%       0.42        perf-stat.overall.cpi
>       6992 ±  6%     +17.1%       8190 ±  5%  perf-stat.overall.cycles-between-cache-misses
>       2.09 ±  2%     +14.7%       2.40        perf-stat.overall.ipc
>      16640           +53.3%      25504        perf-stat.overall.path-length
>   27090197 ±  3%     -27.4%   19666246 ±  4%  perf-stat.ps.branch-misses
>     805963 ±  5%     -24.6%     607413 ±  6%  perf-stat.ps.cache-misses
>    7402971 ±  6%     -26.4%    5446622 ±  3%  perf-stat.ps.cache-references
>       3329 ± 11%     -18.2%       2723 ±  7%  perf-stat.ps.context-switches
>  5.616e+09 ±  2%     -11.7%  4.956e+09        perf-stat.ps.cpu-cycles
>       2843 ±  2%      -6.5%       2657 ±  2%  perf-stat.ps.minor-faults
>       2843 ±  2%      -6.5%       2657 ±  2%  perf-stat.ps.page-faults
>  5.584e+11           +53.3%  8.558e+11        perf-stat.total.instructions
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.

Thanks!
-- 
Jeff Layton <jlayton@xxxxxxxxxx>