On 10/7/24 15:43, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed a 51.9% regression of fsmark.app_overhead on:
> 
> (
> but there is no performance difference for fsmark.files_per_sec
>      18.58            -0.2%      18.55        fsmark.files_per_sec
> )
> 
> 
> commit: 38f83090f515b4b5d59382dfada1e7457f19aa47 ("cpuidle: menu: Remove iowait influence")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> 
> testcase: fsmark
> test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
> parameters:
> 
> 	iterations: 1x
> 	nr_threads: 1t
> 	disk: 1HDD
> 	fs: btrfs
> 	fs2: nfsv4
> 	filesize: 4K
> 	test_size: 40M
> 	sync_method: fsyncBeforeClose
> 	nr_files_per_directory: 1fpd
> 	cpufreq_governor: performance
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> | Closes: https://lore.kernel.org/oe-lkp/202410072214.11d18a3c-oliver.sang@xxxxxxxxx
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20241007/202410072214.11d18a3c-oliver.sang@xxxxxxxxx
> 
> =========================================================================================
> compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
>   gcc-12/performance/1HDD/4K/nfsv4/btrfs/1x/x86_64-rhel-8.3/1fpd/1t/debian-12-x86_64-20240206.cgz/fsyncBeforeClose/lkp-icl-2sp6/40M/fsmark
> 
> commit:
>   v6.12-rc1
>   38f83090f5 ("cpuidle: menu: Remove iowait influence")
> 
>        v6.12-rc1 38f83090f515b4b5d59382dfada
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>    2032015 ±  3%     +51.9%    3087623        fsmark.app_overhead
>      18.58            -0.2%      18.55        fsmark.files_per_sec
>       2944            -2.9%       2858        vmstat.system.cs
>       0.02            +0.0        0.02        mpstat.cpu.all.irq%
>       0.01 ±  2%      +0.0        0.01        mpstat.cpu.all.soft%
>       0.04 ±  2%      +0.0        0.05 ±  3%  mpstat.cpu.all.sys%
>       4.07 ± 18%     -53.4%       1.90 ± 53%  sched_debug.cfs_rq:/.removed.runnable_avg.avg
>     267.72 ± 38%     -62.7%      99.92 ± 75%  sched_debug.cfs_rq:/.removed.runnable_avg.max
>      30.08 ± 29%     -58.5%      12.50 ± 63%  sched_debug.cfs_rq:/.removed.runnable_avg.stddev
>       4.07 ± 18%     -53.5%       1.89 ± 53%  sched_debug.cfs_rq:/.removed.util_avg.avg
>     267.67 ± 38%     -62.7%      99.92 ± 75%  sched_debug.cfs_rq:/.removed.util_avg.max
>      30.08 ± 29%     -58.5%      12.49 ± 63%  sched_debug.cfs_rq:/.removed.util_avg.stddev
>      20.43 ± 17%     -25.5%      15.21 ± 16%  sched_debug.cfs_rq:/.util_est.stddev
>       7.85 ± 14%     +21.6%       9.55 ± 12%  sched_debug.cpu.clock.stddev
>       0.00 ± 25%     -47.7%       0.00 ± 44%  sched_debug.cpu.next_balance.stddev
>       0.02 ± 10%     -18.9%       0.02 ± 11%  sched_debug.cpu.nr_running.avg
>       0.14 ±  5%     -14.5%       0.12 ±  4%  sched_debug.cpu.nr_running.stddev
>       5.19            +0.6        5.79        perf-stat.i.branch-miss-rate%
>    4096977 ±  4%      +8.4%    4442600 ±  2%  perf-stat.i.branch-misses
>       1.79 ±  7%      -0.2        1.59 ±  3%  perf-stat.i.cache-miss-rate%
>   11620307           +22.2%   14202690        perf-stat.i.cache-references
>       2925            -3.2%       2830        perf-stat.i.context-switches
>       1.68           +38.6%       2.32        perf-stat.i.cpi
>  4.457e+08 ±  3%     +23.8%  5.518e+08 ±  2%  perf-stat.i.cpu-cycles
>       1630 ±  8%     +28.6%       2096 ±  4%  perf-stat.i.cycles-between-cache-misses
>       0.63           -25.5%       0.47        perf-stat.i.ipc
>       5.26            +0.2        5.48        perf-stat.overall.branch-miss-rate%
>       1.16           +18.4%       1.38        perf-stat.overall.cpi
>       0.86           -15.6%       0.73        perf-stat.overall.ipc
>    4103944 ±  4%      +7.9%    4429579        perf-stat.ps.branch-misses
>   11617199           +22.1%   14186503        perf-stat.ps.cache-references
>       2919            -3.2%       2825        perf-stat.ps.context-switches
>  4.492e+08 ±  3%     +23.2%  5.534e+08 ±  2%  perf-stat.ps.cpu-cycles

The other obvious guess would be increased cache misses due to the deeper
idle states clearing the cache. The reduced IPC and increased cycle count
would indicate that, but the cache misses don't seem to account for it IMO.
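
Quick back-of-envelope on that last point, just as a sketch: taking the
perf-stat.ps cycle and cache-reference deltas together with the .i
cache-miss-rate% values above, and assuming something like ~300 cycles
per DRAM miss (that penalty is a guess, not something measured here),
the extra misses only cover a few percent of the extra cycles:

# Back-of-envelope: can the extra cache misses explain the extra cycles?
# Values taken from the perf-stat output above; the 300 cycles/miss
# penalty is an assumption, not a measurement.

base_cycles, new_cycles = 4.492e8, 5.534e8    # perf-stat.ps.cpu-cycles
base_refs, new_refs = 11617199, 14186503      # perf-stat.ps.cache-references
base_rate, new_rate = 0.0179, 0.0159          # perf-stat.i.cache-miss-rate%

extra_cycles = new_cycles - base_cycles                      # ~1.0e8 /s
extra_misses = new_refs * new_rate - base_refs * base_rate   # ~1.8e4 /s

miss_penalty = 300                            # assumed cycles per miss
explained = extra_misses * miss_penalty       # ~5e6 cycles/s

print(f"extra cycles/s: {extra_cycles:.3g}")
print(f"extra misses/s: {extra_misses:.3g}")
print(f"explained by misses: {explained:.3g} cycles/s "
      f"(~{100 * explained / extra_cycles:.0f}% of the extra cycles)")

Even if the per-miss penalty is off by a factor of a few, that still
leaves most of the ~1e8 extra cycles/s unexplained by cache misses alone.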