Re: [linux-next:master] [cpuidle] 38f83090f5: fsmark.app_overhead 51.9% regression

On 10/7/24 15:43, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed a 51.9% regression of fsmark.app_overhead on:
> 
> (but there is no performance difference for fsmark.files_per_sec:
>      18.58            -0.2%      18.55        fsmark.files_per_sec
> )
> 
> 
> commit: 38f83090f515b4b5d59382dfada1e7457f19aa47 ("cpuidle: menu: Remove iowait influence")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> 
> testcase: fsmark
> test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
> parameters:
> 
> 	iterations: 1x
> 	nr_threads: 1t
> 	disk: 1HDD
> 	fs: btrfs
> 	fs2: nfsv4
> 	filesize: 4K
> 	test_size: 40M
> 	sync_method: fsyncBeforeClose
> 	nr_files_per_directory: 1fpd
> 	cpufreq_governor: performance
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add the following tags:
> | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> | Closes: https://lore.kernel.org/oe-lkp/202410072214.11d18a3c-oliver.sang@xxxxxxxxx
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20241007/202410072214.11d18a3c-oliver.sang@xxxxxxxxx
> 
> =========================================================================================
> compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
>   gcc-12/performance/1HDD/4K/nfsv4/btrfs/1x/x86_64-rhel-8.3/1fpd/1t/debian-12-x86_64-20240206.cgz/fsyncBeforeClose/lkp-icl-2sp6/40M/fsmark
> 
> commit: 
>   v6.12-rc1
>   38f83090f5 ("cpuidle: menu: Remove iowait influence")
> 
>        v6.12-rc1 38f83090f515b4b5d59382dfada 
> ---------------- --------------------------- 
>          %stddev     %change         %stddev
>              \          |                \  
>    2032015 ±  3%     +51.9%    3087623        fsmark.app_overhead
>      18.58            -0.2%      18.55        fsmark.files_per_sec
>       2944            -2.9%       2858        vmstat.system.cs
>       0.02            +0.0        0.02        mpstat.cpu.all.irq%
>       0.01 ±  2%      +0.0        0.01        mpstat.cpu.all.soft%
>       0.04 ±  2%      +0.0        0.05 ±  3%  mpstat.cpu.all.sys%
>       4.07 ± 18%     -53.4%       1.90 ± 53%  sched_debug.cfs_rq:/.removed.runnable_avg.avg
>     267.72 ± 38%     -62.7%      99.92 ± 75%  sched_debug.cfs_rq:/.removed.runnable_avg.max
>      30.08 ± 29%     -58.5%      12.50 ± 63%  sched_debug.cfs_rq:/.removed.runnable_avg.stddev
>       4.07 ± 18%     -53.5%       1.89 ± 53%  sched_debug.cfs_rq:/.removed.util_avg.avg
>     267.67 ± 38%     -62.7%      99.92 ± 75%  sched_debug.cfs_rq:/.removed.util_avg.max
>      30.08 ± 29%     -58.5%      12.49 ± 63%  sched_debug.cfs_rq:/.removed.util_avg.stddev
>      20.43 ± 17%     -25.5%      15.21 ± 16%  sched_debug.cfs_rq:/.util_est.stddev
>       7.85 ± 14%     +21.6%       9.55 ± 12%  sched_debug.cpu.clock.stddev
>       0.00 ± 25%     -47.7%       0.00 ± 44%  sched_debug.cpu.next_balance.stddev
>       0.02 ± 10%     -18.9%       0.02 ± 11%  sched_debug.cpu.nr_running.avg
>       0.14 ±  5%     -14.5%       0.12 ±  4%  sched_debug.cpu.nr_running.stddev
>       5.19            +0.6        5.79        perf-stat.i.branch-miss-rate%
>    4096977 ±  4%      +8.4%    4442600 ±  2%  perf-stat.i.branch-misses
>       1.79 ±  7%      -0.2        1.59 ±  3%  perf-stat.i.cache-miss-rate%
>   11620307           +22.2%   14202690        perf-stat.i.cache-references
>       2925            -3.2%       2830        perf-stat.i.context-switches
>       1.68           +38.6%       2.32        perf-stat.i.cpi
>  4.457e+08 ±  3%     +23.8%  5.518e+08 ±  2%  perf-stat.i.cpu-cycles
>       1630 ±  8%     +28.6%       2096 ±  4%  perf-stat.i.cycles-between-cache-misses
>       0.63           -25.5%       0.47        perf-stat.i.ipc
>       5.26            +0.2        5.48        perf-stat.overall.branch-miss-rate%
>       1.16           +18.4%       1.38        perf-stat.overall.cpi
>       0.86           -15.6%       0.73        perf-stat.overall.ipc
>    4103944 ±  4%      +7.9%    4429579        perf-stat.ps.branch-misses
>   11617199           +22.1%   14186503        perf-stat.ps.cache-references
>       2919            -3.2%       2825        perf-stat.ps.context-switches
>  4.492e+08 ±  3%     +23.2%  5.534e+08 ±  2%  perf-stat.ps.cpu-cycles
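
For context, what the commit removes is the menu governor's iowait "boost":
whenever tasks were blocked on I/O on a CPU, the governor divided its latency
tolerance by a multiplier, biasing that CPU toward shallower idle states.
Below is a rough userspace model of that heuristic, not the kernel code
itself; performance_multiplier() mirrors the shape of the removed menu.c
helper, while effective_latency_req() is a made-up wrapper for illustration.

#include <stdio.h>
#include <stdint.h>

/*
 * Rough model of the removed heuristic: each task in iowait on this
 * CPU multiplied the governor's "interactivity" pressure, shrinking
 * the exit latency it would tolerate and therefore favoring
 * shallower idle states.
 */
static unsigned int performance_multiplier(unsigned int nr_iowaiters)
{
	/* for IO wait tasks (per CPU) we add 10x each */
	return 1 + 10 * nr_iowaiters;
}

/* made-up wrapper: clamp the QoS latency limit like menu_select() did */
static uint64_t effective_latency_req(uint64_t latency_req_ns,
				      uint64_t predicted_ns,
				      unsigned int nr_iowaiters)
{
	uint64_t interactivity_req =
		predicted_ns / performance_multiplier(nr_iowaiters);

	return interactivity_req < latency_req_ns ?
	       interactivity_req : latency_req_ns;
}

int main(void)
{
	uint64_t predicted_ns = 500000;    /* 500us predicted idle */
	uint64_t latency_req_ns = 1000000; /* 1ms QoS limit */
	unsigned int iow;

	for (iow = 0; iow <= 2; iow++)
		printf("nr_iowaiters=%u -> latency limit %llu ns\n",
		       iow, (unsigned long long)effective_latency_req(
				latency_req_ns, predicted_ns, iow));
	return 0;
}

With the boost gone, the governor no longer shrinks the latency limit while
a task waits on I/O, so a single-threaded fsync-heavy run like this one is
free to enter deeper idle states between I/Os.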

The other obvious guess would be increased cache misses due to deeper idle
states flushing the caches. The reduced IPC and the extra cycles would point
that way, but the cache misses don't seem large enough to account for it IMO:
cache references rise ~22% while the miss rate actually drops slightly
(1.79% -> 1.59%), yet cycles grow ~24% and overall CPI by 18%.
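
One way to check that hypothesis would be to compare per-CPU idle-state
residency between the two kernels. A quick sketch below reads the standard
cpuidle sysfs counters for cpu0 (state*/name, state*/usage, state*/time,
with time in microseconds); if deeper states (e.g. C6) pick up the residency
the iowait boost used to keep in the shallow states, that would support the
theory.

#include <stdio.h>

/*
 * Dump idle-state name, entry count and residency for cpu0 from the
 * cpuidle sysfs interface. Run under both kernels and diff the output.
 */
int main(void)
{
	int state;

	for (state = 0; ; state++) {
		char path[128], name[32] = "";
		unsigned long long usage = 0, time_us = 0;
		FILE *f;

		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu0/cpuidle/state%d/name",
			 state);
		f = fopen(path, "r");
		if (!f)
			break;	/* no more states */
		if (fscanf(f, "%31s", name) != 1)
			name[0] = '\0';
		fclose(f);

		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu0/cpuidle/state%d/usage",
			 state);
		f = fopen(path, "r");
		if (f) {
			if (fscanf(f, "%llu", &usage) != 1)
				usage = 0;
			fclose(f);
		}

		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu0/cpuidle/state%d/time",
			 state);
		f = fopen(path, "r");
		if (f) {
			if (fscanf(f, "%llu", &time_us) != 1)
				time_us = 0;
			fclose(f);
		}

		printf("state%d %-10s usage=%llu time=%llu us\n",
		       state, name, usage, time_us);
	}
	return 0;
}

turbostat's C-state residency columns (or cpupower idle-info) before and
after the commit should show the same shift, if it is there.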



