Hello,

kernel test robot noticed a 15.4% regression of filebench.sum_operations/s on:


commit: 7334c4df7a384b31f30a61adb60243a8614f8ff0 ("[PATCH 1/3] nfsd: add TIME_DELEG_ACCESS and TIME_DELEG_MODIFY to writeable attrs")
url: https://github.com/intel-lab-lkp/linux/commits/Jeff-Layton/nfsd-add-TIME_DELEG_ACCESS-and-TIME_DELEG_MODIFY-to-writeable-attrs/20241019-024741
patch link: https://lore.kernel.org/all/20241018-delstid-v1-1-c6021b75ff3e@xxxxxxxxxx/
patch subject: [PATCH 1/3] nfsd: add TIME_DELEG_ACCESS and TIME_DELEG_MODIFY to writeable attrs

testcase: filebench
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
parameters:

	disk: 1HDD
	fs: ext4
	fs2: nfsv4
	test: webproxy.f
	cpufreq_governor: performance


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202410281526.4971befc-oliver.sang@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241028/202410281526.4971befc-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:
  gcc-12/performance/1HDD/nfsv4/ext4/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-icl-2sp6/webproxy.f/filebench

commit:
  0f8b1a4184 ("lockd: Remove unneeded initialization of file_lock::c.flc_flags")
  7334c4df7a ("nfsd: add TIME_DELEG_ACCESS and TIME_DELEG_MODIFY to writeable attrs")

0f8b1a41842544ec 7334c4df7a384b31f30a61adb60
---------------- ---------------------------
         %stddev    %change          %stddev
            \          |                \
      1.19 ±181%     -83.9%       0.19 ±  6%  perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      2514 ± 95%     -88.8%     280.49 ±168%  sched_debug.cpu.max_idle_balance_cost.stddev
      2152            -3.4%       2080        vmstat.system.cs
      2189            +7.3%       2349        vmstat.system.in
     25682 ± 41%     +59.9%      41078 ±  4%  numa-meminfo.node0.Shmem
     20709 ± 91%     -82.5%       3615 ±111%  numa-meminfo.node1.Mapped
     46372 ± 22%     -29.8%      32541 ±  5%  numa-meminfo.node1.Shmem
      6419 ± 41%     +60.0%      10272 ±  4%  numa-vmstat.node0.nr_shmem
      5241 ± 90%     -81.1%     988.55 ±104%  numa-vmstat.node1.nr_mapped
     11592 ± 22%     -29.8%       8135 ±  5%  numa-vmstat.node1.nr_shmem
      0.70           -14.3%       0.60        filebench.sum_bytes_mb/s
      9121           -15.4%       7720        filebench.sum_operations
    152.01           -15.4%     128.66        filebench.sum_operations/s
     39.67           -16.0%      33.33        filebench.sum_reads/s
    620.37           +19.7%     742.65        filebench.sum_time_ms/op
      8.00           -12.5%       7.00        filebench.sum_writes/s
      1370            +6.3%       1456        filebench.time.elapsed_time
      1370            +6.3%       1456        filebench.time.elapsed_time.max
     42175            -2.1%      41280        filebench.time.voluntary_context_switches
  31030536            -1.2%   30643382        perf-stat.i.branch-instructions
      4.86            +0.1        4.96        perf-stat.i.branch-miss-rate%
   7534035            +4.1%    7839877        perf-stat.i.cache-references
      2139            -3.4%       2067        perf-stat.i.context-switches
 1.509e+08            -1.2%  1.491e+08        perf-stat.i.instructions
      1.33            +1.1%       1.35        perf-stat.overall.cpi
  30980974            -1.2%   30597842        perf-stat.ps.branch-instructions
   7527450            +4.1%    7833510        perf-stat.ps.cache-references
      2137            -3.3%       2066        perf-stat.ps.context-switches
 1.507e+08            -1.2%  1.489e+08        perf-stat.ps.instructions
 2.069e+11            +4.9%  2.171e+11        perf-stat.total.instructions
     27282 ±  2%      +8.6%      29623        proc-vmstat.nr_active_file
     79891            +2.6%      81962        proc-vmstat.nr_dirtied
     18010            +2.2%      18405        proc-vmstat.nr_shmem
     79856            +2.6%      81922        proc-vmstat.nr_written
     27282 ±  2%      +8.6%      29623        proc-vmstat.nr_zone_active_file
   2853705            +4.5%    2982082        proc-vmstat.numa_hit
   2721089            +4.7%    2849533        proc-vmstat.numa_local
   3437788            +3.8%    3567439        proc-vmstat.pgalloc_normal
   3344808            +5.8%    3540131        proc-vmstat.pgfault
   3343237            +3.7%    3466101        proc-vmstat.pgfree
    923426            +4.7%     966532        proc-vmstat.pgpgout
    156450            +5.7%     165442        proc-vmstat.pgreuse
      3.55 ±  7%      -0.6        2.96 ±  8%  perf-profile.children.cycles-pp.sched_balance_rq
      3.10 ±  9%      -0.5        2.64 ±  5%  perf-profile.children.cycles-pp.sched_balance_find_src_group
      0.67 ± 34%      -0.4        0.26 ± 46%  perf-profile.children.cycles-pp.__wake_up_common
      1.20 ± 11%      -0.4        0.82 ±  9%  perf-profile.children.cycles-pp.copy_process
      0.57 ± 22%      -0.3        0.30 ± 21%  perf-profile.children.cycles-pp.dup_mm
      0.65 ± 28%      -0.2        0.41 ± 38%  perf-profile.children.cycles-pp.free_pgtables
      0.02 ±141%      +0.1        0.10 ± 30%  perf-profile.children.cycles-pp.rpc_run_task
      0.02 ±223%      +0.1        0.12 ± 39%  perf-profile.children.cycles-pp.security_cred_free
      0.02 ±141%      +0.1        0.16 ± 50%  perf-profile.children.cycles-pp.tick_nohz_tick_stopped
      0.07 ± 93%      +0.2        0.23 ± 25%  perf-profile.children.cycles-pp.__evlist__disable
      0.34 ± 29%      +0.2        0.51 ± 13%  perf-profile.children.cycles-pp.ct_kernel_enter
      5.00 ±  9%      +0.7        5.73 ±  8%  perf-profile.children.cycles-pp.__irq_exit_rcu
      1.90 ±  8%      -0.3        1.61 ± 10%  perf-profile.self.cycles-pp.update_sg_lb_stats
      0.02 ±141%      +0.1        0.10 ± 46%  perf-profile.self.cycles-pp.__pte_offset_map
      0.22 ± 36%      +0.1        0.36 ± 19%  perf-profile.self.cycles-pp.ct_kernel_enter


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki