[linus:master] [filelock] c69ff40719: stress-ng.dup.ops_per_sec 1.9% improvement

Hello,

kernel test robot noticed a 1.9% improvement of stress-ng.dup.ops_per_sec on:


commit: c69ff4071935f946f1cddc59e1d36a03442ed015 ("filelock: split leases out of struct file_lock")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 100%
	disk: 1HDD
	testtime: 60s
	fs: ext4
	test: dup
	cpufreq_governor: performance
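
For readers unfamiliar with the kind of load behind "test: dup", here is a rough Python sketch of one iteration: a forked child duplicates a descriptor many times and exits without closing anything, so the kernel tears the descriptors down at exit. This is an illustration only, not stress-ng's implementation; the function name dup_in_child, the use of /dev/zero, and the count of 256 are assumptions chosen for the example.

```python
import os


def dup_in_child(n_dups: int = 256) -> int:
    """Fork a child that dup()s one descriptor n_dups times and then exits
    without closing anything. Returns the child's exit code (0 on success)."""
    pid = os.fork()
    if pid == 0:
        # Child process: report failure unless every dup() succeeded.
        code = 1
        try:
            fd = os.open("/dev/zero", os.O_RDONLY)  # assumed target file
            for _ in range(n_dups):
                os.dup(fd)
            code = 0
        finally:
            # _exit() skips Python cleanup; the kernel closes all
            # descriptors as part of process exit.
            os._exit(code)
    # Parent process: reap the child and return its exit code.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Calling dup_in_child() in a loop from many processes would give a fork/dup/exit pattern loosely similar to the one in the profile further down, where do_exit, put_files_struct, filp_close and locks_remove_posix dominate.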






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240403/202404031033.c2d3b356-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/1HDD/ext4/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/dup/stress-ng/60s

commit: 
  282c30f320 ("filelock: remove temporary compatibility macros")
  c69ff40719 ("filelock: split leases out of struct file_lock")

282c30f320ba2579 c69ff4071935f946f1cddc59e1d 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    195388            +2.0%     199324        vmstat.system.cs
   1502041            +1.9%    1531046        stress-ng.dup.ops
     25032            +1.9%      25516        stress-ng.dup.ops_per_sec
      2020            -1.9%       1982        stress-ng.time.system_time
    176.48           +11.1%     196.06        stress-ng.time.user_time
   3992532            +1.8%    4063489        stress-ng.time.voluntary_context_switches
 1.949e+10            +2.3%  1.994e+10        perf-stat.i.branch-instructions
      1.51            -3.2%       1.46        perf-stat.i.cpi
 9.495e+10            +2.3%  9.711e+10        perf-stat.i.instructions
      0.67            +3.6%       0.70        perf-stat.i.ipc
      1.51            -3.4%       1.46        perf-stat.overall.cpi
      0.66            +3.5%       0.69        perf-stat.overall.ipc
    198601            +1.9%     202371        perf-stat.ps.context-switches
     16.89            -3.1       13.75        perf-profile.calltrace.cycles-pp.filp_flush.filp_close.put_files_struct.do_exit.do_group_exit
     24.02            -2.8       21.19        perf-profile.calltrace.cycles-pp.filp_close.put_files_struct.do_exit.do_group_exit.__x64_sys_exit_group
     12.92            -2.7       10.25        perf-profile.calltrace.cycles-pp.locks_remove_posix.filp_flush.filp_close.put_files_struct.do_exit
     33.85            -2.5       31.32        perf-profile.calltrace.cycles-pp.put_files_struct.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
     53.34            -1.8       51.51        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     53.34            -1.8       51.51        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     53.31            -1.8       51.47        perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
     53.31            -1.8       51.47        perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
     54.16            -1.8       52.34        perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.62            +0.0        0.64        perf-profile.calltrace.cycles-pp.do_dentry_open.do_open.path_openat.do_filp_open.do_sys_openat2
      0.74            +0.0        0.77        perf-profile.calltrace.cycles-pp.acct_collect.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
      0.86            +0.0        0.89        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe._exit
      0.86            +0.0        0.89        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe._exit
      0.60            +0.0        0.63        perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.anon_vma_clone
      0.86            +0.0        0.88        perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe._exit
      0.86            +0.0        0.89        perf-profile.calltrace.cycles-pp._exit
      0.68            +0.0        0.72        perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat
      0.82            +0.0        0.86        perf-profile.calltrace.cycles-pp.up_write.free_pgtables.exit_mmap.__mmput.exit_mm
      0.70 ±  2%      +0.0        0.74        perf-profile.calltrace.cycles-pp.kmem_cache_alloc.anon_vma_clone.anon_vma_fork.dup_mmap.dup_mm
      0.81            +0.0        0.85        perf-profile.calltrace.cycles-pp.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.59            +0.0        0.63 ±  5%  perf-profile.calltrace.cycles-pp.__libc_fork
      0.85            +0.0        0.89        perf-profile.calltrace.cycles-pp.__do_sys_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe.wait4
      0.85            +0.0        0.88        perf-profile.calltrace.cycles-pp.kernel_wait4.__do_sys_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe.wait4
      0.65            +0.0        0.69        perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.anon_vma_fork
      1.09            +0.0        1.14        perf-profile.calltrace.cycles-pp.kmem_cache_free.exit_mmap.__mmput.exit_mm.do_exit
      0.95            +0.0        1.00        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.wait4
      0.96            +0.0        1.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.wait4
      0.94            +0.0        0.98        perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.anon_vma_clone.anon_vma_fork
      0.98            +0.0        1.02        perf-profile.calltrace.cycles-pp.wait4
      0.55            +0.0        0.60 ±  7%  perf-profile.calltrace.cycles-pp.stress_dup
      0.98            +0.0        1.03        perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write.anon_vma_clone.anon_vma_fork.dup_mmap
      1.44            +0.0        1.48        perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.42            +0.0        1.47        perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64
      1.86            +0.1        1.91        perf-profile.calltrace.cycles-pp.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      1.85            +0.1        1.90        perf-profile.calltrace.cycles-pp.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      1.18            +0.1        1.24        perf-profile.calltrace.cycles-pp.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe.__clone
      1.18            +0.1        1.24        perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe.__clone
      1.18            +0.1        1.24        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__clone
      1.18            +0.1        1.24        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__clone
      1.51            +0.1        1.57        perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      1.74            +0.1        1.80        perf-profile.calltrace.cycles-pp.kmem_cache_alloc.vm_area_dup.dup_mmap.dup_mm.copy_process
      1.60            +0.1        1.67        perf-profile.calltrace.cycles-pp.kmem_cache_free.unlink_anon_vmas.free_pgtables.exit_mmap.__mmput
      1.83            +0.1        1.89        perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
      1.86            +0.1        1.93        perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
      1.82            +0.1        1.89        perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      1.52            +0.1        1.59        perf-profile.calltrace.cycles-pp.__clone
      1.82            +0.1        1.89        perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      1.28            +0.1        1.36        perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write.anon_vma_fork.dup_mmap.dup_mm
      1.36            +0.1        1.44        perf-profile.calltrace.cycles-pp.down_write.anon_vma_fork.dup_mmap.dup_mm.copy_process
      1.22            +0.1        1.30        perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.anon_vma_fork.dup_mmap
      2.38            +0.1        2.48        perf-profile.calltrace.cycles-pp.vm_area_dup.dup_mmap.dup_mm.copy_process.kernel_clone
      3.34            +0.2        3.51        perf-profile.calltrace.cycles-pp.anon_vma_interval_tree_insert.anon_vma_clone.anon_vma_fork.dup_mmap.dup_mm
      5.14            +0.2        5.37        perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.exit_mmap.__mmput.exit_mm
      6.76            +0.3        7.04        perf-profile.calltrace.cycles-pp.anon_vma_clone.anon_vma_fork.dup_mmap.dup_mm.copy_process
      5.87            +0.3        6.16        perf-profile.calltrace.cycles-pp.fput.filp_close.put_files_struct.do_exit.do_group_exit
      7.49            +0.3        7.82        perf-profile.calltrace.cycles-pp.free_pgtables.exit_mmap.__mmput.exit_mm.do_exit
      9.30            +0.4        9.69        perf-profile.calltrace.cycles-pp.anon_vma_fork.dup_mmap.dup_mm.copy_process.kernel_clone
      0.10 ±200%      +0.4        0.52        perf-profile.calltrace.cycles-pp.fifo_open.do_dentry_open.do_open.path_openat.do_filp_open
      7.40            +0.5        7.86        perf-profile.calltrace.cycles-pp.dup_fd.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
      0.05 ±299%      +0.5        0.52 ±  2%  perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.anon_vma_fork
     17.56            +0.6       18.18        perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
     17.62            +0.6       18.24        perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
     17.63            +0.6       18.25        perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
     19.31            +0.7       20.03        perf-profile.calltrace.cycles-pp.dup_mmap.dup_mm.copy_process.kernel_clone.__do_sys_clone
     19.75            +0.7       20.48        perf-profile.calltrace.cycles-pp.dup_mm.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
     28.58            +1.3       29.84        perf-profile.calltrace.cycles-pp.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
     28.58            +1.3       29.84        perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
     28.59            +1.3       29.85        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
     28.59            +1.3       29.86        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe._Fork
     29.12            +1.3       30.41        perf-profile.calltrace.cycles-pp._Fork
     29.02            +1.3       30.31        perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe
     17.87            -3.2       14.72        perf-profile.children.cycles-pp.filp_flush
     25.00            -2.9       22.12        perf-profile.children.cycles-pp.filp_close
     13.27            -2.7       10.58        perf-profile.children.cycles-pp.locks_remove_posix
     34.49            -2.5       31.99        perf-profile.children.cycles-pp.put_files_struct
     54.16            -1.8       52.36        perf-profile.children.cycles-pp.__x64_sys_exit_group
     54.16            -1.8       52.35        perf-profile.children.cycles-pp.do_exit
     54.17            -1.8       52.36        perf-profile.children.cycles-pp.do_group_exit
     88.90            -0.4       88.52        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     88.88            -0.4       88.50        perf-profile.children.cycles-pp.do_syscall_64
      0.46 ±  2%      +0.0        0.48        perf-profile.children.cycles-pp.asm_sysvec_call_function_single
      0.24 ±  2%      +0.0        0.26 ±  4%  perf-profile.children.cycles-pp.memcg_account_kmem
      0.51            +0.0        0.53        perf-profile.children.cycles-pp.find_idlest_cpu
      0.61            +0.0        0.63        perf-profile.children.cycles-pp.irq_exit_rcu
      0.66            +0.0        0.69        perf-profile.children.cycles-pp.mas_next_slot
      0.44            +0.0        0.46        perf-profile.children.cycles-pp.___slab_alloc
      0.36            +0.0        0.38        perf-profile.children.cycles-pp.mm_init
      0.75            +0.0        0.78        perf-profile.children.cycles-pp.acct_collect
      0.47            +0.0        0.49        perf-profile.children.cycles-pp.dup_userfaultfd
      0.49            +0.0        0.52        perf-profile.children.cycles-pp.fifo_open
      0.55            +0.0        0.58        perf-profile.children.cycles-pp.lock_vma_under_rcu
      0.78            +0.0        0.80        perf-profile.children.cycles-pp.rcu_do_batch
      0.18 ±  3%      +0.0        0.21 ± 17%  perf-profile.children.cycles-pp.process_one_work
      0.70            +0.0        0.73        perf-profile.children.cycles-pp.wake_up_new_task
      0.84            +0.0        0.86        perf-profile.children.cycles-pp.mas_find
      0.87            +0.0        0.90        perf-profile.children.cycles-pp._exit
      0.62            +0.0        0.65        perf-profile.children.cycles-pp.do_dentry_open
      0.85            +0.0        0.88        perf-profile.children.cycles-pp.load_balance
      0.69            +0.0        0.72        perf-profile.children.cycles-pp.do_open
      0.44 ±  2%      +0.0        0.48        perf-profile.children.cycles-pp.__pte_offset_map_lock
      0.92            +0.0        0.95        perf-profile.children.cycles-pp.newidle_balance
      1.00            +0.0        1.03        perf-profile.children.cycles-pp.pick_next_task_fair
      0.85            +0.0        0.89        perf-profile.children.cycles-pp.__do_sys_wait4
      0.85            +0.0        0.88        perf-profile.children.cycles-pp.kernel_wait4
      0.98            +0.0        1.02        perf-profile.children.cycles-pp.wait4
      1.05 ±  2%      +0.0        1.09        perf-profile.children.cycles-pp.__vm_area_free
      0.65            +0.0        0.69 ±  5%  perf-profile.children.cycles-pp.__libc_fork
      0.97            +0.0        1.01        perf-profile.children.cycles-pp.schedule
      1.43            +0.0        1.47        perf-profile.children.cycles-pp.path_openat
      1.44            +0.0        1.49        perf-profile.children.cycles-pp.do_filp_open
      0.45 ±  2%      +0.0        0.50 ±  3%  perf-profile.children.cycles-pp.memset_orig
      0.61            +0.0        0.66 ±  8%  perf-profile.children.cycles-pp.stress_dup
      1.13            +0.1        1.18        perf-profile.children.cycles-pp.do_wait
      0.57 ±  2%      +0.1        0.62 ±  6%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      1.86            +0.1        1.92        perf-profile.children.cycles-pp.__x64_sys_openat
      1.86            +0.1        1.91        perf-profile.children.cycles-pp.do_sys_openat2
      1.06            +0.1        1.12 ±  3%  perf-profile.children.cycles-pp.ret_from_fork_asm
      1.54            +0.1        1.60        perf-profile.children.cycles-pp.cpuidle_idle_call
      1.68            +0.1        1.74        perf-profile.children.cycles-pp.__schedule
      1.83            +0.1        1.89        perf-profile.children.cycles-pp.start_secondary
      1.86            +0.1        1.93        perf-profile.children.cycles-pp.cpu_startup_entry
      1.86            +0.1        1.93        perf-profile.children.cycles-pp.secondary_startup_64_no_verify
      1.77            +0.1        1.84        perf-profile.children.cycles-pp.__slab_free
      1.85            +0.1        1.92        perf-profile.children.cycles-pp.do_idle
      1.54            +0.1        1.61        perf-profile.children.cycles-pp.__clone
      1.09 ±  2%      +0.1        1.17 ±  2%  perf-profile.children.cycles-pp.__anon_vma_interval_tree_remove
      1.23            +0.1        1.31 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock
      2.51            +0.1        2.59        perf-profile.children.cycles-pp.up_write
      2.44 ±  2%      +0.1        2.53        perf-profile.children.cycles-pp.__memcg_slab_free_hook
      2.40            +0.1        2.50        perf-profile.children.cycles-pp.vm_area_dup
      1.71            +0.1        1.81        perf-profile.children.cycles-pp.rwsem_spin_on_owner
      3.53            +0.1        3.65        perf-profile.children.cycles-pp.kmem_cache_alloc
      3.37            +0.2        3.53        perf-profile.children.cycles-pp.anon_vma_interval_tree_insert
      3.03            +0.2        3.20        perf-profile.children.cycles-pp.rwsem_optimistic_spin
      3.21            +0.2        3.38        perf-profile.children.cycles-pp.rwsem_down_write_slowpath
      4.73            +0.2        4.91        perf-profile.children.cycles-pp.kmem_cache_free
      5.16            +0.2        5.39        perf-profile.children.cycles-pp.unlink_anon_vmas
      5.23            +0.2        5.46        perf-profile.children.cycles-pp.down_write
      6.77            +0.3        7.04        perf-profile.children.cycles-pp.anon_vma_clone
      6.56            +0.3        6.84        perf-profile.children.cycles-pp.fput
      7.51            +0.3        7.84        perf-profile.children.cycles-pp.free_pgtables
      9.32            +0.4        9.71        perf-profile.children.cycles-pp.anon_vma_fork
      7.40            +0.5        7.86        perf-profile.children.cycles-pp.dup_fd
     17.58            +0.6       18.20        perf-profile.children.cycles-pp.exit_mmap
     17.67            +0.6       18.29        perf-profile.children.cycles-pp.exit_mm
     17.62            +0.6       18.24        perf-profile.children.cycles-pp.__mmput
     19.36            +0.7       20.08        perf-profile.children.cycles-pp.dup_mmap
     19.75            +0.7       20.48        perf-profile.children.cycles-pp.dup_mm
     29.18            +1.3       30.47        perf-profile.children.cycles-pp._Fork
     29.03            +1.3       30.32        perf-profile.children.cycles-pp.copy_process
     29.76            +1.3       31.08        perf-profile.children.cycles-pp.__do_sys_clone
     29.76            +1.3       31.08        perf-profile.children.cycles-pp.kernel_clone
     12.85            -2.7       10.18        perf-profile.self.cycles-pp.locks_remove_posix
      3.16            -0.4        2.80        perf-profile.self.cycles-pp.filp_flush
      0.72            +0.0        0.74        perf-profile.self.cycles-pp.kmem_cache_alloc
      0.50            +0.0        0.53        perf-profile.self.cycles-pp.kmem_cache_free
      0.81            +0.0        0.85        perf-profile.self.cycles-pp._raw_spin_lock
      0.44 ±  2%      +0.0        0.49 ±  3%  perf-profile.self.cycles-pp.memset_orig
      0.57 ±  2%      +0.1        0.62 ±  6%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      1.70            +0.1        1.76        perf-profile.self.cycles-pp.__slab_free
      2.46            +0.1        2.54        perf-profile.self.cycles-pp.up_write
      1.70            +0.1        1.79        perf-profile.self.cycles-pp.rwsem_spin_on_owner
      3.33            +0.2        3.50        perf-profile.self.cycles-pp.anon_vma_interval_tree_insert
      6.13            +0.3        6.44        perf-profile.self.cycles-pp.fput
      7.02            +0.4        7.43        perf-profile.self.cycles-pp.dup_fd
      7.28            +0.6        7.85        perf-profile.self.cycles-pp.put_files_struct
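
The middle column of the table above can be rechecked from the two outer columns. The values shown are consistent with a relative change (in percent) for the stress-ng, vmstat and perf-stat rows, and with an absolute difference in percentage points for the perf-profile rows. A minimal sketch, with hypothetical helper names pct_change and point_delta; because the table prints rounded means, a recomputed value may differ in the last digit:

```python
def pct_change(base: float, new: float) -> float:
    """Relative change in percent, as in the stress-ng and perf-stat rows."""
    return (new - base) / base * 100.0


def point_delta(base_pct: float, new_pct: float) -> float:
    """Absolute difference in percentage points, as in the perf-profile rows."""
    return new_pct - base_pct


# stress-ng.dup.ops: 1502041 -> 1531046
ops_change = round(pct_change(1502041, 1531046), 1)

# stress-ng.time.user_time: 176.48 -> 196.06
user_time_change = round(pct_change(176.48, 196.06), 1)

# perf-profile.children.cycles-pp.locks_remove_posix: 13.27 -> 10.58
locks_delta = round(point_delta(13.27, 10.58), 1)

print(ops_change, user_time_change, locks_delta)
```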




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki




