Re: [linux-next:master] [fsnotify] a5e57b4d37: stress-ng.full.ops_per_sec -17.3% regression

On Thu, Apr 11, 2024 at 4:42 AM kernel test robot <oliver.sang@xxxxxxxxx> wrote:
>
>
> hi, Amir,
>
> for "[amir73il:fsnotify-sbconn] [fsnotify]  629f30e073: unixbench.throughput 5.8% improvement"
> (https://lore.kernel.org/all/202403141505.807a722b-oliver.sang@xxxxxxxxx/)
> you requested us to test unixbench for this commit on different branches and we
> observed consistent performance improvement.
>
> now we noticed this commit is merged into linux-next/master, we still observed
> similar unixbench improvement, however, we also captured a stress-ng regression
> now. below details FYI.
>
>
>
> Hello,
>
> kernel test robot noticed a -17.3% regression of stress-ng.full.ops_per_sec on:
>
>
> commit: a5e57b4d370c6d320e5bfb0c919fe00aee29e039 ("fsnotify: optimize the case of no permission event watchers")

Odd. This commit does add an extra fsnotify_sb_has_priority_watchers()
inline check for reads and writes, but the inline helper
fsnotify_sb_has_watchers() already exists in fsnotify_parent() and
already accesses fsnotify_sb_info.

It seems like stress-ng.full does read/write/mmap operations on /dev/full,
so the fsnotify_sb_info object would be that of devtmpfs.
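
For illustration, a rough user-space sketch of that access pattern in Python.
This is not stress-ng's code; the function name and buffer size are made up
here. It only covers the open/read/pread/write part (mmap is left out), and
relies on the documented /dev/full behaviour: reads return NUL bytes and
writes fail with ENOSPC.

```python
import errno
import os


def exercise_dev_full(path="/dev/full", size=4096):
    """Hypothetical helper: open the device, read, pread, and try one write.

    Returns (read_data, pread_data, write_errno), where write_errno is 0
    if the write unexpectedly succeeded.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        data = os.read(fd, size)        # read() syscall path
        pdata = os.pread(fd, size, 0)   # pread64() syscall path
        try:
            os.write(fd, b"x")          # write() syscall path
            write_errno = 0
        except OSError as exc:
            write_errno = exc.errno     # /dev/full is documented to give ENOSPC
    finally:
        os.close(fd)
    return data, pdata, write_errno


if __name__ == "__main__":
    data, pdata, werr = exercise_dev_full()
    print(len(data), len(pdata), errno.errorcode.get(werr, werr))
```

Every iteration opens and closes the character device, so the open-time and
read/write-time hooks are both hit on each pass.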

I think that permission events on special files are not very relevant,
but I am not sure.

Jan, any ideas?

Thanks,
Amir.
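
As a sanity check on the headline figures, the reported percentages can be
recomputed from the mean values in the quoted comparison tables. A small
sketch (the helper name is made up here; the inputs are the before/after
means as printed in the report):

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (metric, mean on 477cf917dd, mean on a5e57b4d37), copied from the report
results = [
    ("stress-ng.full.ops_per_sec", 2434462, 2013444),
    ("unixbench.throughput (fsbuffer-r)", 1339661, 1425877),
    ("unixbench.throughput (fstime-r)", 4709035, 4980152),
]

for name, before, after in results:
    print(f"{name}: {pct_change(before, after):+.1f}%")
```

These come out to the -17.3%, +6.4% and +5.8% stated in the report.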



> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> testcase: stress-ng
> test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> parameters:
>
>         nr_threads: 100%
>         testtime: 60s
>         test: full
>         cpufreq_governor: performance
>
>
> In addition to that, the commit also has significant impact on the following tests:
>
> +------------------+-------------------------------------------------------------------------------------------------+
> | testcase: change | unixbench: unixbench.throughput 6.4% improvement                                                |
> | test machine     | 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory |
> | test parameters  | cpufreq_governor=performance                                                                    |
> |                  | nr_task=1                                                                                       |
> |                  | runtime=300s                                                                                    |
> |                  | test=fsbuffer-r                                                                                 |
> +------------------+-------------------------------------------------------------------------------------------------+
> | testcase: change | unixbench: unixbench.throughput 5.8% improvement                                                |
> | test machine     | 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory |
> | test parameters  | cpufreq_governor=performance                                                                    |
> |                  | nr_task=1                                                                                       |
> |                  | runtime=300s                                                                                    |
> |                  | test=fstime-r                                                                                   |
> +------------------+-------------------------------------------------------------------------------------------------+
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> | Closes: https://lore.kernel.org/oe-lkp/202404101624.85684be8-oliver.sang@xxxxxxxxx
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240410/202404101624.85684be8-oliver.sang@xxxxxxxxx
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
>   gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/full/stress-ng/60s
>
> commit:
>   477cf917dd ("fsnotify: use an enum for group priority constants")
>   a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers")
>
> 477cf917dd02853b a5e57b4d370c6d320e5bfb0c919
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>      20489 ±  7%     -19.2%      16565 ± 13%  perf-c2c.HITM.remote
>     409.48 ±  9%     -14.0%     352.13 ±  5%  sched_debug.cfs_rq:/.util_est.avg
>     217.94 ±  8%     +12.9%     246.07 ±  4%  sched_debug.cfs_rq:/.util_est.stddev
>  1.461e+08 ±  3%     -17.3%  1.208e+08 ±  5%  stress-ng.full.ops
>    2434462 ±  3%     -17.3%    2013444 ±  5%  stress-ng.full.ops_per_sec
>      71.04 ±  3%     -16.6%      59.28 ±  6%  stress-ng.time.user_time
>   9.95e+09 ±  4%     -13.4%  8.617e+09 ±  3%  perf-stat.i.branch-instructions
>       0.48 ±  3%      +0.1        0.55 ±  2%  perf-stat.i.branch-miss-rate%
>       4.36 ±  4%     +17.1%       5.10 ±  3%  perf-stat.i.cpi
>  5.162e+10 ±  4%     -14.5%  4.416e+10 ±  3%  perf-stat.i.instructions
>       0.24 ±  3%     -13.8%       0.21 ±  3%  perf-stat.i.ipc
>       0.46 ±  3%      +0.1        0.54 ±  2%  perf-stat.overall.branch-miss-rate%
>       4.38 ±  4%     +16.9%       5.12 ±  3%  perf-stat.overall.cpi
>       0.23 ±  4%     -14.5%       0.20 ±  3%  perf-stat.overall.ipc
>  9.781e+09 ±  4%     -13.4%  8.471e+09 ±  3%  perf-stat.ps.branch-instructions
>  5.075e+10 ±  4%     -14.5%  4.341e+10 ±  3%  perf-stat.ps.instructions
>  3.111e+12 ±  4%     -14.5%   2.66e+12 ±  3%  perf-stat.total.instructions
>       8.39 ą  7%      -2.8        5.56 ą  4%  perf-profile.calltrace.cycles-pp.__mmap
>       8.09 ą  7%      -2.8        5.31 ą  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__mmap
>       8.05 ą  7%      -2.8        5.28 ą  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
>       7.95 ą  7%      -2.8        5.19 ą  4%  perf-profile.calltrace.cycles-pp.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
>       6.80 ą  8%      -2.7        4.14 ą  4%  perf-profile.calltrace.cycles-pp.security_file_open.do_dentry_open.do_open.path_openat.do_filp_open
>       7.46 ą  8%      -2.7        4.80 ą  4%  perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap
>       6.78 ą  8%      -2.7        4.13 ą  4%  perf-profile.calltrace.cycles-pp.apparmor_file_open.security_file_open.do_dentry_open.do_open.path_openat
>       4.12 ą 14%      -2.0        2.09 ą 10%  perf-profile.calltrace.cycles-pp.security_mmap_file.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       3.54 ą 14%      -1.7        1.81 ą 10%  perf-profile.calltrace.cycles-pp.apparmor_mmap_file.security_mmap_file.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
>       3.46 ą  8%      -1.5        1.99 ą  6%  perf-profile.calltrace.cycles-pp.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat
>       3.15 ą  8%      -1.4        1.71 ą  7%  perf-profile.calltrace.cycles-pp.init_file.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2
>       3.06 ą  9%      -1.4        1.63 ą  7%  perf-profile.calltrace.cycles-pp.security_file_alloc.init_file.alloc_empty_file.path_openat.do_filp_open
>       2.95 ą  9%      -1.4        1.54 ą  8%  perf-profile.calltrace.cycles-pp.apparmor_file_alloc_security.security_file_alloc.init_file.alloc_empty_file.path_openat
>       5.50 ą  7%      -1.1        4.39 ą  5%  perf-profile.calltrace.cycles-pp.fstatat64
>       5.34 ą  7%      -1.1        4.26 ą  6%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fstatat64
>       5.32 ą  7%      -1.1        4.24 ą  6%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64
>       5.27 ą  8%      -1.1        4.20 ą  6%  perf-profile.calltrace.cycles-pp.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64
>       4.95 ą  8%      -1.0        3.91 ą  7%  perf-profile.calltrace.cycles-pp.vfs_fstat.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64
>       4.78 ą  8%      -1.0        3.77 ą  7%  perf-profile.calltrace.cycles-pp.security_inode_getattr.vfs_fstat.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       4.75 ą  9%      -1.0        3.74 ą  7%  perf-profile.calltrace.cycles-pp.common_perm_cond.security_inode_getattr.vfs_fstat.__do_sys_newfstatat.do_syscall_64
>       1.74 ą 12%      -0.9        0.83 ą 11%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.__x64_sys_pread64
>       1.75 ą 12%      -0.9        0.84 ą 11%  perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64
>       2.08 ą 13%      -0.9        1.17 ą  9%  perf-profile.calltrace.cycles-pp.write
>       1.78 ą 13%      -0.9        0.88 ą 13%  perf-profile.calltrace.cycles-pp.security_file_post_open.do_open.path_openat.do_filp_open.do_sys_openat2
>       1.77 ą 13%      -0.9        0.87 ą 13%  perf-profile.calltrace.cycles-pp.ima_file_check.security_file_post_open.do_open.path_openat.do_filp_open
>       1.68 ą 15%      -0.9        0.80 ą 13%  perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64
>       1.68 ą 15%      -0.9        0.80 ą 13%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read
>       1.68 ą 14%      -0.9        0.80 ą 14%  perf-profile.calltrace.cycles-pp.apparmor_current_getsecid_subj.security_current_getsecid_subj.ima_file_check.security_file_post_open.do_open
>       1.68 ą 14%      -0.9        0.81 ą 14%  perf-profile.calltrace.cycles-pp.security_current_getsecid_subj.ima_file_check.security_file_post_open.do_open.path_openat
>       1.90 ą 14%      -0.9        1.02 ą 10%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
>       1.88 ą 14%      -0.9        1.00 ą 11%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
>       1.82 ą 15%      -0.9        0.96 ą 11%  perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
>       1.77 ą 15%      -0.8        0.92 ą 11%  perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
>       1.74 ą 15%      -0.8        0.90 ą 12%  perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       1.72 ą 15%      -0.8        0.87 ą 12%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_write.ksys_write
>       1.73 ą 15%      -0.8        0.89 ą 12%  perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_write.ksys_write.do_syscall_64
>       1.32 ą  5%      -0.5        0.80 ą  5%  perf-profile.calltrace.cycles-pp.security_file_free.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       1.31 ą  5%      -0.5        0.80 ą  5%  perf-profile.calltrace.cycles-pp.apparmor_file_free_security.security_file_free.__fput.__x64_sys_close.do_syscall_64
>       2.72 ą  2%      -0.5        2.24 ą  6%  perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.68 ą  9%      -0.4        0.26 ą100%  perf-profile.calltrace.cycles-pp.kobject_put.cdev_put.__fput.__x64_sys_close.do_syscall_64
>       2.48 ą  2%      -0.4        2.07 ą  5%  perf-profile.calltrace.cycles-pp.get_unmapped_area.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
>       2.39 ą  2%      -0.4        1.99 ą  6%  perf-profile.calltrace.cycles-pp.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff
>       2.22 ą  2%      -0.4        1.84 ą  5%  perf-profile.calltrace.cycles-pp.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap.vm_mmap_pgoff
>       1.54 ą  2%      -0.3        1.27 ą  6%  perf-profile.calltrace.cycles-pp.mas_empty_area_rev.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap
>       0.91 ą  8%      -0.2        0.66 ą  6%  perf-profile.calltrace.cycles-pp.cdev_put.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       1.17 ą  3%      -0.2        0.96 ą  6%  perf-profile.calltrace.cycles-pp.mas_rev_awalk.mas_empty_area_rev.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area
>       0.64 ą  2%      -0.1        0.57 ą  4%  perf-profile.calltrace.cycles-pp.ioctl
>       2.80 ą  7%      +1.7        4.48 ą  6%  perf-profile.calltrace.cycles-pp.__libc_pread
>       2.65 ą  7%      +1.7        4.35 ą  7%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_pread
>       2.63 ą  7%      +1.7        4.33 ą  7%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
>       2.58 ą  7%      +1.7        4.29 ą  7%  perf-profile.calltrace.cycles-pp.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
>       2.79 ą  8%      +1.7        4.50 ą  7%  perf-profile.calltrace.cycles-pp.read
>       2.53 ą  8%      +1.7        4.25 ą  7%  perf-profile.calltrace.cycles-pp.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread
>       2.64 ą  9%      +1.7        4.37 ą  8%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
>       2.62 ą  9%      +1.7        4.35 ą  8%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
>       2.57 ą  9%      +1.7        4.31 ą  8%  perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
>       2.52 ą 10%      +1.7        4.27 ą  8%  perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
>       1.77 ą 12%      +1.9        3.64 ą  8%  perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       1.71 ą 15%      +1.9        3.64 ą  9%  perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.00            +2.8        2.79 ±  5%  perf-profile.calltrace.cycles-pp.fsnotify_open_perm.do_dentry_open.do_open.path_openat.do_filp_open
>       8.50 ą  7%      -2.8        5.66 ą  4%  perf-profile.children.cycles-pp.__mmap
>       7.96 ą  7%      -2.8        5.20 ą  4%  perf-profile.children.cycles-pp.ksys_mmap_pgoff
>       6.81 ą  8%      -2.7        4.14 ą  4%  perf-profile.children.cycles-pp.security_file_open
>       6.79 ą  8%      -2.7        4.14 ą  4%  perf-profile.children.cycles-pp.apparmor_file_open
>       7.48 ą  7%      -2.7        4.83 ą  4%  perf-profile.children.cycles-pp.vm_mmap_pgoff
>       5.14 ą 14%      -2.6        2.51 ą 12%  perf-profile.children.cycles-pp.apparmor_file_permission
>       5.18 ą 14%      -2.6        2.54 ą 11%  perf-profile.children.cycles-pp.security_file_permission
>       4.13 ą 14%      -2.0        2.10 ą 10%  perf-profile.children.cycles-pp.security_mmap_file
>       3.55 ą 14%      -1.7        1.81 ą 10%  perf-profile.children.cycles-pp.apparmor_mmap_file
>       3.47 ą  8%      -1.5        2.00 ą  6%  perf-profile.children.cycles-pp.alloc_empty_file
>       3.15 ą  8%      -1.4        1.72 ą  7%  perf-profile.children.cycles-pp.init_file
>       3.06 ą  9%      -1.4        1.64 ą  7%  perf-profile.children.cycles-pp.security_file_alloc
>       2.95 ą  9%      -1.4        1.55 ą  8%  perf-profile.children.cycles-pp.apparmor_file_alloc_security
>       2.18 ą 16%      -1.2        1.02 ą 14%  perf-profile.children.cycles-pp.security_current_getsecid_subj
>       2.16 ą 16%      -1.2        1.00 ą 14%  perf-profile.children.cycles-pp.apparmor_current_getsecid_subj
>       5.55 ą  7%      -1.1        4.44 ą  5%  perf-profile.children.cycles-pp.fstatat64
>       5.27 ą  8%      -1.1        4.20 ą  6%  perf-profile.children.cycles-pp.__do_sys_newfstatat
>       4.96 ą  8%      -1.0        3.92 ą  7%  perf-profile.children.cycles-pp.vfs_fstat
>       4.78 ą  8%      -1.0        3.77 ą  7%  perf-profile.children.cycles-pp.security_inode_getattr
>       4.75 ą  9%      -1.0        3.74 ą  7%  perf-profile.children.cycles-pp.common_perm_cond
>       2.16 ą 12%      -0.9        1.25 ą  8%  perf-profile.children.cycles-pp.write
>       1.78 ą 13%      -0.9        0.88 ą 13%  perf-profile.children.cycles-pp.security_file_post_open
>       1.77 ą 13%      -0.9        0.87 ą 13%  perf-profile.children.cycles-pp.ima_file_check
>       1.86 ą 14%      -0.9        1.00 ą 10%  perf-profile.children.cycles-pp.ksys_write
>       1.81 ą 15%      -0.8        0.96 ą 10%  perf-profile.children.cycles-pp.vfs_write
>       1.32 ą  5%      -0.5        0.80 ą  5%  perf-profile.children.cycles-pp.security_file_free
>       1.31 ą  5%      -0.5        0.80 ą  5%  perf-profile.children.cycles-pp.apparmor_file_free_security
>       2.73 ą  2%      -0.5        2.25 ą  6%  perf-profile.children.cycles-pp.do_mmap
>       2.50 ą  2%      -0.4        2.08 ą  6%  perf-profile.children.cycles-pp.get_unmapped_area
>       2.41 ą  2%      -0.4        2.01 ą  6%  perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown
>       2.24 ą  2%      -0.4        1.86 ą  5%  perf-profile.children.cycles-pp.vm_unmapped_area
>       0.52 ą 23%      -0.3        0.23 ą 14%  perf-profile.children.cycles-pp.ima_file_mmap
>       1.58 ą  2%      -0.3        1.31 ą  6%  perf-profile.children.cycles-pp.mas_empty_area_rev
>       0.91 ą  7%      -0.2        0.67 ą  6%  perf-profile.children.cycles-pp.cdev_put
>       0.44 ą  3%      -0.2        0.22 ą  6%  perf-profile.children.cycles-pp.__fsnotify_parent
>       1.21 ą  3%      -0.2        0.99 ą  6%  perf-profile.children.cycles-pp.mas_rev_awalk
>       0.69 ą  9%      -0.2        0.50 ą  6%  perf-profile.children.cycles-pp.kobject_put
>       1.13 ą  3%      -0.2        0.96 ą  4%  perf-profile.children.cycles-pp.read_iter_zero
>       1.09 ą  3%      -0.2        0.93 ą  4%  perf-profile.children.cycles-pp.iov_iter_zero
>       0.96 ą  2%      -0.1        0.82 ą  4%  perf-profile.children.cycles-pp.rep_stos_alternative
>       0.76 ą  3%      -0.1        0.64 ą  4%  perf-profile.children.cycles-pp.entry_SYSCALL_64
>       0.21 ą 24%      -0.1        0.11 ą 12%  perf-profile.children.cycles-pp.aa_file_perm
>       0.31 ą  7%      -0.1        0.20 ą  8%  perf-profile.children.cycles-pp.down_write_killable
>       0.75 ą  2%      -0.1        0.66 ą  4%  perf-profile.children.cycles-pp.ioctl
>       0.59 ą  2%      -0.1        0.50 ą  4%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
>       0.31 ą  9%      -0.1        0.23 ą  8%  perf-profile.children.cycles-pp.fget
>       0.52 ą  3%      -0.1        0.44 ą  5%  perf-profile.children.cycles-pp.stress_full
>       0.34            -0.1        0.27 ą  5%  perf-profile.children.cycles-pp.llseek
>       0.30 ą  3%      -0.1        0.24 ą  8%  perf-profile.children.cycles-pp.kmem_cache_free
>       0.34 ą  2%      -0.0        0.29 ą  6%  perf-profile.children.cycles-pp.mas_prev_slot
>       0.29 ą  2%      -0.0        0.24 ą  5%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
>       0.16 ą  5%      -0.0        0.11 ą  8%  perf-profile.children.cycles-pp.__legitimize_mnt
>       0.16 ą  6%      -0.0        0.12 ą 13%  perf-profile.children.cycles-pp.__memcg_slab_free_hook
>       0.07 ą  5%      -0.0        0.03 ą 81%  perf-profile.children.cycles-pp.ksys_lseek
>       0.25 ą  3%      -0.0        0.22 ą  6%  perf-profile.children.cycles-pp.mas_ascend
>       0.18            -0.0        0.15 ą  5%  perf-profile.children.cycles-pp.mas_data_end
>       0.19 ą  2%      -0.0        0.16 ą  5%  perf-profile.children.cycles-pp.syscall_return_via_sysret
>       0.11 ą  7%      -0.0        0.08 ą  8%  perf-profile.children.cycles-pp.open_last_lookups
>       0.07 ą  4%      -0.0        0.04 ą 50%  perf-profile.children.cycles-pp.mas_prev
>       0.11 ą  4%      -0.0        0.08 ą  9%  perf-profile.children.cycles-pp.__fdget_pos
>       0.07 ą  4%      -0.0        0.04 ą 51%  perf-profile.children.cycles-pp.process_measurement
>       0.06            -0.0        0.04 ą 65%  perf-profile.children.cycles-pp.vfs_getattr_nosec
>       0.06            -0.0        0.04 ą 33%  perf-profile.children.cycles-pp.amd_clear_divider
>       0.08 ą  5%      -0.0        0.06 ą  7%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
>       0.07 ą 10%      +0.0        0.10 ą 10%  perf-profile.children.cycles-pp.walk_component
>       0.35            +0.0        0.40 ą  6%  perf-profile.children.cycles-pp.link_path_walk
>      97.57            +0.4       97.94        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>      97.40            +0.4       97.80        perf-profile.children.cycles-pp.do_syscall_64
>       2.85 ą  7%      +1.7        4.53 ą  6%  perf-profile.children.cycles-pp.__libc_pread
>       2.85 ą  8%      +1.7        4.54 ą  7%  perf-profile.children.cycles-pp.read
>       2.59 ą  7%      +1.7        4.30 ą  7%  perf-profile.children.cycles-pp.__x64_sys_pread64
>       2.58 ą  9%      +1.7        4.31 ą  8%  perf-profile.children.cycles-pp.ksys_read
>       0.00            +2.8        2.80 ±  5%  perf-profile.children.cycles-pp.fsnotify_open_perm
>       5.23 ± 14%      +3.0        8.19 ±  8%  perf-profile.children.cycles-pp.rw_verify_area
>       5.06 ±  8%      +3.5        8.53 ±  7%  perf-profile.children.cycles-pp.vfs_read
>       6.77 ą  8%      -2.6        4.12 ą  4%  perf-profile.self.cycles-pp.apparmor_file_open
>       5.01 ą 14%      -2.6        2.44 ą 12%  perf-profile.self.cycles-pp.apparmor_file_permission
>       3.45 ą 13%      -1.7        1.77 ą 10%  perf-profile.self.cycles-pp.apparmor_mmap_file
>       2.93 ą  9%      -1.4        1.54 ą  8%  perf-profile.self.cycles-pp.apparmor_file_alloc_security
>       2.14 ą 16%      -1.2        0.99 ą 14%  perf-profile.self.cycles-pp.apparmor_current_getsecid_subj
>       4.74 ą  9%      -1.0        3.73 ą  7%  perf-profile.self.cycles-pp.common_perm_cond
>       1.31 ą  5%      -0.5        0.79 ą  5%  perf-profile.self.cycles-pp.apparmor_file_free_security
>       0.43 ą  3%      -0.2        0.21 ą  5%  perf-profile.self.cycles-pp.__fsnotify_parent
>       1.07 ą  3%      -0.2        0.88 ą  6%  perf-profile.self.cycles-pp.mas_rev_awalk
>       0.68 ą  9%      -0.2        0.50 ą  6%  perf-profile.self.cycles-pp.kobject_put
>       0.95 ą  2%      -0.1        0.81 ą  4%  perf-profile.self.cycles-pp.rep_stos_alternative
>       0.20 ą 25%      -0.1        0.10 ą 14%  perf-profile.self.cycles-pp.aa_file_perm
>       0.28 ą  8%      -0.1        0.18 ą  8%  perf-profile.self.cycles-pp.down_write_killable
>       0.57 ą  3%      -0.1        0.48 ą  4%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
>       0.31 ą  8%      -0.1        0.22 ą  9%  perf-profile.self.cycles-pp.fget
>       0.50 ą  3%      -0.1        0.43 ą  5%  perf-profile.self.cycles-pp.stress_full
>       0.22 ą  6%      -0.1        0.16 ą  6%  perf-profile.self.cycles-pp.cdev_put
>       0.15 ą  5%      -0.0        0.11 ą  6%  perf-profile.self.cycles-pp.__legitimize_mnt
>       0.24 ą  4%      -0.0        0.20 ą  6%  perf-profile.self.cycles-pp.mas_empty_area_rev
>       0.28 ą  3%      -0.0        0.24 ą  4%  perf-profile.self.cycles-pp.do_syscall_64
>       0.24 ą  3%      -0.0        0.20 ą  6%  perf-profile.self.cycles-pp.mas_ascend
>       0.18 ą  3%      -0.0        0.14 ą  6%  perf-profile.self.cycles-pp.do_mmap
>       0.14 ą  5%      -0.0        0.11 ą 12%  perf-profile.self.cycles-pp.chrdev_open
>       0.19 ą  2%      -0.0        0.15 ą  5%  perf-profile.self.cycles-pp.syscall_return_via_sysret
>       0.20 ą  3%      -0.0        0.17 ą  5%  perf-profile.self.cycles-pp.entry_SYSCALL_64
>       0.20 ą  4%      -0.0        0.17 ą  3%  perf-profile.self.cycles-pp.vfs_read
>       0.18 ą  2%      -0.0        0.15 ą  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
>       0.16 ą  2%      -0.0        0.13 ą  4%  perf-profile.self.cycles-pp.mas_data_end
>       0.07 ą  4%      -0.0        0.04 ą 50%  perf-profile.self.cycles-pp.process_measurement
>       0.16 ą  3%      -0.0        0.13 ą  5%  perf-profile.self.cycles-pp.vm_unmapped_area
>       0.12 ą  4%      -0.0        0.09 ą  6%  perf-profile.self.cycles-pp.mas_prev_slot
>       0.14 ą  2%      -0.0        0.12 ą  5%  perf-profile.self.cycles-pp.kmem_cache_free
>       0.10 ą  5%      -0.0        0.07 ą  6%  perf-profile.self.cycles-pp.open64
>       0.15 ą  2%      -0.0        0.13 ą  5%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
>       0.15 ą  2%      -0.0        0.13 ą  4%  perf-profile.self.cycles-pp.ioctl
>       0.09 ą  5%      -0.0        0.07 ą  8%  perf-profile.self.cycles-pp.write
>       0.07 ą  6%      -0.0        0.06        perf-profile.self.cycles-pp.__close
>       0.11 ą  4%      +0.0        0.13 ą  4%  perf-profile.self.cycles-pp.link_path_walk
>       0.01 ±200%      +0.0        0.06 ±  9%  perf-profile.self.cycles-pp.__virt_addr_valid
>       0.75 ±  2%      +0.1        0.89 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock
>       0.00            +2.8        2.79 ±  5%  perf-profile.self.cycles-pp.fsnotify_open_perm
>       0.05            +5.6        5.63 ± 10%  perf-profile.self.cycles-pp.rw_verify_area
>
>
> ***************************************************************************************************
> lkp-csl-d02: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
>   gcc-13/performance/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/lkp-csl-d02/fsbuffer-r/unixbench
>
> commit:
>   477cf917dd ("fsnotify: use an enum for group priority constants")
>   a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers")
>
> 477cf917dd02853b a5e57b4d370c6d320e5bfb0c919
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>    1339661            +6.4%    1425877        unixbench.throughput
>  5.765e+08            +6.4%  6.131e+08        unixbench.workload
>  1.159e+09            +2.2%  1.184e+09        perf-stat.i.branch-instructions
>       1.49            +0.0        1.54        perf-stat.i.branch-miss-rate%
>   10449249 ±  2%      +6.7%   11149426        perf-stat.i.branch-misses
>       4514            -5.3%       4273        perf-stat.overall.path-length
>  1.156e+09            +2.2%  1.181e+09        perf-stat.ps.branch-instructions
>   10430168 ±  2%      +6.7%   11128869        perf-stat.ps.branch-misses
>       7.02 ą  2%      -3.3        3.70 ą  3%  perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       1.45 ą  3%      +0.2        1.62 ą  3%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.read
>       1.24 ą  3%      +0.2        1.44 ą  3%  perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.filemap_read.vfs_read
>       2.55 ą  8%      +0.4        2.91 ą  4%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read
>       3.04 ą  6%      +0.4        3.44 ą  3%  perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64
>       1.94 ą  9%      +0.5        2.42 ą  3%  perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
>       8.62 ą  3%      +0.5        9.14        perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.vfs_read.ksys_read.do_syscall_64
>       7.90 ą  2%      +0.6        8.51        perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read.ksys_read
>       9.29 ą  2%      +0.8       10.04        perf-profile.calltrace.cycles-pp.copy_page_to_iter.filemap_read.vfs_read.ksys_read.do_syscall_64
>       4.43 ą  7%      +0.8        5.28 ą  2%  perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read
>      29.04 ą  3%      +1.8       30.80        perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       7.06 ą  2%      -3.3        3.73 ą  3%  perf-profile.children.cycles-pp.__fsnotify_parent
>       0.77 ą  6%      +0.1        0.88 ą  7%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
>       1.26 ą  2%      +0.2        1.45 ą  3%  perf-profile.children.cycles-pp.current_time
>       1.66 ą  3%      +0.2        1.90 ą  3%  perf-profile.children.cycles-pp.syscall_return_via_sysret
>       3.72 ą  2%      +0.3        4.03        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
>       2.56 ą  7%      +0.4        2.91 ą  4%  perf-profile.children.cycles-pp.apparmor_file_permission
>       5.72 ą  2%      +0.4        6.08        perf-profile.children.cycles-pp.entry_SYSCALL_64
>       4.40 ą  4%      +0.4        4.81 ą  2%  perf-profile.children.cycles-pp.rep_movs_alternative
>       3.10 ą  6%      +0.4        3.52 ą  3%  perf-profile.children.cycles-pp.security_file_permission
>       1.94 ą  9%      +0.5        2.42 ą  3%  perf-profile.children.cycles-pp.__fdget_pos
>       8.68 ą  3%      +0.5        9.20        perf-profile.children.cycles-pp.filemap_get_pages
>       8.37 ą  2%      +0.7        9.05        perf-profile.children.cycles-pp._copy_to_iter
>       9.52 ą  2%      +0.8       10.28        perf-profile.children.cycles-pp.copy_page_to_iter
>      29.25 ą  3%      +1.7       30.99        perf-profile.children.cycles-pp.filemap_read
>       6.94            -3.2        3.72 ą  3%  perf-profile.self.cycles-pp.__fsnotify_parent
>       0.77 ą  6%      +0.1        0.88 ą  7%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
>       0.83 ą  5%      +0.1        0.97 ą  7%  perf-profile.self.cycles-pp.current_time
>       1.66 ą  3%      +0.2        1.90 ą  3%  perf-profile.self.cycles-pp.syscall_return_via_sysret
>       3.52 ą  2%      +0.2        3.76        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
>       2.42 ą  3%      +0.3        2.67 ą  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64
>       1.92 ą  6%      +0.3        2.20 ą  5%  perf-profile.self.cycles-pp.apparmor_file_permission
>       3.92 ą  4%      +0.3        4.25 ą  2%  perf-profile.self.cycles-pp.rep_movs_alternative
>       4.38            +0.3        4.72 ą  2%  perf-profile.self.cycles-pp._copy_to_iter
>       1.16 ą  8%      +0.3        1.51 ą  2%  perf-profile.self.cycles-pp.ksys_read
>       1.85 ą 10%      +0.5        2.36 ą  2%  perf-profile.self.cycles-pp.__fdget_pos
>
>
>
> ***************************************************************************************************
> lkp-csl-d02: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
>   gcc-13/performance/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/lkp-csl-d02/fstime-r/unixbench
>
> commit:
>   477cf917dd ("fsnotify: use an enum for group priority constants")
>   a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers")
>
> 477cf917dd02853b a5e57b4d370c6d320e5bfb0c919
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>    4709035            +5.8%    4980152        unixbench.throughput
>  2.026e+09            +5.7%  2.141e+09        unixbench.workload
>  1.034e+09            +1.4%  1.048e+09        perf-stat.i.branch-instructions
>       1.56            +0.0        1.59        perf-stat.i.branch-miss-rate%
>   60950726            +5.3%   64193405        perf-stat.i.cache-references
>       0.02 ± 30%     -36.7%       0.01 ± 39%  perf-stat.i.major-faults
>       0.78            -0.0        0.75        perf-stat.overall.cache-miss-rate%
>       1145            -5.4%       1083        perf-stat.overall.path-length
>  1.031e+09            +1.4%  1.046e+09        perf-stat.ps.branch-instructions
>   60812120            +5.3%   64047513        perf-stat.ps.cache-references
>       0.02 ± 30%     -36.7%       0.01 ± 39%  perf-stat.ps.major-faults
>       6.22 ±  3%      -2.9        3.30 ±  3%  perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      49.43            -1.5       47.90        perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
>      52.39            -1.0       51.34        perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
>      55.16            -0.9       54.29        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
>      56.49            -0.7       55.80        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
>       2.40 ±  4%      +0.2        2.64 ±  5%  perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.filemap_read.vfs_read.ksys_read
>       2.59 ±  4%      +0.3        2.86 ±  5%  perf-profile.calltrace.cycles-pp.touch_atime.filemap_read.vfs_read.ksys_read.do_syscall_64
>       6.88            +0.3        7.23 ±  2%  perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.vfs_read.ksys_read
>       2.26 ±  3%      +0.4        2.64 ± 10%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read
>       7.90 ±  3%      +0.4        8.29        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read
>       2.68 ±  2%      +0.4        3.13 ±  8%  perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64
>       8.47            +0.4        8.91        perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.vfs_read.ksys_read.do_syscall_64
>      32.80            +1.8       34.63        perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       6.27 ±  3%      -2.9        3.34 ±  3%  perf-profile.children.cycles-pp.__fsnotify_parent
>      49.50            -1.4       48.07        perf-profile.children.cycles-pp.vfs_read
>      52.46            -1.0       51.45        perf-profile.children.cycles-pp.ksys_read
>       1.16 ±  4%      +0.1        1.28 ±  4%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
>       2.46 ±  4%      +0.2        2.69 ±  6%  perf-profile.children.cycles-pp.atime_needs_update
>       5.03 ±  3%      +0.3        5.30        perf-profile.children.cycles-pp.entry_SYSCALL_64
>       2.66 ±  4%      +0.3        2.94 ±  6%  perf-profile.children.cycles-pp.touch_atime
>       3.27 ±  2%      +0.3        3.59        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
>       6.96            +0.4        7.31 ±  2%  perf-profile.children.cycles-pp.filemap_get_read_batch
>       2.27 ±  3%      +0.4        2.64 ± 10%  perf-profile.children.cycles-pp.apparmor_file_permission
>       2.76 ±  2%      +0.4        3.20 ±  7%  perf-profile.children.cycles-pp.security_file_permission
>       8.52            +0.5        8.98        perf-profile.children.cycles-pp.filemap_get_pages
>      32.99            +1.8       34.80        perf-profile.children.cycles-pp.filemap_read
>       6.16 ±  3%      -2.8        3.32 ±  3%  perf-profile.self.cycles-pp.__fsnotify_parent
>       1.19 ±  3%      -0.4        0.81 ±  6%  perf-profile.self.cycles-pp.rw_verify_area
>       1.55 ±  3%      +0.1        1.64 ±  2%  perf-profile.self.cycles-pp.filemap_get_pages
>       0.70 ±  3%      +0.1        0.81 ±  7%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
>       1.31 ±  4%      +0.1        1.43 ±  4%  perf-profile.self.cycles-pp.do_syscall_64
>       2.15 ±  4%      +0.1        2.28        perf-profile.self.cycles-pp.entry_SYSCALL_64
>       4.00 ±  2%      +0.2        4.22        perf-profile.self.cycles-pp.read
>       1.06 ±  4%      +0.3        1.31 ±  5%  perf-profile.self.cycles-pp.ksys_read
>       3.09 ±  2%      +0.3        3.36        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
>       3.89 ±  2%      +0.3        4.19 ±  3%  perf-profile.self.cycles-pp._copy_to_iter
>       1.66 ±  2%      +0.3        2.01 ± 13%  perf-profile.self.cycles-pp.apparmor_file_permission
>
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>




