On Thu 11-04-24 12:23:34, Amir Goldstein wrote:
> On Thu, Apr 11, 2024 at 4:42 AM kernel test robot <oliver.sang@xxxxxxxxx> wrote:
> > for "[amir73il:fsnotify-sbconn] [fsnotify] 629f30e073: unixbench.throughput 5.8% improvement"
> > (https://lore.kernel.org/all/202403141505.807a722b-oliver.sang@xxxxxxxxx/)
> > you requested us to test unixbench for this commit on different branches and we
> > observed consistent performance improvement.
> >
> > now we noticed this commit is merged into linux-next/master, we still
> > observed similar unixbench improvement, however, we also captured a
> > stress-ng regression now. below details FYI.
> >
> > Hello,
> >
> > kernel test robot noticed a -17.3% regression of stress-ng.full.ops_per_sec on:
> >
> >
> > commit: a5e57b4d370c6d320e5bfb0c919fe00aee29e039 ("fsnotify: optimize the case of no permission event watchers")
>
> Odd. This commit does add an extra fsnotify_sb_has_priority_watchers()
> inline check for reads and writes, but the inline helper
> fsnotify_sb_has_watchers()
> already exists in fsnotify_parent() and it already accesses fsnotify_sb_info.
>
> It seems like stress-ng.full does read/write/mmap operations on /dev/full,
> so the fsnotify_sb_info object would be that of devtmpfs.
>
> I think that the permission events on special files are not very relevant,
> but I am not sure.
>
> Jan, any ideas?

So I'm not 100% sure, but this load simply seems to run 'stress-ng' with all
the syscalls it is able to exercise (one per CPU if I'm right). Hum...
looking at the perf numbers I've noticed changes like:

      0.43 ±  3%      -0.2        0.21 ±  5%  perf-profile.self.cycles-pp.__fsnotify_parent
      0.00            +2.8        2.79 ±  5%  perf-profile.self.cycles-pp.fsnotify_open_perm

or

      1.77 ± 12%      +1.9        3.64 ±  8%  perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.71 ± 15%      +1.9        3.64 ±  9%  perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +2.8        2.79 ±  5%  perf-profile.calltrace.cycles-pp.fsnotify_open_perm.do_dentry_open.do_open.path_openat.do_filp_open

So the savings in __fsnotify_parent() don't really outweigh the costs in
fsnotify_file()... I can see stress-ng also exercises inotify, so maybe
there's some contention on the counters which is causing the regression now
that we have more of them?

BTW, I'm not sure how you've arrived at the conclusion that the test is using
/dev/full. As far as I can tell, e.g. the stress-mmap test is using a file in
a subdir of CWD.
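To get a rough feel for that balance, the self-cycle rows can be tallied with a few lines of Python. This is only a sketch: the regular expression, the helper names parse_row and net_change, and the choice of rows are assumptions made for illustration and are not part of the lkp tooling. The first two rows are the ones quoted above; the rw_verify_area row is taken from the perf-profile.self section of the robot's report further down.

```python
import re

# One row of an lkp perf-profile comparison (assumed layout):
#   <base> [± n%] <delta> <new> [± n%] <metric>
# Some list archives mangle "±" into "ą", so both are accepted here.
ROW = re.compile(
    r"^\s*(?P<base>\d+\.\d+)(?:\s*[±ą]\s*\d+%)?"
    r"\s+(?P<delta>[+-]\d+\.\d+)"
    r"\s+(?P<new>\d+\.\d+)(?:\s*[±ą]\s*\d+%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Return (symbol, base, new) for a perf-profile row, or None."""
    m = ROW.match(line)
    if m is None:
        return None
    # Everything after "cycles-pp." is the symbol (or call chain).
    symbol = m.group("metric").partition("cycles-pp.")[2]
    return symbol, float(m.group("base")), float(m.group("new"))


def net_change(lines):
    """Sum (new - base) over all rows that parse; others are skipped."""
    total = 0.0
    for line in lines:
        row = parse_row(line)
        if row is not None:
            _, base, new = row
            total += new - base
    return total


# Rows copied from the report; which rows to include is a judgment call.
ROWS = [
    "0.43 ±  3%  -0.2  0.21 ±  5%  perf-profile.self.cycles-pp.__fsnotify_parent",
    "0.00        +2.8  2.79 ±  5%  perf-profile.self.cycles-pp.fsnotify_open_perm",
    "0.05        +5.6  5.63 ± 10%  perf-profile.self.cycles-pp.rw_verify_area",
]

if __name__ == "__main__":
    for line in ROWS:
        symbol, base, new = parse_row(line)
        print(f"{symbol:24s} {new - base:+.2f}")
    print(f"{'net':24s} {net_change(ROWS):+.2f}")
```

For these three rows the tally is about +8.15 percentage points of profiled cycles. Treat it as a crude indicator only: the shares are relative to a profile whose total throughput also changed, and the per-row stddev is not taken into account.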

								Honza

> > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> >
> > testcase: stress-ng
> > test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
> > parameters:
> >
> > nr_threads: 100%
> > testtime: 60s
> > test: full
> > cpufreq_governor: performance
> >
> >
> > In addition to that, the commit also has significant impact on the following tests:
> >
> > +------------------+-------------------------------------------------------------------------------------------------+
> > | testcase: change | unixbench: unixbench.throughput 6.4% improvement                                                |
> > | test machine     | 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory |
> > | test parameters  | cpufreq_governor=performance                                                                    |
> > |                  | nr_task=1                                                                                       |
> > |                  | runtime=300s                                                                                    |
> > |                  | test=fsbuffer-r                                                                                 |
> > +------------------+-------------------------------------------------------------------------------------------------+
> > | testcase: change | unixbench: unixbench.throughput 5.8% improvement                                                |
> > | test machine     | 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory |
> > | test parameters  | cpufreq_governor=performance                                                                    |
> > |                  | nr_task=1                                                                                       |
> > |                  | runtime=300s                                                                                    |
> > |                  | test=fstime-r                                                                                   |
> > +------------------+-------------------------------------------------------------------------------------------------+
> >
> >
> > If you fix the issue in a separate patch/commit (i.e.
not just a new version of > > the same patch/commit), kindly add following tags > > | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx> > > | Closes: https://lore.kernel.org/oe-lkp/202404101624.85684be8-oliver.sang@xxxxxxxxx > > > > > > Details are as below: > > --------------------------------------------------------------------------------------------------> > > > > > > The kernel config and materials to reproduce are available at: > > https://download.01.org/0day-ci/archive/20240410/202404101624.85684be8-oliver.sang@xxxxxxxxx > > > > ========================================================================================= > > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime: > > gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/full/stress-ng/60s > > > > commit: > > 477cf917dd ("fsnotify: use an enum for group priority constants") > > a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers") > > > > 477cf917dd02853b a5e57b4d370c6d320e5bfb0c919 > > ---------------- --------------------------- > > %stddev %change %stddev > > \ | \ > > 20489 ą 7% -19.2% 16565 ą 13% perf-c2c.HITM.remote > > 409.48 ą 9% -14.0% 352.13 ą 5% sched_debug.cfs_rq:/.util_est.avg > > 217.94 ą 8% +12.9% 246.07 ą 4% sched_debug.cfs_rq:/.util_est.stddev > > 1.461e+08 ą 3% -17.3% 1.208e+08 ą 5% stress-ng.full.ops > > 2434462 ą 3% -17.3% 2013444 ą 5% stress-ng.full.ops_per_sec > > 71.04 ą 3% -16.6% 59.28 ą 6% stress-ng.time.user_time > > 9.95e+09 ą 4% -13.4% 8.617e+09 ą 3% perf-stat.i.branch-instructions > > 0.48 ą 3% +0.1 0.55 ą 2% perf-stat.i.branch-miss-rate% > > 4.36 ą 4% +17.1% 5.10 ą 3% perf-stat.i.cpi > > 5.162e+10 ą 4% -14.5% 4.416e+10 ą 3% perf-stat.i.instructions > > 0.24 ą 3% -13.8% 0.21 ą 3% perf-stat.i.ipc > > 0.46 ą 3% +0.1 0.54 ą 2% perf-stat.overall.branch-miss-rate% > > 4.38 ą 4% +16.9% 5.12 ą 3% perf-stat.overall.cpi > > 0.23 ą 4% -14.5% 0.20 ą 3% perf-stat.overall.ipc > > 9.781e+09 ą 4% 
-13.4% 8.471e+09 ą 3% perf-stat.ps.branch-instructions > > 5.075e+10 ą 4% -14.5% 4.341e+10 ą 3% perf-stat.ps.instructions > > 3.111e+12 ą 4% -14.5% 2.66e+12 ą 3% perf-stat.total.instructions > > 8.39 ą 7% -2.8 5.56 ą 4% perf-profile.calltrace.cycles-pp.__mmap > > 8.09 ą 7% -2.8 5.31 ą 4% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__mmap > > 8.05 ą 7% -2.8 5.28 ą 4% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap > > 7.95 ą 7% -2.8 5.19 ą 4% perf-profile.calltrace.cycles-pp.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap > > 6.80 ą 8% -2.7 4.14 ą 4% perf-profile.calltrace.cycles-pp.security_file_open.do_dentry_open.do_open.path_openat.do_filp_open > > 7.46 ą 8% -2.7 4.80 ą 4% perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap > > 6.78 ą 8% -2.7 4.13 ą 4% perf-profile.calltrace.cycles-pp.apparmor_file_open.security_file_open.do_dentry_open.do_open.path_openat > > 4.12 ą 14% -2.0 2.09 ą 10% perf-profile.calltrace.cycles-pp.security_mmap_file.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 3.54 ą 14% -1.7 1.81 ą 10% perf-profile.calltrace.cycles-pp.apparmor_mmap_file.security_mmap_file.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64 > > 3.46 ą 8% -1.5 1.99 ą 6% perf-profile.calltrace.cycles-pp.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat > > 3.15 ą 8% -1.4 1.71 ą 7% perf-profile.calltrace.cycles-pp.init_file.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2 > > 3.06 ą 9% -1.4 1.63 ą 7% perf-profile.calltrace.cycles-pp.security_file_alloc.init_file.alloc_empty_file.path_openat.do_filp_open > > 2.95 ą 9% -1.4 1.54 ą 8% perf-profile.calltrace.cycles-pp.apparmor_file_alloc_security.security_file_alloc.init_file.alloc_empty_file.path_openat > > 5.50 ą 7% -1.1 4.39 ą 5% perf-profile.calltrace.cycles-pp.fstatat64 > > 5.34 ą 7% -1.1 4.26 ą 6% 
perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.fstatat64 > > 5.32 ą 7% -1.1 4.24 ą 6% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64 > > 5.27 ą 8% -1.1 4.20 ą 6% perf-profile.calltrace.cycles-pp.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64 > > 4.95 ą 8% -1.0 3.91 ą 7% perf-profile.calltrace.cycles-pp.vfs_fstat.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe.fstatat64 > > 4.78 ą 8% -1.0 3.77 ą 7% perf-profile.calltrace.cycles-pp.security_inode_getattr.vfs_fstat.__do_sys_newfstatat.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 4.75 ą 9% -1.0 3.74 ą 7% perf-profile.calltrace.cycles-pp.common_perm_cond.security_inode_getattr.vfs_fstat.__do_sys_newfstatat.do_syscall_64 > > 1.74 ą 12% -0.9 0.83 ą 11% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.__x64_sys_pread64 > > 1.75 ą 12% -0.9 0.84 ą 11% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64 > > 2.08 ą 13% -0.9 1.17 ą 9% perf-profile.calltrace.cycles-pp.write > > 1.78 ą 13% -0.9 0.88 ą 13% perf-profile.calltrace.cycles-pp.security_file_post_open.do_open.path_openat.do_filp_open.do_sys_openat2 > > 1.77 ą 13% -0.9 0.87 ą 13% perf-profile.calltrace.cycles-pp.ima_file_check.security_file_post_open.do_open.path_openat.do_filp_open > > 1.68 ą 15% -0.9 0.80 ą 13% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64 > > 1.68 ą 15% -0.9 0.80 ą 13% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read > > 1.68 ą 14% -0.9 0.80 ą 14% perf-profile.calltrace.cycles-pp.apparmor_current_getsecid_subj.security_current_getsecid_subj.ima_file_check.security_file_post_open.do_open > > 1.68 ą 14% -0.9 0.81 ą 14% 
perf-profile.calltrace.cycles-pp.security_current_getsecid_subj.ima_file_check.security_file_post_open.do_open.path_openat > > 1.90 ą 14% -0.9 1.02 ą 10% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write > > 1.88 ą 14% -0.9 1.00 ą 11% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > > 1.82 ą 15% -0.9 0.96 ą 11% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > > 1.77 ą 15% -0.8 0.92 ą 11% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write > > 1.74 ą 15% -0.8 0.90 ą 12% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 1.72 ą 15% -0.8 0.87 ą 12% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_write.ksys_write > > 1.73 ą 15% -0.8 0.89 ą 12% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_write.ksys_write.do_syscall_64 > > 1.32 ą 5% -0.5 0.80 ą 5% perf-profile.calltrace.cycles-pp.security_file_free.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 1.31 ą 5% -0.5 0.80 ą 5% perf-profile.calltrace.cycles-pp.apparmor_file_free_security.security_file_free.__fput.__x64_sys_close.do_syscall_64 > > 2.72 ą 2% -0.5 2.24 ą 6% perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 0.68 ą 9% -0.4 0.26 ą100% perf-profile.calltrace.cycles-pp.kobject_put.cdev_put.__fput.__x64_sys_close.do_syscall_64 > > 2.48 ą 2% -0.4 2.07 ą 5% perf-profile.calltrace.cycles-pp.get_unmapped_area.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64 > > 2.39 ą 2% -0.4 1.99 ą 6% perf-profile.calltrace.cycles-pp.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff > > 2.22 ą 2% -0.4 1.84 ą 5% 
perf-profile.calltrace.cycles-pp.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap.vm_mmap_pgoff > > 1.54 ą 2% -0.3 1.27 ą 6% perf-profile.calltrace.cycles-pp.mas_empty_area_rev.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area.do_mmap > > 0.91 ą 8% -0.2 0.66 ą 6% perf-profile.calltrace.cycles-pp.cdev_put.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 1.17 ą 3% -0.2 0.96 ą 6% perf-profile.calltrace.cycles-pp.mas_rev_awalk.mas_empty_area_rev.vm_unmapped_area.arch_get_unmapped_area_topdown.get_unmapped_area > > 0.64 ą 2% -0.1 0.57 ą 4% perf-profile.calltrace.cycles-pp.ioctl > > 2.80 ą 7% +1.7 4.48 ą 6% perf-profile.calltrace.cycles-pp.__libc_pread > > 2.65 ą 7% +1.7 4.35 ą 7% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_pread > > 2.63 ą 7% +1.7 4.33 ą 7% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread > > 2.58 ą 7% +1.7 4.29 ą 7% perf-profile.calltrace.cycles-pp.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread > > 2.79 ą 8% +1.7 4.50 ą 7% perf-profile.calltrace.cycles-pp.read > > 2.53 ą 8% +1.7 4.25 ą 7% perf-profile.calltrace.cycles-pp.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pread > > 2.64 ą 9% +1.7 4.37 ą 8% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read > > 2.62 ą 9% +1.7 4.35 ą 8% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > > 2.57 ą 9% +1.7 4.31 ą 8% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > > 2.52 ą 10% +1.7 4.27 ą 8% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > > 1.77 ą 12% +1.9 3.64 ą 8% perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.__x64_sys_pread64.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 1.71 ą 15% +1.9 3.64 ą 9% 
perf-profile.calltrace.cycles-pp.rw_verify_area.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 0.00 +2.8 2.79 ą 5% perf-profile.calltrace.cycles-pp.fsnotify_open_perm.do_dentry_open.do_open.path_openat.do_filp_open > > 8.50 ą 7% -2.8 5.66 ą 4% perf-profile.children.cycles-pp.__mmap > > 7.96 ą 7% -2.8 5.20 ą 4% perf-profile.children.cycles-pp.ksys_mmap_pgoff > > 6.81 ą 8% -2.7 4.14 ą 4% perf-profile.children.cycles-pp.security_file_open > > 6.79 ą 8% -2.7 4.14 ą 4% perf-profile.children.cycles-pp.apparmor_file_open > > 7.48 ą 7% -2.7 4.83 ą 4% perf-profile.children.cycles-pp.vm_mmap_pgoff > > 5.14 ą 14% -2.6 2.51 ą 12% perf-profile.children.cycles-pp.apparmor_file_permission > > 5.18 ą 14% -2.6 2.54 ą 11% perf-profile.children.cycles-pp.security_file_permission > > 4.13 ą 14% -2.0 2.10 ą 10% perf-profile.children.cycles-pp.security_mmap_file > > 3.55 ą 14% -1.7 1.81 ą 10% perf-profile.children.cycles-pp.apparmor_mmap_file > > 3.47 ą 8% -1.5 2.00 ą 6% perf-profile.children.cycles-pp.alloc_empty_file > > 3.15 ą 8% -1.4 1.72 ą 7% perf-profile.children.cycles-pp.init_file > > 3.06 ą 9% -1.4 1.64 ą 7% perf-profile.children.cycles-pp.security_file_alloc > > 2.95 ą 9% -1.4 1.55 ą 8% perf-profile.children.cycles-pp.apparmor_file_alloc_security > > 2.18 ą 16% -1.2 1.02 ą 14% perf-profile.children.cycles-pp.security_current_getsecid_subj > > 2.16 ą 16% -1.2 1.00 ą 14% perf-profile.children.cycles-pp.apparmor_current_getsecid_subj > > 5.55 ą 7% -1.1 4.44 ą 5% perf-profile.children.cycles-pp.fstatat64 > > 5.27 ą 8% -1.1 4.20 ą 6% perf-profile.children.cycles-pp.__do_sys_newfstatat > > 4.96 ą 8% -1.0 3.92 ą 7% perf-profile.children.cycles-pp.vfs_fstat > > 4.78 ą 8% -1.0 3.77 ą 7% perf-profile.children.cycles-pp.security_inode_getattr > > 4.75 ą 9% -1.0 3.74 ą 7% perf-profile.children.cycles-pp.common_perm_cond > > 2.16 ą 12% -0.9 1.25 ą 8% perf-profile.children.cycles-pp.write > > 1.78 ą 13% -0.9 0.88 ą 13% 
perf-profile.children.cycles-pp.security_file_post_open > > 1.77 ą 13% -0.9 0.87 ą 13% perf-profile.children.cycles-pp.ima_file_check > > 1.86 ą 14% -0.9 1.00 ą 10% perf-profile.children.cycles-pp.ksys_write > > 1.81 ą 15% -0.8 0.96 ą 10% perf-profile.children.cycles-pp.vfs_write > > 1.32 ą 5% -0.5 0.80 ą 5% perf-profile.children.cycles-pp.security_file_free > > 1.31 ą 5% -0.5 0.80 ą 5% perf-profile.children.cycles-pp.apparmor_file_free_security > > 2.73 ą 2% -0.5 2.25 ą 6% perf-profile.children.cycles-pp.do_mmap > > 2.50 ą 2% -0.4 2.08 ą 6% perf-profile.children.cycles-pp.get_unmapped_area > > 2.41 ą 2% -0.4 2.01 ą 6% perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown > > 2.24 ą 2% -0.4 1.86 ą 5% perf-profile.children.cycles-pp.vm_unmapped_area > > 0.52 ą 23% -0.3 0.23 ą 14% perf-profile.children.cycles-pp.ima_file_mmap > > 1.58 ą 2% -0.3 1.31 ą 6% perf-profile.children.cycles-pp.mas_empty_area_rev > > 0.91 ą 7% -0.2 0.67 ą 6% perf-profile.children.cycles-pp.cdev_put > > 0.44 ą 3% -0.2 0.22 ą 6% perf-profile.children.cycles-pp.__fsnotify_parent > > 1.21 ą 3% -0.2 0.99 ą 6% perf-profile.children.cycles-pp.mas_rev_awalk > > 0.69 ą 9% -0.2 0.50 ą 6% perf-profile.children.cycles-pp.kobject_put > > 1.13 ą 3% -0.2 0.96 ą 4% perf-profile.children.cycles-pp.read_iter_zero > > 1.09 ą 3% -0.2 0.93 ą 4% perf-profile.children.cycles-pp.iov_iter_zero > > 0.96 ą 2% -0.1 0.82 ą 4% perf-profile.children.cycles-pp.rep_stos_alternative > > 0.76 ą 3% -0.1 0.64 ą 4% perf-profile.children.cycles-pp.entry_SYSCALL_64 > > 0.21 ą 24% -0.1 0.11 ą 12% perf-profile.children.cycles-pp.aa_file_perm > > 0.31 ą 7% -0.1 0.20 ą 8% perf-profile.children.cycles-pp.down_write_killable > > 0.75 ą 2% -0.1 0.66 ą 4% perf-profile.children.cycles-pp.ioctl > > 0.59 ą 2% -0.1 0.50 ą 4% perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack > > 0.31 ą 9% -0.1 0.23 ą 8% perf-profile.children.cycles-pp.fget > > 0.52 ą 3% -0.1 0.44 ą 5% perf-profile.children.cycles-pp.stress_full > > 0.34 -0.1 
0.27 ą 5% perf-profile.children.cycles-pp.llseek > > 0.30 ą 3% -0.1 0.24 ą 8% perf-profile.children.cycles-pp.kmem_cache_free > > 0.34 ą 2% -0.0 0.29 ą 6% perf-profile.children.cycles-pp.mas_prev_slot > > 0.29 ą 2% -0.0 0.24 ą 5% perf-profile.children.cycles-pp.syscall_exit_to_user_mode > > 0.16 ą 5% -0.0 0.11 ą 8% perf-profile.children.cycles-pp.__legitimize_mnt > > 0.16 ą 6% -0.0 0.12 ą 13% perf-profile.children.cycles-pp.__memcg_slab_free_hook > > 0.07 ą 5% -0.0 0.03 ą 81% perf-profile.children.cycles-pp.ksys_lseek > > 0.25 ą 3% -0.0 0.22 ą 6% perf-profile.children.cycles-pp.mas_ascend > > 0.18 -0.0 0.15 ą 5% perf-profile.children.cycles-pp.mas_data_end > > 0.19 ą 2% -0.0 0.16 ą 5% perf-profile.children.cycles-pp.syscall_return_via_sysret > > 0.11 ą 7% -0.0 0.08 ą 8% perf-profile.children.cycles-pp.open_last_lookups > > 0.07 ą 4% -0.0 0.04 ą 50% perf-profile.children.cycles-pp.mas_prev > > 0.11 ą 4% -0.0 0.08 ą 9% perf-profile.children.cycles-pp.__fdget_pos > > 0.07 ą 4% -0.0 0.04 ą 51% perf-profile.children.cycles-pp.process_measurement > > 0.06 -0.0 0.04 ą 65% perf-profile.children.cycles-pp.vfs_getattr_nosec > > 0.06 -0.0 0.04 ą 33% perf-profile.children.cycles-pp.amd_clear_divider > > 0.08 ą 5% -0.0 0.06 ą 7% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack > > 0.07 ą 10% +0.0 0.10 ą 10% perf-profile.children.cycles-pp.walk_component > > 0.35 +0.0 0.40 ą 6% perf-profile.children.cycles-pp.link_path_walk > > 97.57 +0.4 97.94 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe > > 97.40 +0.4 97.80 perf-profile.children.cycles-pp.do_syscall_64 > > 2.85 ą 7% +1.7 4.53 ą 6% perf-profile.children.cycles-pp.__libc_pread > > 2.85 ą 8% +1.7 4.54 ą 7% perf-profile.children.cycles-pp.read > > 2.59 ą 7% +1.7 4.30 ą 7% perf-profile.children.cycles-pp.__x64_sys_pread64 > > 2.58 ą 9% +1.7 4.31 ą 8% perf-profile.children.cycles-pp.ksys_read > > 0.00 +2.8 2.80 ą 5% perf-profile.children.cycles-pp.fsnotify_open_perm > > 5.23 ą 14% +3.0 8.19 ą 8% 
perf-profile.children.cycles-pp.rw_verify_area > > 5.06 ą 8% +3.5 8.53 ą 7% perf-profile.children.cycles-pp.vfs_read > > 6.77 ą 8% -2.6 4.12 ą 4% perf-profile.self.cycles-pp.apparmor_file_open > > 5.01 ą 14% -2.6 2.44 ą 12% perf-profile.self.cycles-pp.apparmor_file_permission > > 3.45 ą 13% -1.7 1.77 ą 10% perf-profile.self.cycles-pp.apparmor_mmap_file > > 2.93 ą 9% -1.4 1.54 ą 8% perf-profile.self.cycles-pp.apparmor_file_alloc_security > > 2.14 ą 16% -1.2 0.99 ą 14% perf-profile.self.cycles-pp.apparmor_current_getsecid_subj > > 4.74 ą 9% -1.0 3.73 ą 7% perf-profile.self.cycles-pp.common_perm_cond > > 1.31 ą 5% -0.5 0.79 ą 5% perf-profile.self.cycles-pp.apparmor_file_free_security > > 0.43 ą 3% -0.2 0.21 ą 5% perf-profile.self.cycles-pp.__fsnotify_parent > > 1.07 ą 3% -0.2 0.88 ą 6% perf-profile.self.cycles-pp.mas_rev_awalk > > 0.68 ą 9% -0.2 0.50 ą 6% perf-profile.self.cycles-pp.kobject_put > > 0.95 ą 2% -0.1 0.81 ą 4% perf-profile.self.cycles-pp.rep_stos_alternative > > 0.20 ą 25% -0.1 0.10 ą 14% perf-profile.self.cycles-pp.aa_file_perm > > 0.28 ą 8% -0.1 0.18 ą 8% perf-profile.self.cycles-pp.down_write_killable > > 0.57 ą 3% -0.1 0.48 ą 4% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack > > 0.31 ą 8% -0.1 0.22 ą 9% perf-profile.self.cycles-pp.fget > > 0.50 ą 3% -0.1 0.43 ą 5% perf-profile.self.cycles-pp.stress_full > > 0.22 ą 6% -0.1 0.16 ą 6% perf-profile.self.cycles-pp.cdev_put > > 0.15 ą 5% -0.0 0.11 ą 6% perf-profile.self.cycles-pp.__legitimize_mnt > > 0.24 ą 4% -0.0 0.20 ą 6% perf-profile.self.cycles-pp.mas_empty_area_rev > > 0.28 ą 3% -0.0 0.24 ą 4% perf-profile.self.cycles-pp.do_syscall_64 > > 0.24 ą 3% -0.0 0.20 ą 6% perf-profile.self.cycles-pp.mas_ascend > > 0.18 ą 3% -0.0 0.14 ą 6% perf-profile.self.cycles-pp.do_mmap > > 0.14 ą 5% -0.0 0.11 ą 12% perf-profile.self.cycles-pp.chrdev_open > > 0.19 ą 2% -0.0 0.15 ą 5% perf-profile.self.cycles-pp.syscall_return_via_sysret > > 0.20 ą 3% -0.0 0.17 ą 5% perf-profile.self.cycles-pp.entry_SYSCALL_64 > > 
0.20 ą 4% -0.0 0.17 ą 3% perf-profile.self.cycles-pp.vfs_read > > 0.18 ą 2% -0.0 0.15 ą 3% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > > 0.16 ą 2% -0.0 0.13 ą 4% perf-profile.self.cycles-pp.mas_data_end > > 0.07 ą 4% -0.0 0.04 ą 50% perf-profile.self.cycles-pp.process_measurement > > 0.16 ą 3% -0.0 0.13 ą 5% perf-profile.self.cycles-pp.vm_unmapped_area > > 0.12 ą 4% -0.0 0.09 ą 6% perf-profile.self.cycles-pp.mas_prev_slot > > 0.14 ą 2% -0.0 0.12 ą 5% perf-profile.self.cycles-pp.kmem_cache_free > > 0.10 ą 5% -0.0 0.07 ą 6% perf-profile.self.cycles-pp.open64 > > 0.15 ą 2% -0.0 0.13 ą 5% perf-profile.self.cycles-pp.syscall_exit_to_user_mode > > 0.15 ą 2% -0.0 0.13 ą 4% perf-profile.self.cycles-pp.ioctl > > 0.09 ą 5% -0.0 0.07 ą 8% perf-profile.self.cycles-pp.write > > 0.07 ą 6% -0.0 0.06 perf-profile.self.cycles-pp.__close > > 0.11 ą 4% +0.0 0.13 ą 4% perf-profile.self.cycles-pp.link_path_walk > > 0.01 ą200% +0.0 0.06 ą 9% perf-profile.self.cycles-pp.__virt_addr_valid > > 0.75 ą 2% +0.1 0.89 ą 3% perf-profile.self.cycles-pp._raw_spin_lock > > 0.00 +2.8 2.79 ą 5% perf-profile.self.cycles-pp.fsnotify_open_perm > > 0.05 +5.6 5.63 ą 10% perf-profile.self.cycles-pp.rw_verify_area > > > > > > *************************************************************************************************** > > lkp-csl-d02: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory > > ========================================================================================= > > compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase: > > gcc-13/performance/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/lkp-csl-d02/fsbuffer-r/unixbench > > > > commit: > > 477cf917dd ("fsnotify: use an enum for group priority constants") > > a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers") > > > > 477cf917dd02853b a5e57b4d370c6d320e5bfb0c919 > > ---------------- --------------------------- > > 
%stddev %change %stddev > > \ | \ > > 1339661 +6.4% 1425877 unixbench.throughput > > 5.765e+08 +6.4% 6.131e+08 unixbench.workload > > 1.159e+09 +2.2% 1.184e+09 perf-stat.i.branch-instructions > > 1.49 +0.0 1.54 perf-stat.i.branch-miss-rate% > > 10449249 ą 2% +6.7% 11149426 perf-stat.i.branch-misses > > 4514 -5.3% 4273 perf-stat.overall.path-length > > 1.156e+09 +2.2% 1.181e+09 perf-stat.ps.branch-instructions > > 10430168 ą 2% +6.7% 11128869 perf-stat.ps.branch-misses > > 7.02 ą 2% -3.3 3.70 ą 3% perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 1.45 ą 3% +0.2 1.62 ą 3% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.read > > 1.24 ą 3% +0.2 1.44 ą 3% perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.filemap_read.vfs_read > > 2.55 ą 8% +0.4 2.91 ą 4% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read > > 3.04 ą 6% +0.4 3.44 ą 3% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64 > > 1.94 ą 9% +0.5 2.42 ą 3% perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > > 8.62 ą 3% +0.5 9.14 perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.vfs_read.ksys_read.do_syscall_64 > > 7.90 ą 2% +0.6 8.51 perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read.ksys_read > > 9.29 ą 2% +0.8 10.04 perf-profile.calltrace.cycles-pp.copy_page_to_iter.filemap_read.vfs_read.ksys_read.do_syscall_64 > > 4.43 ą 7% +0.8 5.28 ą 2% perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_to_iter.copy_page_to_iter.filemap_read.vfs_read > > 29.04 ą 3% +1.8 30.80 perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 7.06 ą 2% -3.3 3.73 ą 3% perf-profile.children.cycles-pp.__fsnotify_parent > > 0.77 ą 6% +0.1 0.88 ą 7% 
perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack > > 1.26 ą 2% +0.2 1.45 ą 3% perf-profile.children.cycles-pp.current_time > > 1.66 ą 3% +0.2 1.90 ą 3% perf-profile.children.cycles-pp.syscall_return_via_sysret > > 3.72 ą 2% +0.3 4.03 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack > > 2.56 ą 7% +0.4 2.91 ą 4% perf-profile.children.cycles-pp.apparmor_file_permission > > 5.72 ą 2% +0.4 6.08 perf-profile.children.cycles-pp.entry_SYSCALL_64 > > 4.40 ą 4% +0.4 4.81 ą 2% perf-profile.children.cycles-pp.rep_movs_alternative > > 3.10 ą 6% +0.4 3.52 ą 3% perf-profile.children.cycles-pp.security_file_permission > > 1.94 ą 9% +0.5 2.42 ą 3% perf-profile.children.cycles-pp.__fdget_pos > > 8.68 ą 3% +0.5 9.20 perf-profile.children.cycles-pp.filemap_get_pages > > 8.37 ą 2% +0.7 9.05 perf-profile.children.cycles-pp._copy_to_iter > > 9.52 ą 2% +0.8 10.28 perf-profile.children.cycles-pp.copy_page_to_iter > > 29.25 ą 3% +1.7 30.99 perf-profile.children.cycles-pp.filemap_read > > 6.94 -3.2 3.72 ą 3% perf-profile.self.cycles-pp.__fsnotify_parent > > 0.77 ą 6% +0.1 0.88 ą 7% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack > > 0.83 ą 5% +0.1 0.97 ą 7% perf-profile.self.cycles-pp.current_time > > 1.66 ą 3% +0.2 1.90 ą 3% perf-profile.self.cycles-pp.syscall_return_via_sysret > > 3.52 ą 2% +0.2 3.76 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack > > 2.42 ą 3% +0.3 2.67 ą 3% perf-profile.self.cycles-pp.entry_SYSCALL_64 > > 1.92 ą 6% +0.3 2.20 ą 5% perf-profile.self.cycles-pp.apparmor_file_permission > > 3.92 ą 4% +0.3 4.25 ą 2% perf-profile.self.cycles-pp.rep_movs_alternative > > 4.38 +0.3 4.72 ą 2% perf-profile.self.cycles-pp._copy_to_iter > > 1.16 ą 8% +0.3 1.51 ą 2% perf-profile.self.cycles-pp.ksys_read > > 1.85 ą 10% +0.5 2.36 ą 2% perf-profile.self.cycles-pp.__fdget_pos > > > > > > > > *************************************************************************************************** > > lkp-csl-d02: 36 threads 1 sockets Intel(R) Core(TM) 
i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory > > ========================================================================================= > > compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase: > > gcc-13/performance/x86_64-rhel-8.3/1/debian-12-x86_64-20240206.cgz/300s/lkp-csl-d02/fstime-r/unixbench > > > > commit: > > 477cf917dd ("fsnotify: use an enum for group priority constants") > > a5e57b4d37 ("fsnotify: optimize the case of no permission event watchers") > > > > 477cf917dd02853b a5e57b4d370c6d320e5bfb0c919 > > ---------------- --------------------------- > > %stddev %change %stddev > > \ | \ > > 4709035 +5.8% 4980152 unixbench.throughput > > 2.026e+09 +5.7% 2.141e+09 unixbench.workload > > 1.034e+09 +1.4% 1.048e+09 perf-stat.i.branch-instructions > > 1.56 +0.0 1.59 perf-stat.i.branch-miss-rate% > > 60950726 +5.3% 64193405 perf-stat.i.cache-references > > 0.02 ą 30% -36.7% 0.01 ą 39% perf-stat.i.major-faults > > 0.78 -0.0 0.75 perf-stat.overall.cache-miss-rate% > > 1145 -5.4% 1083 perf-stat.overall.path-length > > 1.031e+09 +1.4% 1.046e+09 perf-stat.ps.branch-instructions > > 60812120 +5.3% 64047513 perf-stat.ps.cache-references > > 0.02 ą 30% -36.7% 0.01 ą 39% perf-stat.ps.major-faults > > 6.22 ą 3% -2.9 3.30 ą 3% perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 49.43 -1.5 47.90 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > > 52.39 -1.0 51.34 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > > 55.16 -0.9 54.29 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read > > 56.49 -0.7 55.80 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read > > 2.40 ą 4% +0.2 2.64 ą 5% perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.filemap_read.vfs_read.ksys_read > > 2.59 ą 4% +0.3 2.86 ą 5% 
perf-profile.calltrace.cycles-pp.touch_atime.filemap_read.vfs_read.ksys_read.do_syscall_64 > > 6.88 +0.3 7.23 ą 2% perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.vfs_read.ksys_read > > 2.26 ą 3% +0.4 2.64 ą 10% perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.rw_verify_area.vfs_read.ksys_read > > 7.90 ą 3% +0.4 8.29 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.read > > 2.68 ą 2% +0.4 3.13 ą 8% perf-profile.calltrace.cycles-pp.security_file_permission.rw_verify_area.vfs_read.ksys_read.do_syscall_64 > > 8.47 +0.4 8.91 perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.vfs_read.ksys_read.do_syscall_64 > > 32.80 +1.8 34.63 perf-profile.calltrace.cycles-pp.filemap_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 6.27 ą 3% -2.9 3.34 ą 3% perf-profile.children.cycles-pp.__fsnotify_parent > > 49.50 -1.4 48.07 perf-profile.children.cycles-pp.vfs_read > > 52.46 -1.0 51.45 perf-profile.children.cycles-pp.ksys_read > > 1.16 ą 4% +0.1 1.28 ą 4% perf-profile.children.cycles-pp.syscall_exit_to_user_mode > > 2.46 ą 4% +0.2 2.69 ą 6% perf-profile.children.cycles-pp.atime_needs_update > > 5.03 ą 3% +0.3 5.30 perf-profile.children.cycles-pp.entry_SYSCALL_64 > > 2.66 ą 4% +0.3 2.94 ą 6% perf-profile.children.cycles-pp.touch_atime > > 3.27 ą 2% +0.3 3.59 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack > > 6.96 +0.4 7.31 ą 2% perf-profile.children.cycles-pp.filemap_get_read_batch > > 2.27 ą 3% +0.4 2.64 ą 10% perf-profile.children.cycles-pp.apparmor_file_permission > > 2.76 ą 2% +0.4 3.20 ą 7% perf-profile.children.cycles-pp.security_file_permission > > 8.52 +0.5 8.98 perf-profile.children.cycles-pp.filemap_get_pages > > 32.99 +1.8 34.80 perf-profile.children.cycles-pp.filemap_read > > 6.16 ą 3% -2.8 3.32 ą 3% perf-profile.self.cycles-pp.__fsnotify_parent > > 1.19 ą 3% -0.4 0.81 ą 6% perf-profile.self.cycles-pp.rw_verify_area > > 1.55 ą 3% +0.1 1.64 ą 2% 
perf-profile.self.cycles-pp.filemap_get_pages > > 0.70 ą 3% +0.1 0.81 ą 7% perf-profile.self.cycles-pp.syscall_exit_to_user_mode > > 1.31 ą 4% +0.1 1.43 ą 4% perf-profile.self.cycles-pp.do_syscall_64 > > 2.15 ą 4% +0.1 2.28 perf-profile.self.cycles-pp.entry_SYSCALL_64 > > 4.00 ą 2% +0.2 4.22 perf-profile.self.cycles-pp.read > > 1.06 ą 4% +0.3 1.31 ą 5% perf-profile.self.cycles-pp.ksys_read > > 3.09 ą 2% +0.3 3.36 perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack > > 3.89 ą 2% +0.3 4.19 ą 3% perf-profile.self.cycles-pp._copy_to_iter > > 1.66 ą 2% +0.3 2.01 ą 13% perf-profile.self.cycles-pp.apparmor_file_permission > > > > > > > > > > > > Disclaimer: > > Results have been estimated based on internal Intel analysis and are provided > > for informational purposes only. Any difference in system hardware or software > > design or configuration may affect actual performance. > > > > > > -- > > 0-DAY CI Kernel Test Service > > https://github.com/intel/lkp-tests/wiki > > > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR