Hello,

kernel test robot noticed an 11.7% regression of will-it-scale.per_process_ops on:


commit: 89359897983825dbfc08578e7ee807aaf24d9911 ("do_pollfd(): convert to CLASS(fd)")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master      b46c89c08f4146e7987fc355941a93b12e2c03ef]
[test failed on linux-next/master 5ffa57f6eecefababb8cbe327222ef171943b183]

testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

        nr_task: 100%
        mode: process
        test: poll2
        cpufreq_governor: performance


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202501261509.b6b4260d-lkp@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250126/202501261509.b6b4260d-lkp@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/poll2/will-it-scale

commit:
  d000e073ca ("convert do_select()")
  8935989798 ("do_pollfd(): convert to CLASS(fd)")

d000e073ca2a08ab 89359897983825dbfc08578e7ee
---------------- ---------------------------
         %stddev    %change          %stddev
             \         |                 \
     21281 ±147%    +197.5%      63313 ± 84%  numa-meminfo.node0.Shmem
      5318 ±147%    +197.5%      15825 ± 84%  numa-vmstat.node0.nr_shmem
  27370126           -11.7%   24170828        will-it-scale.104.processes
    263173           -11.7%     232411        will-it-scale.per_process_ops
  27370126           -11.7%   24170828        will-it-scale.workload
      0.12 ± 16%     -42.1%       0.07 ± 42%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      4.33 ± 28%    +154.2%      11.02 ± 61%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
    268.62 ± 53%     -61.2%     104.10 ±114%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      1053 ±  6%     -17.1%     873.33 ± 15%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      1687 ± 10%     +11.7%       1884 ±  6%  perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
      3519 ±  4%     +11.2%       3913 ±  5%  perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      8.67 ± 28%    +154.2%      22.04 ± 61%  perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
    268.45 ± 53%     -61.4%     103.72 ±115%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      4.33 ± 28%    +154.2%      11.02 ± 61%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      0.01 ±  2%     +10.0%       0.01        perf-stat.i.MPKI
 5.157e+10           -11.7%  4.554e+10        perf-stat.i.branch-instructions
 1.573e+08           -11.8%  1.387e+08        perf-stat.i.branch-misses
      0.97           +13.1%       1.09        perf-stat.i.cpi
   2.9e+11           -11.7%  2.561e+11        perf-stat.i.instructions
      1.04           -11.7%       0.91        perf-stat.i.ipc
      0.00 ±  2%     +17.9%       0.00        perf-stat.overall.MPKI
      0.96           +13.2%       1.09        perf-stat.overall.cpi
      1.04           -11.7%       0.92        perf-stat.overall.ipc
  5.14e+10           -11.7%  4.538e+10        perf-stat.ps.branch-instructions
 1.567e+08           -11.8%  1.382e+08        perf-stat.ps.branch-misses
 2.891e+11           -11.7%  2.552e+11        perf-stat.ps.instructions
 8.743e+13           -11.7%  7.724e+13        perf-stat.total.instructions
      7.61             -0.6       7.03        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__poll
      6.16             -0.5       5.66        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__poll
      5.11 ±  2%       -0.5       4.62 ±  2%  perf-profile.calltrace.cycles-pp.testcase
      2.92 ±  2%       -0.4       2.55 ±  2%  perf-profile.calltrace.cycles-pp._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.91             -0.3       2.60        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__poll
      1.92 ±  5%       -0.3       1.67 ±  4%  perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
      2.12             -0.2       1.91        perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.32             -0.2       1.17        perf-profile.calltrace.cycles-pp.__kmalloc_noprof.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.84             -0.1       1.72        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.__poll
      0.98             -0.1       0.88 ±  2%  perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
      0.97             -0.1       0.88        perf-profile.calltrace.cycles-pp.kfree.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.72             -0.1       0.66        perf-profile.calltrace.cycles-pp.__virt_addr_valid.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll
      0.62             -0.1       0.57        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     94.36             +0.5      94.89        perf-profile.calltrace.cycles-pp.__poll
     75.76             +2.0      77.76        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
     71.45             +2.4      73.83        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     69.72             +2.5      72.24        perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     69.19             +2.6      71.77        perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     54.05             +4.1      58.18        perf-profile.calltrace.cycles-pp.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     38.56             +4.5      43.08        perf-profile.calltrace.cycles-pp.fdget.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
      7.68             -0.6       7.10        perf-profile.children.cycles-pp.syscall_return_via_sysret
      6.61             -0.6       6.06        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      5.12 ±  2%       -0.5       4.64 ±  2%  perf-profile.children.cycles-pp.testcase
      3.15 ±  2%       -0.4       2.74 ±  2%  perf-profile.children.cycles-pp._copy_from_user
      3.70             -0.4       3.33        perf-profile.children.cycles-pp.entry_SYSCALL_64
      1.94 ±  4%       -0.3       1.69 ±  4%  perf-profile.children.cycles-pp.rep_movs_alternative
      2.26             -0.2       2.04        perf-profile.children.cycles-pp.__check_object_size
      1.35             -0.2       1.19        perf-profile.children.cycles-pp.__kmalloc_noprof
      1.04             -0.1       0.94        perf-profile.children.cycles-pp.check_heap_object
      0.97             -0.1       0.88        perf-profile.children.cycles-pp.kfree
      1.07             -0.1       1.00        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.74             -0.1       0.66        perf-profile.children.cycles-pp.__virt_addr_valid
      0.57             -0.1       0.50        perf-profile.children.cycles-pp.__check_heap_object
      0.63             -0.0       0.58        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.22 ±  2%       -0.0       0.20 ±  2%  perf-profile.children.cycles-pp.check_stack_object
      0.18 ±  3%       -0.0       0.16        perf-profile.children.cycles-pp.__cond_resched
      0.07 ±  6%       -0.0       0.06        perf-profile.children.cycles-pp.is_vmalloc_addr
      0.13             -0.0       0.12 ±  3%  perf-profile.children.cycles-pp.x64_sys_call
      0.34             -0.0       0.33        perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.12 ±  3%       -0.0       0.11        perf-profile.children.cycles-pp.rcu_all_qs
     94.98             +0.5      95.45        perf-profile.children.cycles-pp.__poll
     75.89             +2.0      77.89        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     71.52             +2.4      73.89        perf-profile.children.cycles-pp.do_syscall_64
     69.78             +2.5      72.29        perf-profile.children.cycles-pp.__x64_sys_poll
     69.28             +2.6      71.85        perf-profile.children.cycles-pp.do_sys_poll
     54.18             +4.1      58.28        perf-profile.children.cycles-pp.do_poll
     38.44             +4.6      43.00        perf-profile.children.cycles-pp.fdget
      7.24             -0.6       6.60        perf-profile.self.cycles-pp.do_sys_poll
      7.68             -0.6       7.09        perf-profile.self.cycles-pp.syscall_return_via_sysret
     16.95             -0.6      16.39        perf-profile.self.cycles-pp.do_poll
      6.55             -0.5       6.00        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      4.93 ±  2%       -0.5       4.46 ±  2%  perf-profile.self.cycles-pp.testcase
      4.46             -0.4       4.06        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      3.25             -0.3       2.92        perf-profile.self.cycles-pp.entry_SYSCALL_64
      1.78 ±  5%       -0.2       1.54 ±  4%  perf-profile.self.cycles-pp.rep_movs_alternative
      1.34             -0.2       1.18        perf-profile.self.cycles-pp._copy_from_user
      1.16             -0.1       1.02 ±  2%  perf-profile.self.cycles-pp.__kmalloc_noprof
      0.96             -0.1       0.87        perf-profile.self.cycles-pp.kfree
      0.68             -0.1       0.61 ±  2%  perf-profile.self.cycles-pp.__virt_addr_valid
      0.56             -0.1       0.50        perf-profile.self.cycles-pp.__check_heap_object
      0.43             -0.0       0.39        perf-profile.self.cycles-pp.__x64_sys_poll
      0.49             -0.0       0.45        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.29 ±  2%       -0.0       0.26 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.19             -0.0       0.17 ±  3%  perf-profile.self.cycles-pp.check_stack_object
      0.26             -0.0       0.25 ±  3%  perf-profile.self.cycles-pp.check_heap_object
      0.12             -0.0       0.11 ±  3%  perf-profile.self.cycles-pp.x64_sys_call
     36.98             +4.6      41.62        perf-profile.self.cycles-pp.fdget




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
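
For readers who want to sanity-check the headline figure, here is a minimal sketch that recomputes the %change column for the will-it-scale rows from the two values printed in the report. It assumes %change is the plain relative difference against the parent commit; the helper name percent_change is hypothetical and is not part of lkp-tests.

```python
# Recompute the reported %change from the two per-commit values.
# Assumption: %change = (new - base) / base * 100, with "base" being the
# parent commit (d000e073ca) and "new" the commit under test (8935989798).

def percent_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0

# Values copied from the report rows.
per_process_ops = (263173, 232411)   # will-it-scale.per_process_ops
workload = (27370126, 24170828)      # will-it-scale.workload

for name, (base, new) in [("per_process_ops", per_process_ops),
                          ("workload", workload)]:
    print(f"{name}: {percent_change(base, new):+.1f}%")
```

Both rows round to -11.7%, matching the report. The same check does not reproduce every row exactly (for example the cpi and ipc rows), presumably because the report derives %change from unrounded values.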