Hi Al Viro,

On Mon, Jan 27, 2025 at 07:26:16PM +0000, Al Viro wrote:
> On Sun, Jan 26, 2025 at 04:16:04PM +0800, kernel test robot wrote:
> >
> >
> > Hello,
> >
> > kernel test robot noticed an 11.7% regression of will-it-scale.per_process_ops on:
> >
> >
> > commit: 89359897983825dbfc08578e7ee807aaf24d9911 ("do_pollfd(): convert to CLASS(fd)")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> > [test failed on linus/master b46c89c08f4146e7987fc355941a93b12e2c03ef]
> > [test failed on linux-next/master 5ffa57f6eecefababb8cbe327222ef171943b183]
> >
> > testcase: will-it-scale
> > config: x86_64-rhel-9.4
> > compiler: gcc-12
> > test machine: 104 threads 2 sockets (Skylake) with 192G memory
> > parameters:
> >
> > 	nr_task: 100%
> > 	mode: process
> > 	test: poll2
> > 	cpufreq_governor: performance
> >
> >
> >
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add the following tags
> > | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> > | Closes: https://lore.kernel.org/oe-lkp/202501261509.b6b4260d-lkp@xxxxxxxxx
> >
> >
> > Details are as below:
> > -------------------------------------------------------------------------------------------------->
> >
> >
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20250126/202501261509.b6b4260d-lkp@xxxxxxxxx
>
> Very interesting...  Looking at the generated asm, two things seem to
> change in there - "we need an fput()" case in (now implicit) fdput() in
> do_pollfd() is no longer out of line and slightly different spills are
> done in do_poll().
>
> Just to make sure it's not a genuine change of logic somewhere,
> could you compare d000e073ca2a, 893598979838 and d000e073ca2a with the
> delta below?  That delta provably is an equivalent transformation - all
> exits from do_pollfd() go through the return in the end, so that just
> shifts the last assignment in there into the caller.

The 'd000e073ca2a with the delta below' kernel scores almost the same as
plain d000e073ca2a (-0.5%, versus -11.7% for 893598979838), as shown below.

Tested-by: kernel test robot <oliver.sang@xxxxxxxxx>

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/poll2/will-it-scale

commit:
  d000e073ca ("convert do_select()")
  8935989798 ("do_pollfd(): convert to CLASS(fd)")
  2c43a225261  <--- d000e073ca with the delta below

d000e073ca2a08ab 89359897983825dbfc08578e7ee 2c43a2252614bf1692ef2ad5a46
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
    263173            -11.7%    232411             -0.5%    261953        will-it-scale.per_process_ops


The full comparison is below, FYI.

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/poll2/will-it-scale

commit:
  d000e073ca ("convert do_select()")
  8935989798 ("do_pollfd(): convert to CLASS(fd)")
  2c43a225261  <--- d000e073ca with the delta below

d000e073ca2a08ab 89359897983825dbfc08578e7ee 2c43a2252614bf1692ef2ad5a46
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
  1.98e+08 ± 12%      +15.7%  2.29e+08 ± 18%      -13.1% 1.721e+08        cpuidle..time
     21281 ±147%     +197.5%     63313 ± 84%     +180.7%     59731 ± 86%  numa-meminfo.node0.Shmem
      5318 ±147%     +197.5%     15825 ± 84%     +180.7%     14930 ± 86%  numa-vmstat.node0.nr_shmem
     88607             +0.2%     88803             -1.5%     87297        proc-vmstat.nr_shmem
     11118 ± 15%      +13.6%     12633 ± 51%      -27.7%      8034 ± 10%  proc-vmstat.numa_hint_faults_local
     21894 ±  4%     +135.8%     51630 ±124%     +144.5%     53539 ±117%  sched_debug.cfs_rq:/.load.max
      2575 ±  4%     +106.7%      5323 ±112%     +115.5%      5548 ±106%  sched_debug.cfs_rq:/.load.stddev
      3940 ± 18%      -19.1%      3188 ±  8%      -25.5%      2933 ± 20%  sched_debug.cpu.avg_idle.min
  27370126            -11.7%  24170828             -0.5%  27243222        will-it-scale.104.processes
    263173            -11.7%    232411             -0.5%    261953        will-it-scale.per_process_ops
  27370126            -11.7%  24170828             -0.5%  27243222        will-it-scale.workload
      0.12 ± 16%      -42.1%      0.07 ± 42%      -36.3%      0.07 ± 35%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      4.33 ± 28%     +154.2%     11.02 ± 61%      +86.2%      8.07 ± 83%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      2.27 ± 22%      -34.2%      1.49 ± 66%      -48.9%      1.16 ± 36%  perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
    268.62 ± 53%      -61.2%    104.10 ±114%      -39.4%    162.90 ± 82%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      1053 ±  6%      -17.1%    873.33 ± 15%       -4.6%      1004 ± 11%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      1687 ± 10%      +11.7%      1884 ±  6%       +5.3%      1777 ± 10%  perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
      3519 ±  4%      +11.2%      3913 ±  5%       +3.9%      3656 ±  5%  perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      8.67 ± 28%     +154.2%     22.04 ± 61%      +86.2%     16.14 ± 83%  perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
    268.45 ± 53%      -61.4%    103.72 ±115%      -39.5%    162.49 ± 83%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      4.33 ± 28%     +154.2%     11.02 ± 61%      +86.2%      8.07 ± 83%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      0.01 ±  2%      +10.0%      0.01             +0.7%      0.01 ±  2%  perf-stat.i.MPKI
 5.157e+10            -11.7% 4.554e+10             -0.5% 5.133e+10        perf-stat.i.branch-instructions
 1.573e+08            -11.8% 1.387e+08             +0.0% 1.573e+08        perf-stat.i.branch-misses
      0.97            +13.1%      1.09             +0.2%      0.97        perf-stat.i.cpi
   2.9e+11            -11.7% 2.561e+11             -0.5% 2.887e+11        perf-stat.i.instructions
      1.04            -11.7%      0.91             -0.2%      1.03        perf-stat.i.ipc
      0.00 ±  2%      +17.9%      0.00             +1.4%      0.00 ±  3%  perf-stat.overall.MPKI
      0.96            +13.2%      1.09             +0.2%      0.97        perf-stat.overall.cpi
      1.04            -11.7%      0.92             -0.2%      1.03        perf-stat.overall.ipc
  5.14e+10            -11.7% 4.538e+10             -0.5% 5.116e+10        perf-stat.ps.branch-instructions
 1.567e+08            -11.8% 1.382e+08             +0.0% 1.568e+08        perf-stat.ps.branch-misses
 2.891e+11            -11.7% 2.552e+11             -0.5% 2.877e+11        perf-stat.ps.instructions
 8.743e+13            -11.7% 7.724e+13             -0.5% 8.699e+13        perf-stat.total.instructions
      7.61              -0.6      7.03              +0.0      7.63        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__poll
      6.16              -0.5      5.66              -0.0      6.13        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__poll
      5.11 ±  2%        -0.5      4.62 ±  2%        +0.3      5.44        perf-profile.calltrace.cycles-pp.testcase
      2.92 ±  2%        -0.4      2.55 ±  2%        -0.1      2.85        perf-profile.calltrace.cycles-pp._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.91              -0.3      2.60              +0.0      2.93        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__poll
      1.92 ±  5%        -0.3      1.67 ±  4%        -0.1      1.84        perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
      2.12              -0.2      1.91              -0.0      2.10        perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.32              -0.2      1.17              -0.0      1.30        perf-profile.calltrace.cycles-pp.__kmalloc_noprof.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.84              -0.1      1.72              +0.0      1.85        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.__poll
      0.98              -0.1      0.88 ±  2%        -0.0      0.97        perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
      0.97              -0.1      0.88              -0.0      0.94 ±  4%  perf-profile.calltrace.cycles-pp.kfree.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.72              -0.1      0.66              -0.0      0.72        perf-profile.calltrace.cycles-pp.__virt_addr_valid.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll
      0.62              -0.1      0.57              +0.0      0.62        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     94.36              +0.5     94.89              -0.3     94.03        perf-profile.calltrace.cycles-pp.__poll
     75.76              +2.0     77.76              -0.3     75.45        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
     71.45              +2.4     73.83              -0.3     71.12        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     69.72              +2.5     72.24              -0.4     69.32        perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     69.19              +2.6     71.77              -0.4     68.80        perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     54.05              +4.1     58.18              -0.2     53.85        perf-profile.calltrace.cycles-pp.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     38.56              +4.5     43.08              -0.2     38.35        perf-profile.calltrace.cycles-pp.fdget.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
      7.68              -0.6      7.10              +0.0      7.70        perf-profile.children.cycles-pp.syscall_return_via_sysret
      6.61              -0.6      6.06              -0.0      6.59        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      5.12 ±  2%        -0.5      4.64 ±  2%        +0.3      5.45        perf-profile.children.cycles-pp.testcase
      3.15 ±  2%        -0.4      2.74 ±  2%        -0.1      3.07        perf-profile.children.cycles-pp._copy_from_user
      3.70              -0.4      3.33              +0.0      3.72        perf-profile.children.cycles-pp.entry_SYSCALL_64
      1.94 ±  4%        -0.3      1.69 ±  4%        -0.1      1.86        perf-profile.children.cycles-pp.rep_movs_alternative
      2.26              -0.2      2.04              -0.0      2.25        perf-profile.children.cycles-pp.__check_object_size
      1.35              -0.2      1.19              -0.0      1.33        perf-profile.children.cycles-pp.__kmalloc_noprof
      1.04              -0.1      0.94              -0.0      1.04        perf-profile.children.cycles-pp.check_heap_object
      0.97              -0.1      0.88              -0.0      0.94 ±  4%  perf-profile.children.cycles-pp.kfree
      1.07              -0.1      1.00              +0.0      1.08        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.74              -0.1      0.66              -0.0      0.73        perf-profile.children.cycles-pp.__virt_addr_valid
      0.57              -0.1      0.50              -0.0      0.56 ±  2%  perf-profile.children.cycles-pp.__check_heap_object
      0.63              -0.0      0.58              +0.0      0.63        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.22 ±  2%        -0.0      0.20 ±  2%        +0.0      0.23 ±  3%  perf-profile.children.cycles-pp.check_stack_object
      0.18 ±  3%        -0.0      0.16              -0.0      0.17 ±  2%  perf-profile.children.cycles-pp.__cond_resched
      0.07 ±  6%        -0.0      0.06              -0.0      0.07 ±  5%  perf-profile.children.cycles-pp.is_vmalloc_addr
      0.13              -0.0      0.12 ±  3%        +0.0      0.13        perf-profile.children.cycles-pp.x64_sys_call
      0.34              -0.0      0.33              -0.0      0.34 ±  2%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.12 ±  3%        -0.0      0.11              -0.0      0.12 ±  3%  perf-profile.children.cycles-pp.rcu_all_qs
     94.98              +0.5     95.45              -0.3     94.65        perf-profile.children.cycles-pp.__poll
     75.89              +2.0     77.89              -0.3     75.58        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     71.52              +2.4     73.89              -0.3     71.19        perf-profile.children.cycles-pp.do_syscall_64
     69.78              +2.5     72.29              -0.4     69.38        perf-profile.children.cycles-pp.__x64_sys_poll
     69.28              +2.6     71.85              -0.4     68.89        perf-profile.children.cycles-pp.do_sys_poll
     54.18              +4.1     58.28              -0.2     53.99        perf-profile.children.cycles-pp.do_poll
     38.44              +4.6     43.00              -0.2     38.24        perf-profile.children.cycles-pp.fdget
      7.24              -0.6      6.60              -0.1      7.19        perf-profile.self.cycles-pp.do_sys_poll
      7.68              -0.6      7.09              +0.0      7.70        perf-profile.self.cycles-pp.syscall_return_via_sysret
     16.95              -0.6     16.39              +0.0     16.96        perf-profile.self.cycles-pp.do_poll
      6.55              -0.5      6.00              -0.0      6.52        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      4.93 ±  2%        -0.5      4.46 ±  2%        +0.3      5.26        perf-profile.self.cycles-pp.testcase
      4.46              -0.4      4.06              +0.0      4.47        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      3.25              -0.3      2.92              +0.0      3.27        perf-profile.self.cycles-pp.entry_SYSCALL_64
      1.78 ±  5%        -0.2      1.54 ±  4%        -0.1      1.70        perf-profile.self.cycles-pp.rep_movs_alternative
      1.34              -0.2      1.18              -0.0      1.33        perf-profile.self.cycles-pp._copy_from_user
      1.16              -0.1      1.02 ±  2%        -0.0      1.15        perf-profile.self.cycles-pp.__kmalloc_noprof
      0.96              -0.1      0.87              -0.0      0.93 ±  4%  perf-profile.self.cycles-pp.kfree
      0.68              -0.1      0.61 ±  2%        -0.0      0.68        perf-profile.self.cycles-pp.__virt_addr_valid
      0.56              -0.1      0.50              -0.0      0.55        perf-profile.self.cycles-pp.__check_heap_object
      0.43              -0.0      0.39              -0.0      0.43        perf-profile.self.cycles-pp.__x64_sys_poll
      0.49              -0.0      0.45              -0.0      0.49        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.29 ±  2%        -0.0      0.26 ±  3%        +0.0      0.30 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.19              -0.0      0.17 ±  3%        +0.0      0.19 ±  2%  perf-profile.self.cycles-pp.check_stack_object
      0.26              -0.0      0.25 ±  3%        -0.0      0.26 ±  2%  perf-profile.self.cycles-pp.check_heap_object
      0.12              -0.0      0.11 ±  3%        +0.0      0.12        perf-profile.self.cycles-pp.x64_sys_call
     36.98              +4.6     41.62              -0.2     36.77        perf-profile.self.cycles-pp.fdget


>
> diff --git a/fs/select.c b/fs/select.c
> index b41e2d651cc1..e0c816fa4ec4 100644
> --- a/fs/select.c
> +++ b/fs/select.c
> @@ -875,8 +875,6 @@ static inline __poll_t do_pollfd(struct pollfd *pollfd, poll_table *pwait,
>  	fdput(f);
>  
>  out:
> -	/* ... and so does ->revents */
> -	pollfd->revents = mangle_poll(mask);
>  	return mask;
>  }
>  
> @@ -909,6 +907,7 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait,
>  			pfd = walk->entries;
>  			pfd_end = pfd + walk->len;
>  			for (; pfd != pfd_end; pfd++) {
> +				__poll_t mask;
>  				/*
>  				 * Fish for events. If we found one, record it
>  				 * and kill poll_table->_qproc, so we don't
> @@ -916,8 +915,9 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait,
>  				 * this. They'll get immediately deregistered
>  				 * when we break out and return.
>  				 */
> -				if (do_pollfd(pfd, pt, &can_busy_loop,
> -					      busy_flag)) {
> +				mask = do_pollfd(pfd, pt, &can_busy_loop, busy_flag);
> +				pfd->revents = mangle_poll(mask);
> +				if (mask) {
>  					count++;
>  					pt->_qproc = NULL;
>  					/* found something, stop busy polling */
>
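
For anyone following along, the claim that the quoted delta is an equivalent transformation can be illustrated with a small user-space sketch in C. This is only a model under stated assumptions, not the kernel code: `model_poll_t`, `struct model_pollfd`, `model_mangle()`, `model_events()`, the two `pollfd_*` helpers, the two `poll_loop_*` functions and `variants_agree()` are all hypothetical names invented for the illustration. What it keeps from the real code is the shape: every exit from the callee goes through the single return at the end, so the store to `revents` can be moved from just before that return into the caller.

```c
/* User-space model only; none of these names exist in the kernel. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t model_poll_t;		/* stand-in for __poll_t */

struct model_pollfd {			/* stand-in for struct pollfd */
	int fd;
	short events;
	short revents;
};

/* Hypothetical stand-in for mangle_poll(): just narrows the mask. */
static short model_mangle(model_poll_t mask)
{
	return (short)(mask & 0x7fffu);
}

/* Hypothetical event source: even fds report what was asked for. */
static model_poll_t model_events(const struct model_pollfd *p)
{
	return (p->fd % 2 == 0) ? ((model_poll_t)p->events & 0x7fffu) : 0u;
}

/* Shape before the delta: the callee stores ->revents on its way out. */
static model_poll_t pollfd_store_in_callee(struct model_pollfd *p)
{
	model_poll_t mask = 0;

	if (p->fd < 0)
		goto out;
	mask = model_events(p);
out:
	p->revents = model_mangle(mask);
	return mask;
}

/* Shape after the delta: the callee only returns the mask. */
static model_poll_t pollfd_return_only(struct model_pollfd *p)
{
	model_poll_t mask = 0;

	if (p->fd < 0)
		goto out;
	mask = model_events(p);
out:
	return mask;
}

static int poll_loop_before(struct model_pollfd *pfd, size_t n)
{
	int count = 0;

	assert(pfd != NULL);
	for (size_t i = 0; i < n; i++)
		if (pollfd_store_in_callee(&pfd[i]))
			count++;
	return count;
}

static int poll_loop_after(struct model_pollfd *pfd, size_t n)
{
	int count = 0;

	assert(pfd != NULL);
	for (size_t i = 0; i < n; i++) {
		model_poll_t mask = pollfd_return_only(&pfd[i]);

		/* the assignment that used to sit in the callee */
		pfd[i].revents = model_mangle(mask);
		if (mask)
			count++;
	}
	return count;
}

/* Returns 1 if both shapes produce the same count and the same revents. */
int variants_agree(void)
{
	struct model_pollfd a[] = {
		{ -1, 1, 0x55 }, { 0, 1, 0x55 }, { 1, 4, 0x55 },
		{ 2, 0, 0x55 }, { 4, 5, 0x55 },
	};
	struct model_pollfd b[sizeof(a) / sizeof(a[0])];
	size_t n = sizeof(a) / sizeof(a[0]);
	size_t i;

	for (i = 0; i < n; i++)
		b[i] = a[i];

	if (poll_loop_before(a, n) != poll_loop_after(b, n))
		return 0;
	for (i = 0; i < n; i++)
		if (a[i].revents != b[i].revents)
			return 0;
	return 1;
}
```

Calling `variants_agree()` from a small `main()` and checking for a nonzero return is enough to exercise it. The sketch only speaks to source-level equivalence; it says nothing about the code-generation differences (the fput() path no longer being out of line, the different spills in do_poll()) raised in the quoted mail.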