Hello,

kernel test robot noticed a 1.5% improvement of redis.get_total_throughput_rps on:


commit: 4aecca4c76808f3736056d18ff510df80424bc9f ("net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: redis
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

        all: 1
        sc_overcommit_memory: 1
        sc_somaxconn: 65535
        thp_enabled: never
        thp_defrag: never
        cluster: cs-localhost
        cpu_node_bind: even
        nr_processes: 4
        test: set,get
        data_size: 1024
        n_client: 5
        requests: 68000000
        n_pipeline: 3
        key_len: 68000000
        cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250205/202502051330.4d2f403b-lkp@xxxxxxxxx

=========================================================================================
all/cluster/compiler/cpu_node_bind/cpufreq_governor/data_size/kconfig/key_len/n_client/n_pipeline/nr_processes/requests/rootfs/sc_overcommit_memory/sc_somaxconn/tbox_group/test/testcase/thp_defrag/thp_enabled:
  1/cs-localhost/gcc-12/even/performance/1024/x86_64-rhel-9.4/68000000/5/3/4/68000000/debian-12-x86_64-20240206.cgz/1/65535/lkp-icl-2sp7/set,get/redis/never/never

commit:
  34ea1df802 ("Merge branch 'net-mlx5-hw-counters-refactor'")
  4aecca4c76 ("net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message")

34ea1df802f79d44 4aecca4c76808f3736056d18ff5
---------------- ---------------------------
         %stddev   %change           %stddev
             \        |                  \
  18491785           +2.1%    18880098        proc-vmstat.numa_hint_faults
  18483590           +2.0%    18850441        proc-vmstat.numa_hint_faults_local
      8589 ± 97%   +255.7%       30553 ± 17%  proc-vmstat.numa_pages_migrated
  21039386           +2.2%    21505792        proc-vmstat.numa_pte_updates
      8589 ± 97%   +255.7%       30553 ± 17%  proc-vmstat.pgmigrate_success
     25696 ± 12%    +14.4%       29397        proc-vmstat.pgreuse
    252371           +1.5%      256108        redis.get_avg_throughput_rps
     67.36           -1.5%       66.38        redis.get_avg_time_sec
   1009486           +1.5%     1024432        redis.get_total_throughput_rps
    269.45           -1.5%      265.52        redis.get_total_time_sec
    257.67           -1.1%      254.83        redis.time.percent_of_cpu_this_job_got
    337.27           -2.4%      329.05        redis.time.system_time
 3.957e+09           +1.3%   4.008e+09        perf-stat.i.branch-instructions
  38469227           +1.6%    39070923        perf-stat.i.branch-misses
     32.20            +0.8       33.01        perf-stat.i.cache-miss-rate%
    136208           +1.2%      137857        perf-stat.i.context-switches
      1.34           -1.0%        1.32        perf-stat.i.cpi
 1.948e+10           +1.3%   1.974e+10        perf-stat.i.instructions
      9.12           +2.2%        9.32        perf-stat.i.metric.K/sec
    224090           +2.5%      229667        perf-stat.i.minor-faults
    224090           +2.5%      229667        perf-stat.i.page-faults
      1.33          -34.1%        0.88 ± 70%  perf-stat.overall.cpi
    714.76          -33.9%      472.47 ± 70%  perf-stat.overall.cycles-between-cache-misses
 1.095e+08          -34.2%    72001076 ± 70%  perf-stat.ps.cache-references
     15.93            -0.8       15.15        perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter
     15.95            -0.7       15.22        perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter.vfs_write
     14.40            -0.7       13.72        perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto
     14.43            -0.6       13.79        perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
     21.35            -0.6       20.74        perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter.vfs_write.ksys_write
     21.50            -0.5       20.96        perf-profile.calltrace.cycles-pp.tcp_sendmsg.sock_write_iter.vfs_write.ksys_write.do_syscall_64
     16.98            -0.5       16.44        perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
     17.15            -0.5       16.66        perf-profile.calltrace.cycles-pp.tcp_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
     21.61            -0.5       21.14        perf-profile.calltrace.cycles-pp.sock_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
     22.26            -0.4       21.84        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
     21.76            -0.4       21.34        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     22.24            -0.4       21.82        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     21.95            -0.4       21.53        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
     22.65            -0.4       22.24        perf-profile.calltrace.cycles-pp.write
     17.28            -0.4       16.87        perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
     17.30            -0.4       16.92        perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
     17.49            -0.4       17.12        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__send
     17.51            -0.4       17.14        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__send
     17.92            -0.4       17.57        perf-profile.calltrace.cycles-pp.__send
      0.57            +0.0        0.62 ±  3%  perf-profile.calltrace.cycles-pp.skb_do_copy_data_nocache.tcp_sendmsg_locked.tcp_sendmsg.sock_write_iter.vfs_write
      0.74 ±  2%      +0.1        0.79 ±  3%  perf-profile.calltrace.cycles-pp.tcp_stream_alloc_skb.tcp_sendmsg_locked.tcp_sendmsg.__sys_sendto.__x64_sys_sendto
      1.34            +0.1        1.40        perf-profile.calltrace.cycles-pp.__inet_lookup_skb.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core
      3.85            +0.1        3.94        perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
      1.87 ±  3%      +0.1        1.97        perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      5.14            +0.1        5.24        perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
      5.14            +0.1        5.25        perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
      5.54            +0.1        5.64        perf-profile.calltrace.cycles-pp.common_startup_64
      5.13            +0.1        5.24        perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
     10.08            +0.1       10.20        perf-profile.calltrace.cycles-pp.do_epoll_ctl.__x64_sys_epoll_ctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.epoll_ctl
     10.48            +0.1       10.61        perf-profile.calltrace.cycles-pp.__x64_sys_epoll_ctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.epoll_ctl
     11.15            +0.1       11.28        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.epoll_ctl
     11.03            +0.1       11.18        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.epoll_ctl
      6.79            +0.2        6.95        perf-profile.calltrace.cycles-pp.dictFind
     12.44            +0.2       12.62        perf-profile.calltrace.cycles-pp.epoll_ctl
     15.91            +0.2       16.10        perf-profile.calltrace.cycles-pp.tcp_v4_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog
     16.00            +0.2       16.21        perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll
     16.03            +0.2       16.24        perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
     16.78            +0.2       17.01        perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.handle_softirqs.do_softirq
     16.54            +0.2       16.78        perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.handle_softirqs
     16.80            +0.2       17.04        perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.handle_softirqs.do_softirq.__local_bh_enable_ip
     20.70            +0.3       20.96        perf-profile.calltrace.cycles-pp.net_rx_action.handle_softirqs.do_softirq.__local_bh_enable_ip.__dev_queue_xmit
     21.08            +0.3       21.34        perf-profile.calltrace.cycles-pp.handle_softirqs.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2
     21.15            +0.3       21.42        perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit
     21.23            +0.3       21.50        perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb
     22.73            +0.3       23.03        perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit
     23.44            +0.3       23.77        perf-profile.calltrace.cycles-pp.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked
     22.94            +0.3       23.28        perf-profile.calltrace.cycles-pp.ip_finish_output2.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
     26.32            +0.4       26.68        perf-profile.calltrace.cycles-pp.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg
     30.37            -1.5       28.92        perf-profile.children.cycles-pp.tcp_write_xmit
     30.39            -1.4       29.03        perf-profile.children.cycles-pp.__tcp_push_pending_frames
     38.37            -1.1       37.23        perf-profile.children.cycles-pp.tcp_sendmsg_locked
     38.66            -1.0       37.67        perf-profile.children.cycles-pp.tcp_sendmsg
      1.32            -0.9        0.38 ±  2%  perf-profile.children.cycles-pp.tcp_event_new_data_sent
      1.80 ±  2%      -0.9        0.88 ±  2%  perf-profile.children.cycles-pp.tcp_check_space
      1.19 ±  2%      -0.9        0.27 ±  4%  perf-profile.children.cycles-pp.__mod_timer
      1.22 ±  2%      -0.9        0.30 ±  3%  perf-profile.children.cycles-pp.sk_reset_timer
     66.87            -0.5       66.34        perf-profile.children.cycles-pp.do_syscall_64
     67.19            -0.5       66.67        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     21.61            -0.5       21.15        perf-profile.children.cycles-pp.sock_write_iter
     21.82            -0.4       21.39        perf-profile.children.cycles-pp.vfs_write
     22.02            -0.4       21.60        perf-profile.children.cycles-pp.ksys_write
     17.29            -0.4       16.88        perf-profile.children.cycles-pp.__sys_sendto
     22.78            -0.4       22.37        perf-profile.children.cycles-pp.write
     17.31            -0.4       16.94        perf-profile.children.cycles-pp.__x64_sys_sendto
     18.00            -0.4       17.65        perf-profile.children.cycles-pp.__send
      0.23 ±  5%      -0.1        0.16 ±  6%  perf-profile.children.cycles-pp.tcp_event_data_recv
      0.12 ±  4%      +0.0        0.13 ±  3%  perf-profile.children.cycles-pp.validate_xmit_skb
      0.38 ±  2%      +0.0        0.40 ±  3%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.30 ±  4%      +0.0        0.32 ±  3%  perf-profile.children.cycles-pp.pick_next_task_fair
      0.51 ±  2%      +0.0        0.54 ±  2%  perf-profile.children.cycles-pp._copy_from_iter
      0.18 ±  5%      +0.0        0.21 ±  2%  perf-profile.children.cycles-pp.tcp_schedule_loss_probe
      1.35            +0.1        1.40        perf-profile.children.cycles-pp.__inet_lookup_skb
      1.49            +0.1        1.56        perf-profile.children.cycles-pp.tcp_stream_alloc_skb
      0.76            +0.1        0.83        perf-profile.children.cycles-pp.skb_do_copy_data_nocache
      4.02            +0.1        4.09        perf-profile.children.cycles-pp.cpuidle_enter
      4.00            +0.1        4.07        perf-profile.children.cycles-pp.cpuidle_enter_state
      0.24 ±  5%      +0.1        0.32 ±  5%  perf-profile.children.cycles-pp.release_sock
      4.22            +0.1        4.32        perf-profile.children.cycles-pp.cpuidle_idle_call
      1.93 ±  2%      +0.1        2.03        perf-profile.children.cycles-pp.intel_idle
      5.53            +0.1        5.63        perf-profile.children.cycles-pp.do_idle
      5.14            +0.1        5.25        perf-profile.children.cycles-pp.start_secondary
      5.54            +0.1        5.64        perf-profile.children.cycles-pp.common_startup_64
      5.54            +0.1        5.64        perf-profile.children.cycles-pp.cpu_startup_entry
     10.12            +0.1       10.24        perf-profile.children.cycles-pp.do_epoll_ctl
     10.50            +0.1       10.63        perf-profile.children.cycles-pp.__x64_sys_epoll_ctl
     12.76            +0.2       12.93        perf-profile.children.cycles-pp.epoll_ctl
      6.88            +0.2        7.05        perf-profile.children.cycles-pp.dictFind
     15.94            +0.2       16.13        perf-profile.children.cycles-pp.tcp_v4_rcv
     16.04            +0.2       16.25        perf-profile.children.cycles-pp.ip_local_deliver_finish
     16.02            +0.2       16.23        perf-profile.children.cycles-pp.ip_protocol_deliver_rcu
     16.55            +0.2       16.78        perf-profile.children.cycles-pp.__netif_receive_skb_one_core
     16.81            +0.2       17.05        perf-profile.children.cycles-pp.__napi_poll
     16.78            +0.2       17.02        perf-profile.children.cycles-pp.process_backlog
     21.54            +0.3       21.79        perf-profile.children.cycles-pp.handle_softirqs
     20.72            +0.3       20.98        perf-profile.children.cycles-pp.net_rx_action
     21.16            +0.3       21.43        perf-profile.children.cycles-pp.do_softirq
     21.31            +0.3       21.60        perf-profile.children.cycles-pp.__local_bh_enable_ip
     22.76            +0.3       23.06        perf-profile.children.cycles-pp.__dev_queue_xmit
     22.96            +0.3       23.29        perf-profile.children.cycles-pp.ip_finish_output2
     23.46            +0.3       23.80        perf-profile.children.cycles-pp.__ip_queue_xmit
     26.38            +0.3       26.72        perf-profile.children.cycles-pp.__tcp_transmit_skb
      1.13 ±  2%      -1.0        0.18 ±  3%  perf-profile.self.cycles-pp.__mod_timer
      1.79 ±  2%      -0.9        0.87 ±  2%  perf-profile.self.cycles-pp.tcp_check_space
      0.22 ±  6%      -0.1        0.15 ±  7%  perf-profile.self.cycles-pp.tcp_event_data_recv
      0.48            +0.0        0.50 ±  2%  perf-profile.self.cycles-pp.mod_objcg_state
      0.32 ±  2%      +0.0        0.34        perf-profile.self.cycles-pp.call
      0.50 ±  2%      +0.0        0.52 ±  2%  perf-profile.self.cycles-pp._copy_from_iter
      0.17 ±  4%      +0.0        0.19 ±  3%  perf-profile.self.cycles-pp.ip_finish_output2
      0.11 ±  9%      +0.0        0.14 ±  3%  perf-profile.self.cycles-pp.tcp_event_new_data_sent
      0.27 ±  4%      +0.0        0.30 ±  3%  perf-profile.self.cycles-pp.__alloc_skb
      0.21            +0.0        0.24 ±  3%  perf-profile.self.cycles-pp.kfree_skbmem
      0.36 ±  3%      +0.0        0.40        perf-profile.self.cycles-pp._raw_spin_lock_bh
      0.56 ±  2%      +0.0        0.60 ±  2%  perf-profile.self.cycles-pp.kmem_cache_free
      0.13 ±  5%      +0.0        0.17 ±  5%  perf-profile.self.cycles-pp.vfs_write
      0.00            +0.1        0.06 ±  9%  perf-profile.self.cycles-pp.__x64_sys_sendto
      0.08 ±  8%      +0.1        0.14 ±  4%  perf-profile.self.cycles-pp.sock_write_iter
      0.02 ± 99%      +0.1        0.09 ± 11%  perf-profile.self.cycles-pp.__sys_sendto
      3.40            +0.1        3.50        perf-profile.self.cycles-pp.tcp_sendmsg_locked
      1.93 ±  2%      +0.1        2.03        perf-profile.self.cycles-pp.intel_idle
      0.00            +0.1        0.11 ±  5%  perf-profile.self.cycles-pp.__tcp_push_pending_frames
      6.66            +0.1        6.78        perf-profile.self.cycles-pp.dictFind
      0.00            +0.1        0.13 ±  2%  perf-profile.self.cycles-pp.tcp_sendmsg




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki