[linux-next:master] [epoll] 900bbaae67: filebench.sum_operations/s 3.8% regression

Hello,

kernel test robot noticed a 3.8% regression of filebench.sum_operations/s on:


commit: 900bbaae67e980945dec74d36f8afe0de7556d5a ("epoll: Add synchronous wakeup support for ep_poll_callback")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

[test failed on linux-next/master 5b913f5d7d7fe0f567dea8605f21da6eaa1735fb]
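
For orientation (an editorial note, not part of the robot's report format): the commit
subject implies that ep_poll_callback() now forwards the 'sync' flag it receives from
the wait-queue core as a synchronous wakeup (a WF_SYNC scheduler hint) instead of always
issuing a plain wake_up(). Below is a minimal sketch of that kind of change, under that
assumption; the wake_up_sync() helper and the trimmed-down function body are illustrative
here, not a quote of the applied patch. A sync wakeup biases the woken epoll waiter toward
the waker's CPU, which would be consistent with the higher cpu-migrations and changed
ep_poll wait times reported further down.

	/*
	 * Sketch only, under the assumptions stated above -- not the actual
	 * diff.  Kernel-internal types (struct eventpoll, struct epitem) come
	 * from fs/eventpoll.c and are not reproduced here.
	 */
	static int ep_poll_callback(wait_queue_entry_t *wait, unsigned int mode,
				    int sync, void *key)
	{
		struct epitem *epi = ep_item_from_wait(wait);
		struct eventpoll *ep = epi->ep;

		/* ... locking, EPOLLEXCLUSIVE handling and ready-list
		 * queueing elided ... */

		if (waitqueue_active(&ep->wq)) {
			if (sync)
				/* assumed helper, i.e. __wake_up_sync(&ep->wq, TASK_NORMAL) */
				wake_up_sync(&ep->wq);
			else
				wake_up(&ep->wq);
		}

		return 1;
	}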

testcase: filebench
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
parameters:

	disk: 1HDD
	fs: ext4
	fs2: cifs
	test: webproxy.f
	cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202411122121.de84272a-oliver.sang@xxxxxxxxx


Details are as follows:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241112/202411122121.de84272a-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:
  gcc-12/performance/1HDD/cifs/ext4/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-icl-2sp6/webproxy.f/filebench

commit: 
  0dfcb72d33 ("coredump: add cond_resched() to dump_user_range")
  900bbaae67 ("epoll: Add synchronous wakeup support for ep_poll_callback")

0dfcb72d33c767bb 900bbaae67e980945dec74d36f8 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.03            +0.0        0.04        mpstat.cpu.all.irq%
      0.85            -0.1        0.76        mpstat.cpu.all.sys%
   2185818 ± 58%     -45.3%    1195059 ±104%  numa-meminfo.node1.FilePages
   1975339 ± 64%     -52.1%     946422 ±133%  numa-meminfo.node1.Unevictable
    364.50 ± 12%     +50.6%     549.00 ±  2%  perf-c2c.DRAM.remote
    208.17 ± 12%     +56.4%     325.67 ±  6%  perf-c2c.HITM.remote
      1002 ± 59%    +152.4%       2530 ± 32%  sched_debug.cpu.nr_switches.min
      8764 ±  3%     -25.7%       6515 ±  7%  sched_debug.cpu.nr_switches.stddev
     13791            -5.3%      13057        vmstat.system.cs
     11314            +4.2%      11784        vmstat.system.in
    546482 ± 58%     -45.3%     298775 ±104%  numa-vmstat.node1.nr_file_pages
    493834 ± 64%     -52.1%     236605 ±133%  numa-vmstat.node1.nr_unevictable
    493834 ± 64%     -52.1%     236605 ±133%  numa-vmstat.node1.nr_zone_unevictable
     13.58            -3.2%      13.15        filebench.sum_bytes_mb/s
    232514            -3.8%     223695        filebench.sum_operations
      3874            -3.8%       3727        filebench.sum_operations/s
      1019            -3.8%     980.50        filebench.sum_reads/s
     25.75            +3.9%      26.76        filebench.sum_time_ms/op
    203.83            -3.7%     196.33        filebench.sum_writes/s
    499886            -1.8%     490769        filebench.time.file_system_outputs
     17741 ±  2%      -3.9%      17040        filebench.time.minor_page_faults
     68.50           -14.6%      58.50        filebench.time.percent_of_cpu_this_job_got
    123.86           -14.9%     105.36        filebench.time.system_time
    350879            -2.8%     341014        filebench.time.voluntary_context_switches
     29557            -4.4%      28256        proc-vmstat.nr_active_anon
     16635 ±  3%      +4.3%      17352        proc-vmstat.nr_active_file
     37364            -3.8%      35926        proc-vmstat.nr_shmem
     29557            -4.4%      28256        proc-vmstat.nr_zone_active_anon
     16635 ±  3%      +4.3%      17352        proc-vmstat.nr_zone_active_file
     12281 ± 13%     +47.2%      18083 ± 15%  proc-vmstat.numa_hint_faults
    965.00 ±  6%     -30.8%     668.00 ± 20%  proc-vmstat.numa_huge_pte_updates
    518951 ±  6%     -28.2%     372754 ± 20%  proc-vmstat.numa_pte_updates
     73011            -1.1%      72183        proc-vmstat.pgactivate
    698445            +2.2%     713680        proc-vmstat.pgfault
     31722           +14.9%      36439 ±  3%  proc-vmstat.pgreuse
      1.12 ± 20%      -0.3        0.81 ±  8%  perf-profile.children.cycles-pp.__lookup_slow
      0.37 ± 26%      -0.2        0.20 ± 29%  perf-profile.children.cycles-pp.vma_alloc_folio_noprof
      0.39 ±  9%      -0.2        0.22 ± 56%  perf-profile.children.cycles-pp.__hrtimer_next_event_base
      0.18 ± 40%      -0.1        0.06 ± 73%  perf-profile.children.cycles-pp.__poll
      0.18 ± 40%      -0.1        0.06 ± 73%  perf-profile.children.cycles-pp.__x64_sys_poll
      0.18 ± 40%      -0.1        0.06 ± 73%  perf-profile.children.cycles-pp.do_sys_poll
      0.16 ± 45%      -0.1        0.05 ± 84%  perf-profile.children.cycles-pp.perf_evlist__poll_thread
      0.15 ± 33%      +0.1        0.25 ± 15%  perf-profile.children.cycles-pp.smp_call_function_many_cond
      0.03 ±100%      +0.1        0.14 ± 49%  perf-profile.children.cycles-pp.lockref_get_not_dead
      0.13 ± 47%      +0.1        0.28 ± 39%  perf-profile.children.cycles-pp.irq_work_tick
      0.49 ± 32%      +0.3        0.77 ± 20%  perf-profile.children.cycles-pp.__wait_for_common
      0.82 ± 20%      +0.4        1.22 ± 21%  perf-profile.children.cycles-pp.affine_move_task
      0.11 ± 37%      -0.1        0.04 ±112%  perf-profile.self.cycles-pp.task_contending
      0.03 ±100%      +0.1        0.14 ± 49%  perf-profile.self.cycles-pp.lockref_get_not_dead
 9.279e+08            -5.0%  8.816e+08        perf-stat.i.branch-instructions
      2.93            +0.0        2.98        perf-stat.i.branch-miss-rate%
  13227049            +4.3%   13791182        perf-stat.i.branch-misses
      2.99            +0.2        3.20 ±  2%  perf-stat.i.cache-miss-rate%
   1805840 ±  2%     +19.2%    2152127        perf-stat.i.cache-misses
  47931959            +6.2%   50910857        perf-stat.i.cache-references
     13706            -4.3%      13122        perf-stat.i.context-switches
 4.597e+09            -8.8%  4.192e+09        perf-stat.i.cpu-cycles
    338.72           +70.9%     578.79        perf-stat.i.cpu-migrations
      2345 ±  2%     -13.2%       2036 ±  3%  perf-stat.i.cycles-between-cache-misses
 4.233e+09            -4.7%  4.035e+09        perf-stat.i.instructions
      0.77            +1.8%       0.78        perf-stat.i.ipc
      2957 ±  2%      +4.2%       3081        perf-stat.i.minor-faults
      2957 ±  2%      +4.2%       3081        perf-stat.i.page-faults
      0.43 ±  2%     +25.0%       0.53        perf-stat.overall.MPKI
      1.42            +0.1        1.56        perf-stat.overall.branch-miss-rate%
      3.77 ±  2%      +0.5        4.23        perf-stat.overall.cache-miss-rate%
      1.09            -4.3%       1.04        perf-stat.overall.cpi
      2547 ±  2%     -23.5%       1948        perf-stat.overall.cycles-between-cache-misses
      0.92            +4.5%       0.96        perf-stat.overall.ipc
 9.229e+08            -5.0%   8.77e+08        perf-stat.ps.branch-instructions
  13149814            +4.3%   13712774        perf-stat.ps.branch-misses
   1795922 ±  2%     +19.2%    2140605        perf-stat.ps.cache-misses
  47675878            +6.2%   50645024        perf-stat.ps.cache-references
     13636            -4.3%      13055        perf-stat.ps.context-switches
 4.573e+09            -8.8%  4.171e+09        perf-stat.ps.cpu-cycles
    336.94           +70.9%     575.93        perf-stat.ps.cpu-migrations
  4.21e+09            -4.7%  4.014e+09        perf-stat.ps.instructions
      2934 ±  2%      +4.2%       3057        perf-stat.ps.minor-faults
      2934 ±  2%      +4.2%       3057        perf-stat.ps.page-faults
  7.63e+11            -4.2%  7.309e+11        perf-stat.total.instructions
      0.00 ±223%   +6816.7%       0.07 ± 34%  perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.03 ± 20%     +40.2%       0.04 ± 16%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      0.03 ±  3%     +53.0%       0.05 ±  3%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      0.08 ±  5%     -11.9%       0.07 ±  3%  perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.kthread.ret_from_fork.ret_from_fork_asm
      0.05 ±  3%     +14.2%       0.05 ±  4%  perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.open_last_lookups
      0.03           +26.5%       0.03 ±  2%  perf-sched.sch_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
      0.02 ±223%    +552.7%       0.10 ± 47%  perf-sched.sch_delay.max.ms.__cond_resched.cancel_work_sync._cifsFileInfo_put.cifs_close_deferred_file_under_dentry.cifs_unlink
      0.00 ±223%  +15516.7%       0.16 ± 30%  perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.16 ±  6%     +16.5%       0.19 ±  5%  perf-sched.wait_and_delay.avg.ms.__cond_resched.cifs_demultiplex_thread.kthread.ret_from_fork.ret_from_fork_asm
     33.98 ± 11%     +28.0%      43.51 ±  5%  perf-sched.wait_and_delay.avg.ms.__cond_resched.process_one_work.worker_thread.kthread.ret_from_fork
      0.56           +13.7%       0.63        perf-sched.wait_and_delay.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
    392.79 ± 12%     +29.2%     507.64 ±  7%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      1.02           +32.4%       1.35 ±  2%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      0.24           +10.7%       0.27        perf-sched.wait_and_delay.avg.ms.schedule_timeout.wait_woken.sk_wait_data.tcp_recvmsg_locked
     99.17 ±  9%     -22.0%      77.33 ±  8%  perf-sched.wait_and_delay.count.__cond_resched.__kmalloc_noprof.cifs_strndup_to_utf16.cifs_convert_path_to_utf16.smb2_compound_op
     82.50 ± 21%     -48.7%      42.33 ± 16%  perf-sched.wait_and_delay.count.__cond_resched.cancel_work_sync._cifsFileInfo_put.process_one_work.worker_thread
    741.67 ±  4%      +8.5%     804.67 ±  5%  perf-sched.wait_and_delay.count.__cond_resched.cifs_demultiplex_thread.kthread.ret_from_fork.ret_from_fork_asm
      1228           -13.1%       1067 ±  5%  perf-sched.wait_and_delay.count.__cond_resched.process_one_work.worker_thread.kthread.ret_from_fork
      2421 ±  3%     -13.8%       2088 ±  3%  perf-sched.wait_and_delay.count.__lock_sock.lock_sock_nested.tcp_recvmsg.inet6_recvmsg
     41.50 ±  7%     -25.3%      31.00 ±  8%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
     10750           -24.7%       8094        perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
    279.23 ±  2%      +9.1%     304.73 ±  3%  perf-sched.wait_and_delay.max.ms.__cond_resched.__kmalloc_noprof.cifs_strndup_to_utf16.cifs_convert_path_to_utf16.smb2_compound_op
      1001          +111.3%       2115 ± 37%  perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
    286.84 ±  4%     +10.7%     317.62 ±  8%  perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.do_unlinkat
    290.82 ±  3%     +11.8%     325.18 ±  8%  perf-sched.wait_and_delay.max.ms.wait_for_response.compound_send_recv.cifs_send_recv.SMB2_open
    291.61 ±  2%     +11.4%     324.95 ±  9%  perf-sched.wait_and_delay.max.ms.wait_for_response.compound_send_recv.smb2_compound_op.smb2_query_path_info
      0.13 ±  6%     +19.6%       0.15 ±  7%  perf-sched.wait_time.avg.ms.__cond_resched.cifs_demultiplex_thread.kthread.ret_from_fork.ret_from_fork_asm
     33.97 ± 11%     +27.9%      43.46 ±  5%  perf-sched.wait_time.avg.ms.__cond_resched.process_one_work.worker_thread.kthread.ret_from_fork
      0.01 ±223%  +12287.9%       0.68 ±114%  perf-sched.wait_time.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.08 ±  4%      +9.3%       0.09 ±  4%  perf-sched.wait_time.avg.ms.__lock_sock.sk_wait_data.tcp_recvmsg_locked.tcp_recvmsg
      0.47           +15.3%       0.54        perf-sched.wait_time.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
    392.76 ± 12%     +29.2%     507.60 ±  7%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      0.99           +31.7%       1.30 ±  2%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      1.05 ±  3%     +14.2%       1.20 ±  6%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.cifs_call_async
    279.13 ±  2%      +9.1%     304.65 ±  3%  perf-sched.wait_time.max.ms.__cond_resched.__kmalloc_noprof.cifs_strndup_to_utf16.cifs_convert_path_to_utf16.smb2_compound_op
      0.01 ±223%  +35681.8%       1.97 ±121%  perf-sched.wait_time.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      1001          +111.3%       2115 ± 37%  perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
    286.74 ±  4%     +10.7%     317.50 ±  8%  perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.do_unlinkat
    290.73 ±  3%     +11.8%     325.08 ±  8%  perf-sched.wait_time.max.ms.wait_for_response.compound_send_recv.cifs_send_recv.SMB2_open
    291.52 ±  2%     +11.4%     324.84 ±  9%  perf-sched.wait_time.max.ms.wait_for_response.compound_send_recv.smb2_compound_op.smb2_query_path_info
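
For readers mapping the perf-sched rows above to user-visible behavior: the
schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait entries are
threads blocked in the epoll_wait(2) syscall. A minimal, self-contained userspace loop
that lands on exactly that kernel path is sketched below; it is illustrative only, and
this report does not identify which component of the workload is the epoll caller here.

	/*
	 * Illustrative only: a thread blocking in epoll_wait(2) sits in the
	 * __x64_sys_epoll_wait -> do_epoll_wait -> ep_poll ->
	 * schedule_hrtimeout_range_clock path shown in the perf-sched data.
	 */
	#include <stdio.h>
	#include <stdlib.h>
	#include <sys/epoll.h>
	#include <unistd.h>

	int main(void)
	{
		int epfd = epoll_create1(0);
		if (epfd < 0) {
			perror("epoll_create1");
			return EXIT_FAILURE;
		}

		/* stdin used as an example fd (must not be a regular file). */
		struct epoll_event ev = { .events = EPOLLIN, .data.fd = STDIN_FILENO };
		if (epoll_ctl(epfd, EPOLL_CTL_ADD, STDIN_FILENO, &ev) < 0) {
			perror("epoll_ctl");
			return EXIT_FAILURE;
		}

		struct epoll_event events[8];
		/* Blocks in ep_poll(); the 100 ms timeout is armed via
		 * schedule_hrtimeout_range_clock(), as in the rows above. */
		int n = epoll_wait(epfd, events, 8, 100);
		printf("epoll_wait returned %d\n", n);

		close(epfd);
		return 0;
	}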




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki




