[linux-next:master] [mm/slub] 5886fc82b6: will-it-scale.per_process_ops -3.7% regression

Hello,

kernel test robot noticed a -3.7% regression of will-it-scale.per_process_ops on:


commit: 5886fc82b6e3166dd1ba876809888fc39028d626 ("mm/slub: attempt to find layouts up to 1/2 waste in calculate_order()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

testcase: will-it-scale
test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
parameters:

	nr_task: 50%
	mode: process
	test: poll2
	cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202310202221.fdbcbe56-oliver.sang@xxxxxxxxx


Details are below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231020/202310202221.fdbcbe56-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-cpl-4sp2/poll2/will-it-scale

commit: 
  0fe2735d5e ("mm/slub: remove min_objects loop from calculate_order()")
  5886fc82b6 ("mm/slub: attempt to find layouts up to 1/2 waste in calculate_order()")

0fe2735d5e2e0060 5886fc82b6e3166dd1ba8768098 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     28.08            +1.1%      28.40        boot-time.dhcp
      6.17 ± 10%     -15.4%       5.22 ± 10%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      6.17 ± 10%     -15.4%       5.22 ± 10%  perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
  98376568            -3.7%   94713387        will-it-scale.112.processes
    878361            -3.7%     845654        will-it-scale.per_process_ops
  98376568            -3.7%   94713387        will-it-scale.workload
     81444            +4.8%      85370        proc-vmstat.nr_active_anon
     85071            +4.8%      89137        proc-vmstat.nr_shmem
     81444            +4.8%      85370        proc-vmstat.nr_zone_active_anon
     79205            +3.8%      82205        proc-vmstat.pgactivate
      5.18            -0.4        4.79 ±  2%  perf-profile.calltrace.cycles-pp.__fdget.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
      2.18            -0.2        2.03 ±  2%  perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.29            -0.1        2.19        perf-profile.calltrace.cycles-pp.__entry_text_start.__poll
      0.83            -0.1        0.76 ±  3%  perf-profile.calltrace.cycles-pp.__check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
      0.90            -0.1        0.84 ±  2%  perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
      0.66 ±  2%      -0.1        0.61 ±  2%  perf-profile.calltrace.cycles-pp.__virt_addr_valid.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll
      0.66            -0.0        0.61 ±  3%  perf-profile.calltrace.cycles-pp.kfree.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     47.75            +1.3       49.07        perf-profile.calltrace.cycles-pp.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     22.63            +2.1       24.74        perf-profile.calltrace.cycles-pp.__fget_light.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
      5.17            -0.4        4.78 ±  2%  perf-profile.children.cycles-pp.__fdget
      2.35            -0.2        2.18 ±  2%  perf-profile.children.cycles-pp.__check_object_size
      0.84            -0.1        0.77 ±  3%  perf-profile.children.cycles-pp.__check_heap_object
      1.48            -0.1        1.41        perf-profile.children.cycles-pp.__entry_text_start
      0.94            -0.1        0.87 ±  2%  perf-profile.children.cycles-pp.check_heap_object
      1.57            -0.1        1.51 ±  2%  perf-profile.children.cycles-pp.__kmalloc
      0.68 ±  2%      -0.1        0.63        perf-profile.children.cycles-pp.__virt_addr_valid
      0.66            -0.0        0.61 ±  3%  perf-profile.children.cycles-pp.kfree
      0.83            -0.0        0.79 ±  2%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
     22.29            +1.7       24.01        perf-profile.children.cycles-pp.__fget_light
     48.12            +1.7       49.84        perf-profile.children.cycles-pp.do_poll
      7.66            -0.4        7.22        perf-profile.self.cycles-pp.do_sys_poll
      2.58 ±  2%      -0.2        2.38 ±  2%  perf-profile.self.cycles-pp.__fdget
      2.23            -0.1        2.12 ±  2%  perf-profile.self.cycles-pp._copy_from_user
      1.07 ±  3%      -0.1        0.98 ±  2%  perf-profile.self.cycles-pp.__poll
      0.84            -0.1        0.77 ±  2%  perf-profile.self.cycles-pp.__check_heap_object
      0.66 ±  2%      -0.1        0.61 ±  2%  perf-profile.self.cycles-pp.__virt_addr_valid
      0.65            -0.0        0.61 ±  3%  perf-profile.self.cycles-pp.kfree
      0.80            -0.0        0.76 ±  2%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.67 ±  2%      -0.0        0.64        perf-profile.self.cycles-pp.__entry_text_start
     19.62            +1.9       21.54        perf-profile.self.cycles-pp.__fget_light
 2.225e+11            -3.7%  2.143e+11        perf-stat.i.branch-instructions
 5.573e+08            -3.2%  5.393e+08        perf-stat.i.branch-misses
   2332742 ±  2%      -6.6%    2179079        perf-stat.i.cache-misses
  13799351            -3.9%   13256775        perf-stat.i.cache-references
      0.32            +5.0%       0.34        perf-stat.i.cpi
 3.863e+11            +1.2%  3.908e+11        perf-stat.i.cpu-cycles
    174616 ±  3%      +9.1%     190529 ±  2%  perf-stat.i.cycles-between-cache-misses
 2.777e+11            -3.7%  2.675e+11        perf-stat.i.dTLB-loads
 1.689e+11            -3.7%  1.627e+11        perf-stat.i.dTLB-stores
  50719249            -2.8%   49295350        perf-stat.i.iTLB-load-misses
   2674672           -14.5%    2285560        perf-stat.i.iTLB-loads
 1.206e+12            -3.7%  1.161e+12        perf-stat.i.instructions
      3.12            -4.8%       2.97        perf-stat.i.ipc
      1.24            -4.0%       1.19        perf-stat.i.metric.G/sec
      1.72            +1.1%       1.74        perf-stat.i.metric.GHz
     76.66            -5.6%      72.34        perf-stat.i.metric.K/sec
      1743            -3.5%       1683        perf-stat.i.metric.M/sec
    594324            -2.9%     576831        perf-stat.i.node-load-misses
      0.32            +5.0%       0.34        perf-stat.overall.cpi
    165074 ±  2%      +8.2%     178683        perf-stat.overall.cycles-between-cache-misses
      3.12            -4.8%       2.97        perf-stat.overall.ipc
 2.217e+11            -3.7%  2.135e+11        perf-stat.ps.branch-instructions
 5.554e+08            -3.2%  5.375e+08        perf-stat.ps.branch-misses
   2333651 ±  2%      -6.6%    2179985        perf-stat.ps.cache-misses
  13948192            -3.9%   13410551        perf-stat.ps.cache-references
 3.849e+11            +1.2%  3.894e+11        perf-stat.ps.cpu-cycles
 2.767e+11            -3.7%  2.665e+11        perf-stat.ps.dTLB-loads
 1.683e+11            -3.7%  1.621e+11        perf-stat.ps.dTLB-stores
  50558427            -2.8%   49131845        perf-stat.ps.iTLB-load-misses
   2664632           -14.5%    2276961 ±  2%  perf-stat.ps.iTLB-loads
 1.201e+12            -3.7%  1.157e+12        perf-stat.ps.instructions
    592459            -2.9%     575320        perf-stat.ps.node-load-misses
 3.621e+14            -3.6%  3.492e+14        perf-stat.total.instructions
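
For readers checking the table by hand, the %change column and the ipc/cpi
rows can be reproduced from the raw counters. The following is only a minimal
sketch: the helper names pct_change and ipc are hypothetical and are not part
of lkp-tests, and the perf-stat.i values are interval averages, so ratios
recomputed this way match the reported rows only approximately.

```python
# Hypothetical helpers for re-deriving columns of the comparison table.
# Input values are copied from the report above; nothing here is lkp-tests code.

def pct_change(base: float, patched: float) -> float:
    """Relative change of patched versus base, in percent."""
    return (patched - base) / base * 100.0

def ipc(instructions: float, cycles: float) -> float:
    """Instructions per cycle; cpi is the reciprocal."""
    return instructions / cycles

# will-it-scale.per_process_ops: 0fe2735d5e (base) vs 5886fc82b6 (patched)
ops_change = pct_change(878361, 845654)

# perf-stat.i.instructions / perf-stat.i.cpu-cycles for each commit
ipc_base = ipc(1.206e12, 3.863e11)
ipc_patched = ipc(1.161e12, 3.908e11)

print(f"per_process_ops change: {ops_change:+.1f}%")
print(f"ipc: {ipc_base:.2f} -> {ipc_patched:.2f}")
print(f"cpi: {1 / ipc_base:.2f} -> {1 / ipc_patched:.2f}")
```

Read this way, the counters show about 3.7% fewer instructions retired over
about 1.2% more cycles, which is consistent with the lower ipc and the
throughput drop reported for poll2.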




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki




