[viro-vfs:work.simple_recursive_removal] [rpc_pipe] 1d8f830b7f: fsmark.app_overhead 17.5% regression

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



hi, Al Viro,

besides below performance report, we noticed stderr
"A_dependency_job_for_nfs-idmapd.service_failed.See'journalctl-xe'for_details"
while testing which we didn't observe on parent.

we also noticed there are something like below in attached kmsg.xz:
kern  :warn  : [   61.938032] NFSD: Unable to initialize client recovery tracking! (-110)
kern  :info  : [   61.946579] NFSD: starting 90-second grace period (net f0000000)
kern  :warn  : [   63.146060] NFSD: Unable to initialize client recovery tracking! (-110)
kern  :info  : [   63.154531] NFSD: starting 90-second grace period (net f0000000)

which we didn't observed in parent tests, either.

we are not sure whether these impact the performance results, all these FYI.


Hello,

kernel test robot noticed a 17.5% regression of fsmark.app_overhead on:


commit: 1d8f830b7f27ba92a700a86988695e965f73b15a ("rpc_pipe: get rid of __rpc_lookup_create_exclusive()")
https://git.kernel.org/cgit/linux/kernel/git/viro/vfs.git work.simple_recursive_removal

testcase: fsmark
test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
parameters:

	iterations: 1x
	nr_threads: 32t
	disk: 1SSD
	fs: xfs
	fs2: nfsv4
	filesize: 8K
	test_size: 400M
	sync_method: fsyncBeforeClose
	nr_directories: 16d
	nr_files_per_directory: 256fpd
	cpufreq_governor: performance


In addition to that, the commit also has significant impact on the following tests:

+------------------+-------------------------------------------------------------------------------------------------+
| testcase: change | fsmark: fsmark.app_overhead 20.3% regression                                                    |
| test machine     | 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory |
| test parameters  | cpufreq_governor=performance                                                                    |
|                  | disk=1SSD                                                                                       |
|                  | filesize=9B                                                                                     |
|                  | fs2=nfsv4                                                                                       |
|                  | fs=xfs                                                                                          |
|                  | iterations=1x                                                                                   |
|                  | nr_directories=16d                                                                              |
|                  | nr_files_per_directory=256fpd                                                                   |
|                  | nr_threads=32t                                                                                  |
|                  | sync_method=fsyncBeforeClose                                                                    |
|                  | test_size=400M                                                                                  |
+------------------+-------------------------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202403011849.c1afb89c-oliver.sang@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240301/202403011849.c1afb89c-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
  gcc-12/performance/1SSD/8K/nfsv4/xfs/1x/x86_64-rhel-8.3/16d/256fpd/32t/debian-12-x86_64-20240206.cgz/fsyncBeforeClose/lkp-ivb-2ep1/400M/fsmark

commit: 
  82a439e321 ("rpc_pipe: fold __rpc_mkpipe_dentry() into its sole caller")
  1d8f830b7f ("rpc_pipe: get rid of __rpc_lookup_create_exclusive()")

82a439e3213e7095 1d8f830b7f27ba92a700a869886 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :6          100%           6:6     stderr.A_dependency_job_for_nfs-idmapd.service_failed.See'journalctl-xe'for_details
           :6          100%           6:6     stderr.has_stderr
         %stddev     %change         %stddev
             \          |                \  
   2398327           +12.3%    2692538        cpuidle..usage
    107204          +140.2%     257509 ± 16%  meminfo.AnonHugePages
    304743            +9.2%     332637 ±  2%  vmstat.system.cs
    413300 ±  6%     +14.4%     472827 ±  3%  numa-numastat.node0.local_node
    445362 ±  5%     +12.9%     502812 ±  2%  numa-numastat.node0.numa_hit
    445128 ±  5%     +13.1%     503240 ±  2%  numa-vmstat.node0.numa_hit
    413066 ±  6%     +14.6%     473254 ±  3%  numa-vmstat.node0.numa_local
     66.15 ±  2%    +164.6%     175.04        uptime.boot
      2769 ±  2%    +188.3%       7983        uptime.idle
  65542525           +17.5%   77003387        fsmark.app_overhead
     55.33            +4.2%      57.67        fsmark.time.percent_of_cpu_this_job_got
    288191            +5.1%     302810        fsmark.time.voluntary_context_switches
     74.50            +2.9%      76.68        iostat.cpu.idle
     17.01           -10.9%      15.15 ±  2%  iostat.cpu.iowait
      2.59           -14.6%       2.21        iostat.cpu.user
     19.75            -2.3       17.47 ±  2%  mpstat.cpu.all.iowait%
      2.63            -0.4        2.23        mpstat.cpu.all.usr%
     12.34 ±  2%     -17.5%      10.18 ±  2%  mpstat.max_utilization_pct
     53999 ±  2%    +201.1%     162576        sched_debug.cpu_clk
     53400 ±  2%    +203.3%     161979 ±  2%  sched_debug.ktime
     54525 ±  2%    +199.2%     163131 ±  2%  sched_debug.sched_clk
     52.21          +140.6%     125.62 ± 16%  proc-vmstat.nr_anon_transparent_hugepages
      8420            -3.2%       8154        proc-vmstat.nr_mapped
     41883            +2.1%      42754        proc-vmstat.nr_slab_reclaimable
     31420            +8.4%      34074        proc-vmstat.nr_slab_unreclaimable
    704537            +7.9%     760346        proc-vmstat.numa_hit
    654821            +8.5%     710630        proc-vmstat.numa_local
   1039627            +6.1%    1103455        proc-vmstat.pgalloc_normal
    729116 ±  6%     +16.2%     846939        proc-vmstat.pgfree
      1.13 ±  3%      +9.8%       1.24 ±  2%  perf-stat.i.MPKI
 2.134e+09            -5.1%  2.026e+09        perf-stat.i.branch-instructions
      4.84            -0.2        4.63        perf-stat.i.branch-miss-rate%
  1.03e+08            -8.7%   94072597        perf-stat.i.branch-misses
      6.16 ±  2%      -0.3        5.86 ±  3%  perf-stat.i.cache-miss-rate%
  11302040 ±  2%      +6.8%   12067382 ±  3%  perf-stat.i.cache-misses
 2.199e+08            +9.6%  2.409e+08        perf-stat.i.cache-references
    360033           +14.8%     413385        perf-stat.i.context-switches
      2239 ±  4%     +13.0%       2530 ±  4%  perf-stat.i.cpu-migrations
      1472 ±  2%      -6.0%       1384 ±  2%  perf-stat.i.cycles-between-cache-misses
 1.039e+10            -5.7%  9.795e+09        perf-stat.i.instructions
      0.64            -5.0%       0.60        perf-stat.i.ipc
      7.68           +14.3%       8.78        perf-stat.i.metric.K/sec
      9527 ±  3%     -13.3%       8258 ±  7%  perf-stat.i.minor-faults
      9528 ±  3%     -13.3%       8258 ±  7%  perf-stat.i.page-faults
      1.09 ±  3%     +13.1%       1.23 ±  2%  perf-stat.overall.MPKI
      4.83            -0.2        4.65        perf-stat.overall.branch-miss-rate%
      1.59            +6.5%       1.70        perf-stat.overall.cpi
      1464 ±  2%      -5.9%       1378 ±  2%  perf-stat.overall.cycles-between-cache-misses
      0.63            -6.1%       0.59        perf-stat.overall.ipc
 1.955e+09            -5.0%  1.857e+09        perf-stat.ps.branch-instructions
  94404569            -8.6%   86246777        perf-stat.ps.branch-misses
  10360850 ±  2%      +6.8%   11061496 ±  3%  perf-stat.ps.cache-misses
 2.017e+08            +9.5%  2.209e+08        perf-stat.ps.cache-references
    330250           +14.8%     379036        perf-stat.ps.context-switches
      2054 ±  4%     +12.9%       2320 ±  4%  perf-stat.ps.cpu-migrations
 9.517e+09            -5.7%  8.979e+09        perf-stat.ps.instructions
      8697 ±  3%     -13.1%       7557 ±  7%  perf-stat.ps.minor-faults
      8697 ±  3%     -13.1%       7558 ±  7%  perf-stat.ps.page-faults
 1.151e+11            -6.0%  1.082e+11        perf-stat.total.instructions


***************************************************************************************************
lkp-ivb-2ep1: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
  gcc-12/performance/1SSD/9B/nfsv4/xfs/1x/x86_64-rhel-8.3/16d/256fpd/32t/debian-12-x86_64-20240206.cgz/fsyncBeforeClose/lkp-ivb-2ep1/400M/fsmark

commit: 
  82a439e321 ("rpc_pipe: fold __rpc_mkpipe_dentry() into its sole caller")
  1d8f830b7f ("rpc_pipe: get rid of __rpc_lookup_create_exclusive()")

82a439e3213e7095 1d8f830b7f27ba92a700a869886 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :6          100%           6:6     stderr.A_dependency_job_for_nfs-idmapd.service_failed.See'journalctl-xe'for_details
           :6          100%           6:6     stderr.has_stderr
         %stddev     %change         %stddev
             \          |                \  
   4652582           +12.3%    5226642        cpuidle..usage
      1.26 ± 74%      +2.3        3.56 ± 31%  perf-profile.calltrace.cycles-pp.mmap_region.do_mmap.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
    188802 ±  2%     +63.3%     308290 ± 19%  meminfo.AnonHugePages
    140056           +10.9%     155322        meminfo.SUnreclaim
      5.50            +0.6        6.05 ±  2%  mpstat.cpu.all.sys%
     12.67           -16.1%      10.63        mpstat.max_utilization_pct
    655519 ±  2%     +14.2%     748457 ±  2%  numa-numastat.node0.local_node
    680860           +14.6%     779999 ±  2%  numa-numastat.node0.numa_hit
    681047           +14.6%     780819 ±  2%  numa-vmstat.node0.numa_hit
    655706 ±  2%     +14.3%     749281 ±  2%  numa-vmstat.node0.numa_local
     77.80 ±  3%    +133.7%     181.80        uptime.boot
      3182 ±  4%    +156.6%       8168        uptime.idle
    349730           +15.0%     402139        vmstat.system.cs
     71199            +5.9%      75404        vmstat.system.in
 1.219e+08           +20.3%  1.467e+08        fsmark.app_overhead
     55.83            +3.3%      57.67        fsmark.time.percent_of_cpu_this_job_got
    574683            +3.9%     597195        fsmark.time.voluntary_context_switches
     18.20            -5.0%      17.30        iostat.cpu.iowait
      6.17            +8.8%       6.72        iostat.cpu.system
      2.48           -11.1%       2.21        iostat.cpu.user
     56144 ±  4%    +184.2%     159591        sched_debug.cpu_clk
     55548 ±  4%    +186.2%     158995        sched_debug.ktime
     56677 ±  4%    +182.5%     160122        sched_debug.sched_clk
     91.96 ±  2%     +64.1%     150.93 ± 18%  proc-vmstat.nr_anon_transparent_hugepages
      8527            -4.7%       8128        proc-vmstat.nr_mapped
     35029           +10.9%      38832        proc-vmstat.nr_slab_unreclaimable
    980307           +11.0%    1087887        proc-vmstat.numa_hit
    930596           +11.4%    1036273        proc-vmstat.numa_local
   1446464            +7.9%    1560287        proc-vmstat.pgalloc_normal
      1.06 ±  2%      +8.4%       1.15 ±  3%  perf-stat.i.MPKI
      5.00            -0.2        4.83        perf-stat.i.branch-miss-rate%
 1.016e+08            -2.9%   98670000        perf-stat.i.branch-misses
  10378740 ±  2%      +9.1%   11323437 ±  3%  perf-stat.i.cache-misses
 2.348e+08            +9.3%  2.566e+08        perf-stat.i.cache-references
    396967           +13.9%     452135        perf-stat.i.context-switches
      1.70            +2.9%       1.74        perf-stat.i.cpi
 1.647e+10            +4.9%  1.728e+10        perf-stat.i.cpu-cycles
      2382           +13.4%       2702 ±  2%  perf-stat.i.cpu-migrations
      0.61            -4.0%       0.58        perf-stat.i.ipc
      8.36           +14.0%       9.52        perf-stat.i.metric.K/sec
      1.05 ±  3%      +9.2%       1.14 ±  3%  perf-stat.overall.MPKI
      4.98            -0.2        4.83        perf-stat.overall.branch-miss-rate%
      1.66            +5.1%       1.75        perf-stat.overall.cpi
      0.60            -4.8%       0.57        perf-stat.overall.ipc
  96867729            -2.8%   94186227        perf-stat.ps.branch-misses
   9895781 ±  2%      +9.2%   10807588 ±  3%  perf-stat.ps.cache-misses
 2.239e+08            +9.4%  2.449e+08        perf-stat.ps.cache-references
    378576           +14.0%     431601        perf-stat.ps.context-switches
 1.571e+10            +5.0%   1.65e+10        perf-stat.ps.cpu-cycles
      2271           +13.5%       2579 ±  2%  perf-stat.ps.cpu-migrations





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Attachment: kmsg.xz
Description: application/xz


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux