Re: [lkp-robot] [MD] 0ffbb1adf8: aim7.jobs-per-min -10.6% regression

----- Original Message -----
> From: "Ye Xiaolong" <xiaolong.ye@xxxxxxxxx>
> To: "Xiao Ni" <xni@xxxxxxxxxx>
> Cc: linux-raid@xxxxxxxxxxxxxxx, shli@xxxxxxxxxx, "ming lei" <ming.lei@xxxxxxxxxx>, ncroxon@xxxxxxxxxx,
> neilb@xxxxxxxx, lkp@xxxxxx
> Sent: Thursday, April 26, 2018 8:45:40 AM
> Subject: Re: [lkp-robot] [MD]  0ffbb1adf8:  aim7.jobs-per-min -10.6% regression
> 
> Hi, Xiao Ni
> 
> Sorry for the late response.
> 
> On 04/24, Xiao Ni wrote:
> >Hi all
> >
> >This is the first time I have received such a report, so I took some time
> >to read the lkp and aim7 manuals. I reserved a server running Fedora and
> >ran a test with the steps from this email. It failed like this:
> >
> >2018-04-25 02:43:19 echo "/fs/md0" > config
> >2018-04-25 02:43:19
> >	(
> >		echo storageqe-07.lab.bos.redhat.com
> >		echo sync_disk_rw
> >
> >		echo 1
> >		echo 600
> >		echo 2
> >		echo 600
> >		echo 1
> >	) | ./multitask -t
> >
> >AIM Multiuser Benchmark - Suite VII v1.1, January 22, 1996
> >Copyright (c) 1996 - 2001 Caldera International, Inc.
> >All Rights Reserved.
> >
> >Machine's name                                              : Machine's
> >configuration                                     : Number of iterations to
> >run [1 to 10]                       :
> >Information for iteration #1
> >Starting number of operation loads [1 to 10000]             : 1) Run to
> >crossover
> >2) Run to specific operation load           Enter [1 or 2]: Maximum number
> >of operation loads to simulate [600 to 10000]: Operation load increment [1
> >to 100]                         :
> >Using disk directory </fs/md0>
> >HZ is <100>
> >AIM Multiuser Benchmark - Suite VII Run Beginning
> >
> >Tasks    jobs/min  jti  jobs/min/task      real       cpu
> >  600/root/lkp-tests/bin/run-local:142:in `system': Interrupt
> >	from /root/lkp-tests/bin/run-local:142:in `<main>'
> >

Hi Xiaolong

> 
> It seems there are flaws in our reproduce script; we'll look into it.

Thanks for this. If there are any updates, please let me know; then I can
try to reproduce it myself.

> 
> >So for now I can't understand the information in this report.
> >What does "-10.6% regression of aim7.jobs-per-min" mean? And
> >how is aim7.jobs-per-min used? Could anyone offer some
> >suggestions? What should I do to resolve such a problem?
> >
> 
> Here "-10.6% regression of aim7.jobs-per-min" means that the value of
> aim7.jobs-per-min measured for commit 0ffbb1adf8 is 10.6% lower than for
> its parent commit v4.16 (the 0day bot captured your email patch and applied
> it on top of v4.16).
> 
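(Checking this against the comparison table below: v4.16 gives 1632 jobs/min
and 0ffbb1adf8 gives 1458, so (1458 - 1632) / 1632 is roughly -10.7%, which
matches the reported -10.6% allowing for rounding of the values shown.)
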
> aim7.jobs-per-min is obtained from the raw output of the aim7 test, such as
> the example below:
> 
> AIM Multiuser Benchmark - Suite VII v1.1, January 22, 1996
> Copyright (c) 1996 - 2001 Caldera International, Inc.
> All Rights Reserved.
> 
> Machine's name                                              : Machine's
> configuration                                     : Number of iterations to
> run [1 to 10]                       :
> Information for iteration #1
> Starting number of operation loads [1 to 10000]             : 1) Run to
> crossover
> 2) Run to specific operation load           Enter [1 or 2]: Maximum number of
> operation loads to simulate [600 to 10000]: Operation load increment [1 to
> 100]                         :
> Using disk directory </fs/md0>
> HZ is <100>
> AIM Multiuser Benchmark - Suite VII Run Beginning
> 
> Tasks    jobs/min  jti  jobs/min/task      real       cpu
>   600     1466.27   99         2.4438   2455.21  92829.76   Fri Apr 20 10:28:19 2018

So it's the jobs/min value, 1466.27, that lkp uses as aim7.jobs-per-min,
right?

> 
> AIM Multiuser Benchmark - Suite VII
>    Testing over

Now I can run aim7 tests like this too. I'll try to compare the jobs/min
values during my tests.
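
For reference, a rough way to pull that jobs/min value out of the raw aim7
output (just a sketch: it assumes the output format shown above, and the
file name aim7-output.log is only a placeholder) could be:

        # Print the second column (jobs/min) of the results row that follows
        # the "Tasks    jobs/min ..." header line.
        awk '/^Tasks[[:space:]]+jobs\/min/ { getline; print $2 }' aim7-output.log

For the sample output above, this prints 1466.27.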

Best Regards
Xiao
> 
> 
> aim7.jobs-per-min is the main KPI for aim7 tests; the other numbers listed
> in the comparison are less important. They are collected by multiple
> monitors (vmstat, mpstat) running in the background.
> We hope they can help you evaluate your patch more completely.
> 
> Thanks,
> Xiaolong
> 
> 
> >Best Regards
> >Xiao
> >
> >----- Original Message -----
> >> From: "kernel test robot" <xiaolong.ye@xxxxxxxxx>
> >> To: "Xiao Ni" <xni@xxxxxxxxxx>
> >> Cc: linux-raid@xxxxxxxxxxxxxxx, shli@xxxxxxxxxx, "ming lei"
> >> <ming.lei@xxxxxxxxxx>, ncroxon@xxxxxxxxxx,
> >> neilb@xxxxxxxx, lkp@xxxxxx
> >> Sent: Monday, April 23, 2018 8:41:43 AM
> >> Subject: [lkp-robot] [MD]  0ffbb1adf8:  aim7.jobs-per-min -10.6%
> >> regression
> >> 
> >> 
> >> Greetings,
> >> 
> >> FYI, we noticed a -10.6% regression of aim7.jobs-per-min due to commit:
> >> 
> >> 
> >> commit: 0ffbb1adf8b448568b44fe44c5fcdcf485040365 ("MD: fix lock contention
> >> for flush bios")
> >> url:
> >> https://github.com/0day-ci/linux/commits/Xiao-Ni/MD-fix-lock-contention-for-flush-bios/20180411-040300
> >> 
> >> 
> >> in testcase: aim7
> >> on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with
> >> 384G memory
> >> with following parameters:
> >> 
> >> 	disk: 4BRD_12G
> >> 	md: RAID1
> >> 	fs: xfs
> >> 	test: sync_disk_rw
> >> 	load: 600
> >> 	cpufreq_governor: performance
> >> 
> >> test-description: AIM7 is a traditional UNIX system-level benchmark suite
> >> which is used to test and measure the performance of multiuser systems.
> >> test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/
> >> 
> >> 
> >> 
> >> Details are as below:
> >> -------------------------------------------------------------------------------------------------->
> >> 
> >> 
> >> To reproduce:
> >> 
> >>         git clone https://github.com/intel/lkp-tests.git
> >>         cd lkp-tests
> >>         bin/lkp install job.yaml  # job file is attached in this email
> >>         bin/lkp run     job.yaml
> >> 
> >> =========================================================================================
> >> compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase:
> >>   gcc-7/performance/4BRD_12G/xfs/x86_64-rhel-7.2/600/RAID1/debian-x86_64-2016-08-31.cgz/lkp-ivb-ep01/sync_disk_rw/aim7
> >> 
> >> commit:
> >>   v4.16
> >>   0ffbb1adf8 ("MD: fix lock contention for flush bios")
> >> 
> >>            v4.16 0ffbb1adf8b448568b44fe44c5
> >> ---------------- --------------------------
> >>          %stddev     %change         %stddev
> >>              \          |                \
> >>       1632 ±  2%     -10.6%       1458        aim7.jobs-per-min
> >>       2207 ±  2%     +11.8%       2468        aim7.time.elapsed_time
> >>       2207 ±  2%     +11.8%       2468        aim7.time.elapsed_time.max
> >>   51186515           -51.5%   24800655        aim7.time.involuntary_context_switches
> >>     146259 ±  8%     +31.7%     192669 ±  2%  aim7.time.minor_page_faults
> >>      80457 ±  2%     +15.9%      93267        aim7.time.system_time
> >>      50.25 ±  2%     +11.8%      56.17        aim7.time.user_time
> >>  7.257e+08 ±  2%      +7.3%  7.787e+08        aim7.time.voluntary_context_switches
> >>     520491 ± 61%     +53.8%     800775        interrupts.CAL:Function_call_interrupts
> >>       2463 ± 18%     +31.7%       3246 ± 15%  numa-vmstat.node0.nr_mapped
> >>       4.06 ±  2%      -0.5        3.51        mpstat.cpu.idle%
> >>       0.24 ±  6%      -0.1        0.15        mpstat.cpu.iowait%
> >>    8829533           +16.2%   10256984        softirqs.SCHED
> >>   33149109 ±  2%     +12.3%   37229216        softirqs.TIMER
> >>    4724795 ± 33%     +38.5%    6544104        cpuidle.C1E.usage
> >>  7.151e+08 ± 40%     -37.8%  4.449e+08        cpuidle.C6.time
> >>    3881055 ±122%     -85.7%     553608 ±  2%  cpuidle.C6.usage
> >>      61107 ±  2%     -10.7%      54566        vmstat.io.bo
> >>       2.60 ± 18%     -51.9%       1.25 ± 34%  vmstat.procs.b
> >>     305.10           -16.1%     256.00        vmstat.procs.r
> >>     404271           -11.3%     358644        vmstat.system.cs
> >>     167773           -16.5%     140121        vmstat.system.in
> >>     115358 ±  9%     +39.9%     161430 ±  3%  proc-vmstat.numa_hint_faults
> >>      62520 ± 10%     +47.6%      92267 ±  4%  proc-vmstat.numa_hint_faults_local
> >>      20893 ± 10%     +29.0%      26948 ±  2%  proc-vmstat.numa_pages_migrated
> >>     116983 ±  9%     +39.5%     163161 ±  3%  proc-vmstat.numa_pte_updates
> >>    5504935 ±  3%     +12.3%    6179561        proc-vmstat.pgfault
> >>      20893 ± 10%     +29.0%      26948 ±  2%  proc-vmstat.pgmigrate_success
> >>       2.68 ±  3%      -0.2        2.44        turbostat.C1%
> >>    4724733 ± 33%     +38.5%    6544028        turbostat.C1E
> >>    3879529 ±122%     -85.8%     552056 ±  2%  turbostat.C6
> >>       0.82 ± 43%      -0.4        0.45        turbostat.C6%
> >>       3.62 ±  2%     -15.1%       3.08        turbostat.CPU%c1
> >>     176728 ±  2%     +13.3%     200310        turbostat.SMI
> >>  9.893e+12 ± 65%     +57.6%  1.559e+13        perf-stat.branch-instructions
> >>  3.022e+10 ± 65%     +46.6%   4.43e+10        perf-stat.branch-misses
> >>      11.31 ± 65%      +5.5       16.78        perf-stat.cache-miss-rate%
> >>  2.821e+10 ± 65%     +50.4%  4.243e+10        perf-stat.cache-misses
> >>  1.796e+14 ± 65%     +58.4%  2.845e+14        perf-stat.cpu-cycles
> >>   26084177 ± 65%    +176.3%   72073346        perf-stat.cpu-migrations
> >>  1.015e+13 ± 65%     +57.3%  1.597e+13        perf-stat.dTLB-loads
> >>  1.125e+12 ± 65%     +45.6%  1.638e+12        perf-stat.dTLB-stores
> >>  4.048e+13 ± 65%     +57.3%  6.367e+13        perf-stat.instructions
> >>       4910 ± 65%     +57.5%       7734        perf-stat.instructions-per-iTLB-miss
> >>    3847673 ± 65%     +57.4%    6057388        perf-stat.minor-faults
> >>  1.403e+10 ± 65%     +51.8%   2.13e+10        perf-stat.node-load-misses
> >>  1.557e+10 ± 65%     +51.0%  2.351e+10        perf-stat.node-loads
> >>      27.41 ± 65%     +12.1       39.52        perf-stat.node-store-miss-rate%
> >>  7.828e+09 ± 65%     +53.2%  1.199e+10        perf-stat.node-store-misses
> >>  1.216e+10 ± 65%     +50.9%  1.835e+10        perf-stat.node-stores
> >>    3847675 ± 65%     +57.4%    6057392        perf-stat.page-faults
> >>    1041337 ±  2%     +12.0%    1166774        sched_debug.cfs_rq:/.exec_clock.avg
> >>    1045329 ±  2%     +11.8%    1168250        sched_debug.cfs_rq:/.exec_clock.max
> >>    1037380 ±  2%     +12.3%    1165349        sched_debug.cfs_rq:/.exec_clock.min
> >>       3670 ± 16%     -70.1%       1098 ± 50%  sched_debug.cfs_rq:/.exec_clock.stddev
> >>     234.08 ±  3%     -22.4%     181.73        sched_debug.cfs_rq:/.load_avg.avg
> >>      33.65 ±  2%     -59.3%      13.70 ±  3%  sched_debug.cfs_rq:/.load_avg.min
> >>   36305683 ±  2%     +12.4%   40814603        sched_debug.cfs_rq:/.min_vruntime.avg
> >>   37771587 ±  2%     +11.1%   41960294        sched_debug.cfs_rq:/.min_vruntime.max
> >>   34884765 ±  2%     +13.6%   39635549        sched_debug.cfs_rq:/.min_vruntime.min
> >>    1277146 ±  9%     -21.8%     998594 ±  4%  sched_debug.cfs_rq:/.min_vruntime.stddev
> >>       1.99 ±  5%     -16.8%       1.65 ±  3%  sched_debug.cfs_rq:/.nr_running.max
> >>      19.18 ±  5%     +20.3%      23.08        sched_debug.cfs_rq:/.nr_spread_over.avg
> >>       8.47 ± 18%     +23.8%      10.49 ± 12%  sched_debug.cfs_rq:/.nr_spread_over.min
> >>      22.06 ±  7%     -18.2%      18.05 ±  6%  sched_debug.cfs_rq:/.runnable_load_avg.avg
> >>       0.08 ± 38%     -50.5%       0.04 ± 20%  sched_debug.cfs_rq:/.spread.avg
> >>    1277131 ±  9%     -21.8%     998584 ±  4%  sched_debug.cfs_rq:/.spread0.stddev
> >>     177058 ±  7%     -23.5%     135429        sched_debug.cpu.avg_idle.max
> >>      33859 ±  6%     -24.6%      25519        sched_debug.cpu.avg_idle.stddev
> >>    1119705 ±  2%     +11.0%    1242378        sched_debug.cpu.clock.avg
> >>    1119726 ±  2%     +11.0%    1242399        sched_debug.cpu.clock.max
> >>    1119680 ±  2%     +11.0%    1242353        sched_debug.cpu.clock.min
> >>    1119705 ±  2%     +11.0%    1242378        sched_debug.cpu.clock_task.avg
> >>    1119726 ±  2%     +11.0%    1242399        sched_debug.cpu.clock_task.max
> >>    1119680 ±  2%     +11.0%    1242353        sched_debug.cpu.clock_task.min
> >>       3.47 ± 11%     +28.4%       4.45 ±  2%  sched_debug.cpu.cpu_load[1].min
> >>      28.92 ±  4%      -7.8%      26.66 ±  3%  sched_debug.cpu.cpu_load[2].avg
> >>       5.65 ±  7%     +37.9%       7.80 ±  8%  sched_debug.cpu.cpu_load[2].min
> >>      29.97 ±  3%      -7.9%      27.60 ±  3%  sched_debug.cpu.cpu_load[3].avg
> >>       8.22 ±  4%     +31.9%      10.83 ±  7%  sched_debug.cpu.cpu_load[3].min
> >>      10.50 ±  5%     +24.0%      13.02 ±  6%  sched_debug.cpu.cpu_load[4].min
> >>       2596 ±  4%      -9.2%       2358 ±  2%  sched_debug.cpu.curr->pid.avg
> >>       4463 ±  4%     -13.5%       3862 ±  3%  sched_debug.cpu.curr->pid.stddev
> >>    1139237 ±  2%     +11.5%    1269690        sched_debug.cpu.nr_load_updates.avg
> >>    1146166 ±  2%     +11.2%    1274264        sched_debug.cpu.nr_load_updates.max
> >>    1133061 ±  2%     +11.4%    1262298        sched_debug.cpu.nr_load_updates.min
> >>       3943 ±  9%     -32.4%       2666 ± 23%  sched_debug.cpu.nr_load_updates.stddev
> >>       8814 ±  7%     +17.8%      10386 ±  2%  sched_debug.cpu.nr_uninterruptible.max
> >>      -3613           +32.3%      -4782        sched_debug.cpu.nr_uninterruptible.min
> >>       2999 ±  4%     +13.1%       3391        sched_debug.cpu.nr_uninterruptible.stddev
> >>      42794 ± 15%     -76.8%       9921 ± 90%  sched_debug.cpu.sched_goidle.stddev
> >>     652177           +14.4%     745857        sched_debug.cpu.ttwu_local.avg
> >>     684397           +17.7%     805440 ±  2%  sched_debug.cpu.ttwu_local.max
> >>     622628           +12.0%     697353 ±  2%  sched_debug.cpu.ttwu_local.min
> >>      16189 ± 33%    +128.0%      36916 ± 47%  sched_debug.cpu.ttwu_local.stddev
> >>    1119677 ±  2%     +11.0%    1242351        sched_debug.cpu_clk
> >>    1119677 ±  2%     +11.0%    1242351        sched_debug.ktime
> >>    1120113 ±  2%     +11.0%    1242771        sched_debug.sched_clk
> >> 
> >>                                  aim7.jobs-per-min
> >> 
> >>   [ASCII trend chart, garbled by line wrapping in transit: it plots
> >>    aim7.jobs-per-min per sample on a 1350-1750 scale, with bisect-good (*)
> >>    samples around 1550-1700 and bisect-bad (O) samples around 1400-1460.]
> >> 
> >> [*] bisect-good sample
> >> [O] bisect-bad  sample
> >> 
> >> 
> >> 
> >> Disclaimer:
> >> Results have been estimated based on internal Intel analysis and are provided
> >> for informational purposes only. Any difference in system hardware or software
> >> design or configuration may affect actual performance.
> >> 
> >> 
> >> Thanks,
> >> Xiaolong
> >> 
> 


