Re: [LKP] Re: [blk] 8c5035dfbb: fio.read_iops -10.6% regression

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

在 2022/10/09 13:47, Yin Fengwei 写道:
Hi Kuai,

On 10/8/22 16:00, Yu Kuai wrote:
Hi,

在 2022/10/08 10:50, kernel test robot 写道:
Greeting,

FYI, we noticed a -10.6% regression of fio.read_iops due to commit:

I don't know how this is working but I'm *sure* this commit won't affect
performance. Please take a look at the commit, only wbt initialization
is touched, which is done while creating the device:

device_add_disk
  blk_register_queue
   wbt_enable_default
    wbt_init

And io path is the same with or without this commit.

By the way, wbt should only work for write.
Some information here:
It looks like the line
     wbt_set_write_cache(q, test_bit(QUEUE_FLAG_WC, &q->queue_flags));
matters.

If move only this line to original position based on 8c5035dfbb,
the regression is gone.

If move only this line before ret = rq_qos_add() (just like your patch
did, but only with this line) based on 8c5035dfbb, the regression can
be reproduced.


Thanks for the information, but I still don't understand if there is any
difference after wbt_init() is done, and how does read is afftected by
wbt. 🙁

Regards
Yin, Fengwei


Thanks,
Kuai

commit: 8c5035dfbb9475b67c82b3fdb7351236525bf52b ("blk-wbt: call rq_qos_add() after wb_normal is initialized")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: fio-basic
on test machine: 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz (Cascade Lake) with 192G memory
with following parameters:

     runtime: 300s
     nr_task: 8t
     disk: 1SSD
     fs: btrfs
     rw: randread
     bs: 2M
     ioengine: sync
     test_size: 256g
     cpufreq_governor: performance

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio


Details are as below:

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase:
    2M/gcc-11/performance/1SSD/btrfs/sync/x86_64-rhel-8.3/8t/debian-11.1-x86_64-20220510.cgz/300s/randread/lkp-csl-2ap4/256g/fio-basic

commit:
    f7de4886fe ("rnbd-srv: remove struct rnbd_dev")
    8c5035dfbb ("blk-wbt: call rq_qos_add() after wb_normal is initialized")

f7de4886fe8f008a 8c5035dfbb9475b67c82b3fdb73
---------------- ---------------------------
           %stddev     %change         %stddev
               \          |                \
        0.03 ±106%      +0.2        0.22 ± 80%  fio.latency_20ms%
        0.02 ± 33%      -0.0        0.01 ± 12%  fio.latency_4ms%
        2508           -10.6%       2243        fio.read_bw_MBps
     6717440           +17.6%    7897088        fio.read_clat_90%_us
     6892202           +19.0%    8202922        fio.read_clat_95%_us
     7602176 ±  4%     +18.4%    9000277 ±  3%  fio.read_clat_99%_us
     6374238           +11.8%    7127450        fio.read_clat_mean_us
      363825 ± 10%     +74.9%     636378 ±  5%  fio.read_clat_stddev
        1254           -10.6%       1121        fio.read_iops
      104.97           +11.8%     117.32        fio.time.elapsed_time
      104.97           +11.8%     117.32        fio.time.elapsed_time.max
       13731            +5.6%      14498 ±  4%  fio.time.maximum_resident_set_size
      116.00            -8.2%     106.50        fio.time.percent_of_cpu_this_job_got
   1.998e+10           +11.4%  2.226e+10        cpuidle..time
        3.27 ±  3%      +4.6%       3.42        iostat.cpu.iowait
        4.49 ± 68%      -2.1        2.38 ±152%  perf-profile.children.cycles-pp.number
        4.49 ± 68%      -2.5        1.98 ±175%  perf-profile.self.cycles-pp.number
      557763            +5.4%     587781        proc-vmstat.pgfault
       25488            +3.1%      26274        proc-vmstat.pgreuse
     2459048           -10.1%    2209482        vmstat.io.bi
      184649 ±  5%     -10.4%     165526 ±  7%  vmstat.system.cs
      111733 ± 30%     +61.8%     180770 ± 21%  numa-meminfo.node0.AnonPages
      113221 ± 30%     +60.2%     181416 ± 21%  numa-meminfo.node0.Inactive(anon)
       11301 ± 24%    +164.5%      29888 ±117%  numa-meminfo.node2.Active(file)
      104911 ± 39%     -80.5%      20456 ±100%  numa-meminfo.node3.AnonHugePages
      131666 ± 27%     -67.9%      42297 ± 82%  numa-meminfo.node3.AnonPages
      132698 ± 26%     -67.5%      43158 ± 81%  numa-meminfo.node3.Inactive(anon)
       27934 ± 30%     +61.8%      45196 ± 21%  numa-vmstat.node0.nr_anon_pages
       28306 ± 30%     +60.2%      45358 ± 21%  numa-vmstat.node0.nr_inactive_anon
       28305 ± 30%     +60.2%      45357 ± 21%  numa-vmstat.node0.nr_zone_inactive_anon
        6291 ± 24%     +68.0%      10567 ± 26%  numa-vmstat.node2.workingset_nodes
       32925 ± 27%     -67.9%      10571 ± 82%  numa-vmstat.node3.nr_anon_pages
       33182 ± 26%     -67.5%      10786 ± 81%  numa-vmstat.node3.nr_inactive_anon
       33182 ± 26%     -67.5%      10786 ± 81%  numa-vmstat.node3.nr_zone_inactive_anon
      161.78 ±  4%     -28.2%     116.10 ± 30%  sched_debug.cfs_rq:/.runnable_avg.avg
      161.46 ±  4%     -28.2%     115.85 ± 30%  sched_debug.cfs_rq:/.util_avg.avg
      426382           +11.0%     473345 ±  6%  sched_debug.cpu.clock.avg
      426394           +11.0%     473357 ±  6%  sched_debug.cpu.clock.max
      426370           +11.0%     473331 ±  6%  sched_debug.cpu.clock.min
      426139           +10.9%     472586 ±  6%  sched_debug.cpu.clock_task.avg
      426368           +11.0%     473130 ±  6%  sched_debug.cpu.clock_task.max
      416196           +11.1%     462228 ±  6%  sched_debug.cpu.clock_task.min
        1156 ±  7%     -10.8%       1031 ±  6%  sched_debug.cpu.curr->pid.stddev
      426372           +11.0%     473334 ±  6%  sched_debug.cpu_clk
      425355           +11.0%     472318 ±  6%  sched_debug.ktime
      426826           +11.0%     473787 ±  6%  sched_debug.sched_clk
   1.263e+09            -7.9%  1.164e+09 ±  3%  perf-stat.i.branch-instructions
      190886 ±  5%     -10.8%     170290 ±  7%  perf-stat.i.context-switches
   1.979e+09            -8.8%  1.804e+09 ±  2%  perf-stat.i.dTLB-loads
   8.998e+08            -8.2%  8.257e+08 ±  2%  perf-stat.i.dTLB-stores
   6.455e+09            -8.0%  5.938e+09 ±  3%  perf-stat.i.instructions
       21.78            -8.4%      19.95        perf-stat.i.metric.M/sec
     7045315 ±  4%     -14.0%    6057863 ±  6%  perf-stat.i.node-load-misses
     2658563 ±  7%     -21.9%    2077647 ± 12%  perf-stat.i.node-loads
      414822 ±  4%     -12.9%     361455 ±  3%  perf-stat.i.node-store-misses
   1.251e+09            -7.8%  1.154e+09 ±  3%  perf-stat.ps.branch-instructions
      189082 ±  5%     -10.7%     168849 ±  7%  perf-stat.ps.context-switches
    1.96e+09            -8.8%  1.789e+09 ±  2%  perf-stat.ps.dTLB-loads
   8.912e+08            -8.1%  8.187e+08 ±  2%  perf-stat.ps.dTLB-stores
   6.393e+09            -7.9%  5.888e+09 ±  3%  perf-stat.ps.instructions
     6978485 ±  4%     -13.9%    6006510 ±  6%  perf-stat.ps.node-load-misses
     2633627 ±  7%     -21.8%    2060033 ± 12%  perf-stat.ps.node-loads
      410822 ±  4%     -12.8%     358289 ±  3%  perf-stat.ps.node-store-misses


If you fix the issue, kindly add following tag
| Reported-by: kernel test robot <yujie.liu@xxxxxxxxx>
| Link: https://lore.kernel.org/r/202210081045.77ddf59b-yujie.liu@xxxxxxxxx


To reproduce:

          git clone https://github.com/intel/lkp-tests.git
          cd lkp-tests
          sudo bin/lkp install job.yaml           # job file is attached in this email
          bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
          sudo bin/lkp run generated-yaml-file

          # if come across any failure that blocks the test,
          # please remove ~/.lkp and /lkp dir to run from a clean state.


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


_______________________________________________
LKP mailing list -- lkp@xxxxxxxxxxxx
To unsubscribe send an email to lkp-leave@xxxxxxxxxxxx

.





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux