RE: [PATCH v3] mmc: rtsx: improve performance for multi block rw

> -----Original Message-----
> From: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> Sent: Tuesday, December 28, 2021 10:05 PM
> To: Ricky WU <ricky_wu@xxxxxxxxxxx>
> Cc: tommyhebb@xxxxxxxxx; linux-mmc@xxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v3] mmc: rtsx: improve performance for multi block rw
> 
> On Fri, 24 Dec 2021 at 08:23, Ricky WU <ricky_wu@xxxxxxxxxxx> wrote:
> >
> > > -----Original Message-----
> > > From: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> > > Sent: Thursday, December 23, 2021 6:37 PM
> > > To: Ricky WU <ricky_wu@xxxxxxxxxxx>
> > > Cc: tommyhebb@xxxxxxxxx; linux-mmc@xxxxxxxxxxxxxxx;
> > > linux-kernel@xxxxxxxxxxxxxxx
> > > Subject: Re: [PATCH v3] mmc: rtsx: improve performance for multi
> > > block rw
> > >
> > > On Thu, 23 Dec 2021 at 11:27, Ricky WU <ricky_wu@xxxxxxxxxxx> wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> > > > > Sent: Tuesday, December 21, 2021 8:51 PM
> > > > > To: Ricky WU <ricky_wu@xxxxxxxxxxx>
> > > > > Cc: tommyhebb@xxxxxxxxx; linux-mmc@xxxxxxxxxxxxxxx;
> > > > > linux-kernel@xxxxxxxxxxxxxxx
> > > > > Subject: Re: [PATCH v3] mmc: rtsx: improve performance for multi
> > > > > block rw
> > > > >
> > > > > > On Tue, 21 Dec 2021 at 13:24, Ricky WU <ricky_wu@xxxxxxxxxxx> wrote:
> > > > > >
> > > > > > Improving performance for the CMD is multi-block read/write
> > > > > > and the data is sequential.
> > > > > > sd_check_multi_seq() to distinguish multi-block RW (CMD 18/25)
> > > > > > or normal RW (CMD 17/24) if the CMD is multi-block and the
> > > > > > data is sequential then call to sd_rw_multi_seq()
> > > > > >
> > > > > > This patch mainly to control the timing of reply at CMD 12/13.
> > > > > > Originally code driver reply CMD 12/13 at every RW (CMD 18/25).
> > > > > > The new code to distinguish multi-block RW(CMD 18/25) and the
> > > > > > data is sequential or not, if the data is sequential RW driver
> > > > > > do not send CMD
> > > > > > 12 and bypass CMD 13 until wait the different direction RW CMD
> > > > > > or trigger the delay_work to sent CMD 12.
> > > > > >
> > > > > > run benchmark result as below:
> > > > > > SD Card : Samsumg Pro Plus 128GB Number of Samples:100, Sample
> > > > > > Size:10MB <Before> Read : 86.9 MB/s, Write : 38.3 MB/s <After>
> > > > > > Read : 91.5 MB/s, Write : 55.5 MB/s
> > > > >
> > > > > A much nicer commit message, thanks a lot! Would you mind
> > > > > running some additional tests, like random I/O read/writes?
> > > > >
> > > > > Also, please specify the benchmark tool and command you are using.
> > > > > In the meantime, I will continue to look at the code.
> > > > >
> > > >
> > > > The Tool just use Ubuntu internal GUI benchmark Tool in the "Disks"
> > > > and the Tool don't have random I/O to choice...
> > > >
> > > > Do you have any suggestion for testing random I/O But we think
> > > > random I/O will not change much
> > >
> > > I would probably look into using fio,
> > > https://fio.readthedocs.io/en/latest/
> > >
> >
> > Filled random I/O data
> > Before the patch:
> > CMD (Randread):
> > sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > -bs=1M -rw=randread
> 
> Thanks for running the tests! Overall, I would not expect an impact on the
> throughput when using a big blocksize like 1M. This is also pretty clear from
> the result you have provided.
> 
> However, especially for random writes and reads, we want to try with smaller
> blocksizes. Like 8k or 16k, would you mind running another round of tests to
> see how that works out?
> 

Here are the random I/O results for 8k and 16k block sizes:

Before(randread)
8k:
Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest -bs=8k -rw=randread
mytest: (g=0): rw=randread, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=psync, iodepth=1
result:
Run status group 0 (all jobs):
   READ: bw=16.5MiB/s (17.3MB/s), 16.5MiB/s-16.5MiB/s (17.3MB/s-17.3MB/s), io=1024MiB (1074MB), run=62019-62019msec
Disk stats (read/write):
  mmcblk0: ios=130757/0, merge=0/0, ticks=57751/0, in_queue=57751, util=99.89%

16k:
Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest -bs=16k -rw=randread
mytest: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
result:
Run status group 0 (all jobs):
   READ: bw=23.3MiB/s (24.4MB/s), 23.3MiB/s-23.3MiB/s (24.4MB/s-24.4MB/s), io=1024MiB (1074MB), run=44034-44034msec
Disk stats (read/write):
  mmcblk0: ios=65333/0, merge=0/0, ticks=39420/0, in_queue=39420, util=99.84%

Before(randwrite)
8k:
Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=100M -name=mytest -bs=8k -rw=randwrite
mytest: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=psync, iodepth=1
result:
Run status group 0 (all jobs):
  WRITE: bw=4060KiB/s (4158kB/s), 4060KiB/s-4060KiB/s (4158kB/s-4158kB/s), io=100MiB (105MB), run=25220-25220msec
Disk stats (read/write):
  mmcblk0: ios=51/12759, merge=0/0, ticks=80/24154, in_queue=24234, util=99.90%

16k:
Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=100M -name=mytest -bs=16k -rw=randwrite
mytest: (g=0): rw=randwrite, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
result:
Run status group 0 (all jobs):
  WRITE: bw=7201KiB/s (7373kB/s), 7201KiB/s-7201KiB/s (7373kB/s-7373kB/s), io=100MiB (105MB), run=14221-14221msec
Disk stats (read/write):
  mmcblk0: ios=51/6367, merge=0/0, ticks=82/13647, in_queue=13728, util=99.81%
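
For anyone who wants to collect these numbers from saved logs instead of reading them by eye, here is a minimal sketch of a parser for the "Run status" summary lines above. The names parse_bw_kib, BW_RE and UNIT_TO_KIB are hypothetical helpers, not part of fio, and the pattern only covers the KiB/s, MiB/s and GiB/s forms, so other units that fio may print are simply skipped.

```python
import re

# Matches the bandwidth field of a fio "Run status" summary line, e.g.
#   READ: bw=16.5MiB/s (17.3MB/s), ...
# Assumption: only binary units KiB/MiB/GiB are handled here.
BW_RE = re.compile(r"^\s*(READ|WRITE): bw=([0-9.]+)(KiB|MiB|GiB)/s")
UNIT_TO_KIB = {"KiB": 1.0, "MiB": 1024.0, "GiB": 1024.0 * 1024.0}


def parse_bw_kib(line):
    """Return (direction, bandwidth in KiB/s), or None if the line does not match."""
    match = BW_RE.match(line)
    if match is None:
        return None
    direction, value, unit = match.groups()
    return direction, float(value) * UNIT_TO_KIB[unit]


if __name__ == "__main__":
    sample = "   READ: bw=16.5MiB/s (17.3MB/s), 16.5MiB/s-16.5MiB/s (17.3MB/s-17.3MB/s)"
    print(parse_bw_kib(sample))
```

Converting everything to KiB/s makes the read and write rows directly comparable, since fio reports them in different units here.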


After(randread)
8k:
Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest -bs=8k -rw=randread
mytest: (g=0): rw=randread, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=psync, iodepth=1
result:
Run status group 0 (all jobs):
   READ: bw=12.4MiB/s (13.0MB/s), 12.4MiB/s-12.4MiB/s (13.0MB/s-13.0MB/s), io=1024MiB (1074MB), run=82397-82397msec
Disk stats (read/write):
  mmcblk0: ios=130640/0, merge=0/0, ticks=74125/0, in_queue=74125, util=99.94%

16k:
Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest -bs=16k -rw=randread
mytest: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
result:
Run status group 0 (all jobs):
   READ: bw=20.0MiB/s (21.0MB/s), 20.0MiB/s-20.0MiB/s (21.0MB/s-21.0MB/s), io=1024MiB (1074MB), run=51076-51076msec
Disk stats (read/write):
  mmcblk0: ios=65282/0, merge=0/0, ticks=46255/0, in_queue=46254, util=99.87%

After(randwrite)
8k:
Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=100M -name=mytest -bs=8k -rw=randwrite
mytest: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=psync, iodepth=1
result:
Run status group 0 (all jobs):
  WRITE: bw=4215KiB/s (4317kB/s), 4215KiB/s-4215KiB/s (4317kB/s-4317kB/s), io=100MiB (105MB), run=24292-24292msec
Disk stats (read/write):
  mmcblk0: ios=52/12717, merge=0/0, ticks=86/23182, in_queue=23267, util=99.92%

16k:
Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread -group_reporting -ioengine=psync -iodepth=1 -size=100M -name=mytest -bs=16k -rw=randwrite
mytest: (g=0): rw=randwrite, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
result:
Run status group 0 (all jobs):
  WRITE: bw=6499KiB/s (6655kB/s), 6499KiB/s-6499KiB/s (6655kB/s-6655kB/s), io=100MiB (105MB), run=15756-15756msec
Disk stats (read/write):
  mmcblk0: ios=51/6347, merge=0/0, ticks=84/15120, in_queue=15204, util=99.80%
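
To make the before/after comparison easier to read, the relative change can be computed from the bandwidth figures above. This is only a sketch: the values are copied by hand from the "Run status" lines in this mail, and RESULTS, percent_change and summarize are hypothetical names used for illustration.

```python
# Bandwidth figures copied from the fio "Run status" lines in this mail,
# stored as (before, after). Reads are in MiB/s and writes in KiB/s; only
# ratios are computed, so units never mix within a row.
RESULTS = {
    ("randread", "8k"): (16.5, 12.4),
    ("randread", "16k"): (23.3, 20.0),
    ("randwrite", "8k"): (4060.0, 4215.0),
    ("randwrite", "16k"): (7201.0, 6499.0),
}


def percent_change(before, after):
    """Relative change of `after` versus `before`, in percent."""
    return (after - before) / before * 100.0


def summarize(results):
    """Return {(rw, bs): percent change} for every measured case."""
    return {key: percent_change(b, a) for key, (b, a) in results.items()}


if __name__ == "__main__":
    for (rw, bs), delta in summarize(RESULTS).items():
        print(f"{rw:9s} {bs:>3s}: {delta:+.1f}%")
```

Each case here is a single run, so the percentages say nothing about run-to-run variance.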

> I haven't yet been able to provide you with comments on the code, but I am
> looking into it.
> 
> Kind regards
> Uffe
> 
> >
> > mytest: (g=0): rw=randread, bs=(R) 1024KiB-1024KiB, (W)
> > 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
> > fio-3.16
> > Starting 1 thread
> > Jobs: 1 (f=1): [r(1)][100.0%][r=86.0MiB/s][r=86 IOPS][eta 00m:00s]
> > mytest: (groupid=0, jobs=1): err= 0: pid=2663: Fri Dec 24 14:28:33 2021
> >   read: IOPS=85, BW=85.1MiB/s (89.3MB/s)(1024MiB/12026msec)
> >     clat (usec): min=11253, max=34579, avg=11735.57, stdev=742.16
> >      lat (usec): min=11254, max=34580, avg=11736.34, stdev=742.16
> >     clat percentiles (usec):
> >      |  1.00th=[11338],  5.00th=[11469], 10.00th=[11600], 20.00th=[11600],
> >      | 30.00th=[11600], 40.00th=[11600], 50.00th=[11731], 60.00th=[11731],
> >      | 70.00th=[11863], 80.00th=[11863], 90.00th=[11863], 95.00th=[11863],
> >      | 99.00th=[11863], 99.50th=[12518], 99.90th=[15664], 99.95th=[34341],
> >      | 99.99th=[34341]
> >    bw (  KiB/s): min=81920, max=88064, per=99.91%, avg=87110.67, stdev=1467.81, samples=24
> >    iops        : min=   80, max=   86, avg=85.00, stdev= 1.41, samples=24
> >   lat (msec)   : 20=99.90%, 50=0.10%
> >   cpu          : usr=0.17%, sys=1.26%, ctx=2048, majf=0, minf=256
> >   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> >      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >      issued rwts: total=1024,0,0,0 short=0,0,0,0 dropped=0,0,0,0
> >      latency   : target=0, window=0, percentile=100.00%, depth=1
> >
> > Run status group 0 (all jobs):
> >    READ: bw=85.1MiB/s (89.3MB/s), 85.1MiB/s-85.1MiB/s
> > (89.3MB/s-89.3MB/s), io=1024MiB (1074MB), run=12026-12026msec
> >
> > Disk stats (read/write):
> >   mmcblk0: ios=2026/0, merge=0/0, ticks=17612/0, in_queue=17612,
> > util=99.23%
> >
> > CMD (Randwrite):
> > sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > -bs=1M -rw=randwrite
> >
> > mytest: (g=0): rw=randwrite, bs=(R) 1024KiB-1024KiB, (W)
> > 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
> > fio-3.16
> > Starting 1 thread
> > Jobs: 1 (f=1): [w(1)][100.0%][w=41.0MiB/s][w=41 IOPS][eta 00m:00s]
> > mytest: (groupid=0, jobs=1): err= 0: pid=2738: Fri Dec 24 14:30:05 2021
> >   write: IOPS=38, BW=38.4MiB/s (40.2MB/s)(1024MiB/26695msec); 0 zone resets
> >     clat (usec): min=18862, max=94708, avg=25990.34, stdev=9227.22
> >      lat (usec): min=18910, max=94781, avg=26061.91, stdev=9228.04
> >     clat percentiles (usec):
> >      |  1.00th=[20579],  5.00th=[22414], 10.00th=[22676], 20.00th=[22938],
> >      | 30.00th=[23200], 40.00th=[23462], 50.00th=[23462], 60.00th=[23725],
> >      | 70.00th=[23725], 80.00th=[23987], 90.00th=[24773], 95.00th=[56361],
> >      | 99.00th=[59507], 99.50th=[64226], 99.90th=[86508], 99.95th=[94897],
> >      | 99.99th=[94897]
> >    bw (  KiB/s): min=24576, max=43008, per=99.85%, avg=39221.13, stdev=3860.74, samples=53
> >    iops        : min=   24, max=   42, avg=38.30, stdev= 3.77, samples=53
> >   lat (msec)   : 20=0.98%, 50=92.38%, 100=6.64%
> >   cpu          : usr=0.50%, sys=0.31%, ctx=1024, majf=0, minf=0
> >   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> >      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >      issued rwts: total=0,1024,0,0 short=0,0,0,0 dropped=0,0,0,0
> >      latency   : target=0, window=0, percentile=100.00%, depth=1
> >
> > Run status group 0 (all jobs):
> >   WRITE: bw=38.4MiB/s (40.2MB/s), 38.4MiB/s-38.4MiB/s
> > (40.2MB/s-40.2MB/s), io=1024MiB (1074MB), run=26695-26695msec
> >
> > Disk stats (read/write):
> >   mmcblk0: ios=52/2043, merge=0/0, ticks=81/39874, in_queue=39956,
> > util=99.90%
> >
> >
> > After the patch:
> >
> > CMD (Randread):
> > sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > -bs=1M -rw=randread
> >
> > mytest: (g=0): rw=randread, bs=(R) 1024KiB-1024KiB, (W)
> > 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
> > fio-3.16
> > Starting 1 thread
> > Jobs: 1 (f=1): [r(1)][100.0%][r=87.0MiB/s][r=87 IOPS][eta 00m:00s]
> > mytest: (groupid=0, jobs=1): err= 0: pid=11614: Fri Dec 24 14:07:06 2021
> >   read: IOPS=86, BW=86.6MiB/s (90.8MB/s)(1024MiB/11828msec)
> >     clat (usec): min=11068, max=32423, avg=11543.12, stdev=733.86
> >      lat (usec): min=11069, max=32424, avg=11543.85, stdev=733.87
> >     clat percentiles (usec):
> >      |  1.00th=[11076],  5.00th=[11338], 10.00th=[11469], 20.00th=[11469],
> >      | 30.00th=[11469], 40.00th=[11469], 50.00th=[11469], 60.00th=[11600],
> >      | 70.00th=[11600], 80.00th=[11600], 90.00th=[11600], 95.00th=[11600],
> >      | 99.00th=[11600], 99.50th=[11731], 99.90th=[21627], 99.95th=[32375],
> >      | 99.99th=[32375]
> >    bw (  KiB/s): min=83968, max=90112, per=99.94%, avg=88598.26, stdev=1410.46, samples=23
> >    iops        : min=   82, max=   88, avg=86.52, stdev= 1.38, samples=23
> >   lat (msec)   : 20=99.80%, 50=0.20%
> >   cpu          : usr=0.09%, sys=1.40%, ctx=2048, majf=0, minf=256
> >   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> >      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >      issued rwts: total=1024,0,0,0 short=0,0,0,0 dropped=0,0,0,0
> >      latency   : target=0, window=0, percentile=100.00%, depth=1
> >
> > Run status group 0 (all jobs):
> >    READ: bw=86.6MiB/s (90.8MB/s), 86.6MiB/s-86.6MiB/s
> > (90.8MB/s-90.8MB/s), io=1024MiB (1074MB), run=11828-11828msec
> >
> > Disk stats (read/write):
> >   mmcblk0: ios=2016/0, merge=0/0, ticks=17397/0, in_queue=17397,
> > util=99.21%
> >
> > CMD (Randwrite):
> > sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > -bs=1M -rw=randwrite
> >
> > mytest: (g=0): rw=randwrite, bs=(R) 1024KiB-1024KiB, (W)
> > 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
> > fio-3.16
> > Starting 1 thread
> > Jobs: 1 (f=1): [w(1)][100.0%][w=50.0MiB/s][w=50 IOPS][eta 00m:00s]
> > mytest: (groupid=0, jobs=1): err= 0: pid=11668: Fri Dec 24 14:08:36 2021
> >   write: IOPS=39, BW=39.3MiB/s (41.2MB/s)(1024MiB/26059msec); 0 zone resets
> >     clat (msec): min=16, max=118, avg=25.37, stdev=16.34
> >      lat (msec): min=16, max=118, avg=25.44, stdev=16.34
> >     clat percentiles (msec):
> >      |  1.00th=[   17],  5.00th=[   20], 10.00th=[   20], 20.00th=[   20],
> >      | 30.00th=[   20], 40.00th=[   20], 50.00th=[   20], 60.00th=[   20],
> >      | 70.00th=[   21], 80.00th=[   21], 90.00th=[   52], 95.00th=[   75],
> >      | 99.00th=[   78], 99.50th=[  104], 99.90th=[  114], 99.95th=[  120],
> >      | 99.99th=[  120]
> >    bw (  KiB/s): min=20480, max=51200, per=99.93%, avg=40211.69, stdev=10498.00, samples=52
> >    iops        : min=   20, max=   50, avg=39.27, stdev=10.25, samples=52
> >   lat (msec)   : 20=72.95%, 50=16.80%, 100=9.57%, 250=0.68%
> >   cpu          : usr=0.41%, sys=0.38%, ctx=1024, majf=0, minf=0
> >   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> >      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >      issued rwts: total=0,1024,0,0 short=0,0,0,0 dropped=0,0,0,0
> >      latency   : target=0, window=0, percentile=100.00%, depth=1
> >
> > Run status group 0 (all jobs):
> >   WRITE: bw=39.3MiB/s (41.2MB/s), 39.3MiB/s-39.3MiB/s
> > (41.2MB/s-41.2MB/s), io=1024MiB (1074MB), run=26059-26059msec
> >
> > Disk stats (read/write):
> >   mmcblk0: ios=51/2031, merge=0/0, ticks=84/40061, in_queue=40144,
> > util=99.89%
> >
> > BR,
> > Ricky



