> -----Original Message-----
> From: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> Sent: Monday, February 7, 2022 7:11 PM
> To: Ricky WU <ricky_wu@xxxxxxxxxxx>
> Cc: tommyhebb@xxxxxxxxx; linux-mmc@xxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v3] mmc: rtsx: improve performance for multi block rw
>
> [...]
>
> > > > > >
> > > > > > Do you have any suggestion for testing random I/O But we think
> > > > > > random I/O will not change much
> > > > >
> > > > > I would probably look into using fio,
> > > > > https://fio.readthedocs.io/en/latest/
> > > > >
> > > >
> > > > Filled random I/O data
> > > > Before the patch:
> > > > CMD (Randread):
> > > > sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > > > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > > > -bs=1M -rw=randread
> > >
> > > Thanks for running the tests! Overall, I would not expect an impact
> > > on the throughput when using a big blocksize like 1M. This is also
> > > pretty clear from the result you have provided.
> > >
> > > However, especially for random writes and reads, we want to try with
> > > smaller blocksizes. Like 8k or 16k, would you mind running another
> > > round of tests to see how that works out?
> > >
> >
> > Filled random I/O data(8k/16k)
>
> Hi Ricky,
>
> Apologize for the delay! Thanks for running the tests. Let me comment on
> them below.
>
> >
> > Before(randread)
> > 8k:
> > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > -bs=8k -rw=randread
> > mytest: (g=0): rw=randread, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T)
> > 8192B-8192B, ioengine=psync, iodepth=1
> > result:
> > Run status group 0 (all jobs):
> > READ: bw=16.5MiB/s (17.3MB/s), 16.5MiB/s-16.5MiB/s
> > (17.3MB/s-17.3MB/s), io=1024MiB (1074MB), run=62019-62019msec
> > Disk stats (read/write):
> > mmcblk0: ios=130757/0, merge=0/0, ticks=57751/0, in_queue=57751,
> > util=99.89%
> >
> > 16k:
> > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > -bs=16k -rw=randread
> > mytest: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W)
> > 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
> > result:
> > Run status group 0 (all jobs):
> > READ: bw=23.3MiB/s (24.4MB/s), 23.3MiB/s-23.3MiB/s
> > (24.4MB/s-24.4MB/s), io=1024MiB (1074MB), run=44034-44034msec
> > Disk stats (read/write):
> > mmcblk0: ios=65333/0, merge=0/0, ticks=39420/0, in_queue=39420,
> > util=99.84%
> >
> > Before(randwrite)
> > 8k:
> > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > -group_reporting -ioengine=psync -iodepth=1 -size=100M -name=mytest
> > -bs=8k -rw=randwrite
> > mytest: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T)
> > 8192B-8192B, ioengine=psync, iodepth=1
> > result:
> > Run status group 0 (all jobs):
> > WRITE: bw=4060KiB/s (4158kB/s), 4060KiB/s-4060KiB/s
> > (4158kB/s-4158kB/s), io=100MiB (105MB), run=25220-25220msec
> > Disk stats (read/write):
> > mmcblk0: ios=51/12759, merge=0/0, ticks=80/24154, in_queue=24234,
> > util=99.90%
> >
> > 16k:
> > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > -group_reporting -ioengine=psync -iodepth=1 -size=100M -name=mytest
> > -bs=16k -rw=randwrite
> > mytest: (g=0): rw=randwrite, bs=(R)
16.0KiB-16.0KiB, (W)
> > 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
> > result:
> > Run status group 0 (all jobs):
> > WRITE: bw=7201KiB/s (7373kB/s), 7201KiB/s-7201KiB/s
> > (7373kB/s-7373kB/s), io=100MiB (105MB), run=14221-14221msec
> > Disk stats (read/write):
> > mmcblk0: ios=51/6367, merge=0/0, ticks=82/13647, in_queue=13728,
> > util=99.81%
> >
> >
> > After(randread)
> > 8k:
> > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > -bs=8k -rw=randread
> > mytest: (g=0): rw=randread, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T)
> > 8192B-8192B, ioengine=psync, iodepth=1
> > result:
> > Run status group 0 (all jobs):
> > READ: bw=12.4MiB/s (13.0MB/s), 12.4MiB/s-12.4MiB/s
> > (13.0MB/s-13.0MB/s), io=1024MiB (1074MB), run=82397-82397msec
> > Disk stats (read/write):
> > mmcblk0: ios=130640/0, merge=0/0, ticks=74125/0, in_queue=74125,
> > util=99.94%
> >
> > 16k:
> > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > -group_reporting -ioengine=psync -iodepth=1 -size=1G -name=mytest
> > -bs=16k -rw=randread
> > mytest: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W)
> > 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
> > result:
> > Run status group 0 (all jobs):
> > READ: bw=20.0MiB/s (21.0MB/s), 20.0MiB/s-20.0MiB/s
> > (21.0MB/s-21.0MB/s), io=1024MiB (1074MB), run=51076-51076msec
> > Disk stats (read/write):
> > mmcblk0: ios=65282/0, merge=0/0, ticks=46255/0, in_queue=46254,
> > util=99.87%
> >
> > After(randwrite)
> > 8k:
> > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > -group_reporting -ioengine=psync -iodepth=1 -size=100M -name=mytest
> > -bs=8k -rw=randwrite
> > mytest: (g=0): rw=randwrite, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T)
> > 8192B-8192B, ioengine=psync, iodepth=1
> > result:
> > Run status group 0 (all jobs):
> > WRITE: bw=4215KiB/s (4317kB/s), 4215KiB/s-4215KiB/s
> > (4317kB/s-4317kB/s),
io=100MiB (105MB), run=24292-24292msec
> > Disk stats (read/write):
> > mmcblk0: ios=52/12717, merge=0/0, ticks=86/23182, in_queue=23267,
> > util=99.92%
> >
> > 16k:
> > Cmd: sudo fio -filename=/dev/mmcblk0 -direct=1 -numjobs=1 -thread
> > -group_reporting -ioengine=psync -iodepth=1 -size=100M -name=mytest
> > -bs=16k -rw=randwrite
> > mytest: (g=0): rw=randwrite, bs=(R) 16.0KiB-16.0KiB, (W)
> > 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=psync, iodepth=1
> > result:
> > Run status group 0 (all jobs):
> > WRITE: bw=6499KiB/s (6655kB/s), 6499KiB/s-6499KiB/s
> > (6655kB/s-6655kB/s), io=100MiB (105MB), run=15756-15756msec
> > Disk stats (read/write):
> > mmcblk0: ios=51/6347, merge=0/0, ticks=84/15120, in_queue=15204,
> > util=99.80%
>
> It looks like the rand-read tests above are degrading with the new changes,
> while rand-writes are both improving and degrading.
>
> To summarize my view from all the tests you have done at this point (thanks a
> lot); it looks like the block I/O merging isn't really happening at common
> blocklayer, at least to that extent that would benefit us. Clearly you have shown
> that by the suggested change in the mmc host driver, by detecting whether the
> "next" request is sequential to the previous one, which allows us to skip a
> CMD12 and minimize some command overhead.
>
> However, according to the latest tests above, you have also proved that the
> changes in the mmc host driver doesn't come without a cost.
> In particular, small random-reads would degrade in performance from these
> changes.
>
> That said, it looks to me that rather than trying to improve things for one
> specific mmc host driver, it would be better to look at this from the generic
> block layer point of view - and investigate why sequential reads/writes aren't
> getting merged often enough for the MMC/SD case. If we can fix the problem
> there, all mmc host drivers would benefit I assume.
>

So you are thinking about how to patch this in MMC/SD?
I don't know whether this method is compatible with other MMC hosts, or
whether they would need to patch other code in their host drivers.

> BTW, have you tried with different I/O schedulers? If you haven't tried BFQ, I
> suggest you do as it's a good fit for MMC/SD.
>

I don't know what you mean by "different I/O schedulers".

> [...]
>
> Kind regards
> Uffe
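
For reference, the before/after fio figures quoted in this thread can be turned into relative changes with a few lines of Python. This is only a sketch: the bandwidth values are copied from the results above, while the names `RESULTS`, `percent_change` and `summarize` are made up for illustration and are not part of fio or the kernel.

```python
# Bandwidth figures copied from the fio output quoted in this thread, in KiB/s.
# Values reported in MiB/s are converted with 1 MiB = 1024 KiB.
# Each entry maps (workload, block size) -> (before patch, after patch).
RESULTS = {
    ("randread", "8k"): (16.5 * 1024, 12.4 * 1024),
    ("randread", "16k"): (23.3 * 1024, 20.0 * 1024),
    ("randwrite", "8k"): (4060.0, 4215.0),
    ("randwrite", "16k"): (7201.0, 6499.0),
}


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def summarize(results):
    """Return {(workload, block size): percent change}, rounded to one decimal."""
    return {
        key: round(percent_change(before, after), 1)
        for key, (before, after) in results.items()
    }


if __name__ == "__main__":
    for (workload, block_size), change in summarize(RESULTS).items():
        print(f"{workload:9s} bs={block_size:3s} {change:+.1f}%")
```

With the numbers above this gives roughly -24.8% (8k) and -14.2% (16k) for random reads, and about +3.8% (8k) and -9.7% (16k) for random writes, which matches the observation that random reads degrade while random writes go both ways.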