On Mon, Nov 21, 2022 at 2:19 PM Michael Wu <michael@xxxxxxxxxxxxxxxxx> wrote:
>
> On 11/18/2022 7:43 PM, Wenchao Chen wrote:
> > On Fri, Nov 18, 2022 at 1:52 PM Michael Wu <michael@xxxxxxxxxxxxxxxxx> wrote:
> >>
> >> The current next_tag selection can add a large delay to some requests
> >> and defeat the ordering chosen by the block layer scheduler, because
> >> the tags of issued mrqs are not guaranteed to be sequential, especially
> >> under heavy IO load. In a fio performance test with 4k random reads, we
> >> found that a request could wait nearly 200ms between being sent to
> >> mmc_hsq and request_atomic being called for it, while mmc_hsq had
> >> already processed thousands of other requests. So we use a FIFO here to
> >> preserve the first-in, first-out order of requests and avoid adding
> >> extra delay to any request.
> >>
> >
> > Hi Michael,
> > Is the test device an eMMC?
> > Could you share the fio test command?
> > Can you provide more logs?
> >
> Hi Wenchao,
> Yes, the tested device is an eMMC.
> The test command we used is `./fio -name=Rand_Read_IOPS_Test
> -group_reporting -rw=random -bs=4K -numjobs=8 -directory=/data/data
> -size=1G -io_size=64M -nrfiles=1 -direct=1 -thread && rm
> /data/Rand_Read_IOPS_Test*`, which replaces the androidbench random read
> performance test; the file size is set to 1G with an 8-thread test
> configuration. /data uses f2fs and /data/data is a file-encrypted path.
>
> With the hsq configuration enabled, we can clearly see from the fio test
> log below that random read IOPS ranges from a minimum of 3175 to a
> maximum of 8554, and the maximum IO completion latency is about 200ms.
> ```
> clat percentiles (usec):
>  |  1.00th=[   498],  5.00th=[   865], 10.00th=[   963], 20.00th=[  1045],
>  | 30.00th=[  1090], 40.00th=[  1139], 50.00th=[  1172], 60.00th=[  1221],
>  | 70.00th=[  1254], 80.00th=[  1319], 90.00th=[  1401], 95.00th=[  1614],
>  | 99.00th=[  2769], 99.50th=[  3589], 99.90th=[ 31589], 99.95th=[ 66323],
>  | 99.99th=[200279]
> bw (  KiB/s): min=12705, max=34225, per=100.00%, avg=23931.79, stdev=497.40, samples=345
> iops        : min= 3175, max= 8554, avg=5981.67, stdev=124.38, samples=345
> ```
>
> ```
> clat percentiles (usec):
>  |  1.00th=[  799],  5.00th=[  938], 10.00th=[  963], 20.00th=[  979],
>  | 30.00th=[  996], 40.00th=[ 1004], 50.00th=[ 1020], 60.00th=[ 1045],
>  | 70.00th=[ 1074], 80.00th=[ 1106], 90.00th=[ 1172], 95.00th=[ 1237],
>  | 99.00th=[ 1450], 99.50th=[ 1516], 99.90th=[ 1762], 99.95th=[ 2180],
>  | 99.99th=[ 9503]
> bw (  KiB/s): min=29200, max=30944, per=100.00%, avg=30178.91, stdev=53.45, samples=272
> iops        : min= 7300, max= 7736, avg=7544.62, stdev=13.38, samples=272
> ```
> With hsq NOT enabled, random read IOPS ranges from a minimum of 7300 to
> a maximum of 7736, and the maximum IO latency is only about 9ms.
> Finally, we added debug output to the mmc driver and traced the 200ms
> hsq delay to hsq's next_tag selection.
>
Thank you very much for your log. This patch can reduce latency, but I have some questions:
1. fio's -rw option does not accept "random", but it does accept "randread". Did you use randread?
   In addition, does "io_size=64M" mean that only 64M of data is tested?
   Refer to the fio documentation: https://fio.readthedocs.io/en/latest/fio_doc.html?highlight=io_size#cmdoption-arg-io-size
2. The naming style of "tag_tail" should remain consistent with that of "next_tag". Would "tail_tag" be better?
3. It would be better to also provide a comparison of sequential read, sequential write, and random write.
> ---
> Michael Wu
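
For readers following the thread, here is a minimal, self-contained C sketch of the ordering issue being discussed: scanning slots round-robin from a next_tag position can issue a later-arriving request ahead of an earlier one, while a FIFO of tags preserves arrival order. The structures and function names below (pick_next_round_robin, fifo_push, fifo_pop, HSQ_NUM_SLOTS) are illustrative assumptions for demonstration only, not the actual mmc_hsq implementation or the patch under review.

```c
/*
 * Hypothetical illustration (not the real mmc_hsq code): compare picking
 * the next slot by scanning tags round-robin from next_tag with pulling
 * tags in the order they were queued (FIFO).
 */
#include <stdbool.h>
#include <stdio.h>

#define HSQ_NUM_SLOTS 8

struct slot {
	bool busy;	/* a request currently occupies this slot */
};

/*
 * Round-robin style: scan slots starting from *next_tag. A request whose
 * tag sits just behind the scan position can be passed over while requests
 * that arrived later are issued first.
 */
static int pick_next_round_robin(struct slot *slots, int *next_tag)
{
	for (int i = 0; i < HSQ_NUM_SLOTS; i++) {
		int tag = (*next_tag + i) % HSQ_NUM_SLOTS;

		if (slots[tag].busy) {
			*next_tag = (tag + 1) % HSQ_NUM_SLOTS;
			return tag;
		}
	}
	return -1;
}

/*
 * FIFO style: record tags in arrival order and always issue the oldest,
 * so no request can be starved by later arrivals.
 */
struct fifo {
	int tags[HSQ_NUM_SLOTS];
	int head, tail, count;
};

static void fifo_push(struct fifo *f, int tag)
{
	f->tags[f->tail] = tag;
	f->tail = (f->tail + 1) % HSQ_NUM_SLOTS;
	f->count++;
}

static int fifo_pop(struct fifo *f)
{
	int tag;

	if (!f->count)
		return -1;
	tag = f->tags[f->head];
	f->head = (f->head + 1) % HSQ_NUM_SLOTS;
	f->count--;
	return tag;
}

int main(void)
{
	struct slot slots[HSQ_NUM_SLOTS] = { 0 };
	struct fifo fifo = { 0 };
	int next_tag = 0;
	/* Requests arrive out of tag order: tag 5 first, then tag 1, then 7. */
	int arrival[] = { 5, 1, 7 };

	for (int i = 0; i < 3; i++) {
		slots[arrival[i]].busy = true;
		fifo_push(&fifo, arrival[i]);
	}

	/* The scan from next_tag=0 issues tag 1 before tag 5, inverting
	 * arrival order; the FIFO preserves it. */
	printf("round-robin first pick: %d\n", pick_next_round_robin(slots, &next_tag));
	printf("fifo first pick       : %d\n", fifo_pop(&fifo));
	return 0;
}
```

The FIFO variant matches the intent stated in the patch description: issue order follows arrival order, so a single request cannot wait hundreds of milliseconds while later tags are repeatedly picked ahead of it.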