Hi Kuai,

Thanks a lot for following up on this!

On Mon, Aug 26, 2024 at 9:31 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2024/08/23 20:05, Lance Yang wrote:
> > My bad, I got tied up with some stuff :(
> >
> > Hmm... tried your debug patch today, but my test results are different
> > from yours. So let's take a look at direct IO with a raw disk first.
> >
> > ```
> > $ lsblk
> > NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
> > sda      8:0    0   90G  0 disk
> > ├─sda1   8:1    0    1G  0 part /boot/efi
> > └─sda2   8:2    0 88.9G  0 part /
> > sdb      8:16   0   10G  0 disk
> >
> > $ cat /sys/block/sda/queue/scheduler
> > none [mq-deadline]
> >
> > $ cat /sys/block/sda/queue/rotational
> > 0
> >
> > $ cat /sys/block/sdb/queue/rotational
> > 0
> >
> > $ cat /sys/block/sdb/queue/scheduler
> > none [mq-deadline]
> >
> > $ cat /boot/config-6.11.0-rc3+ | grep CONFIG_CGROUP_
> > # CONFIG_CGROUP_FAVOR_DYNMODS is not set
> > CONFIG_CGROUP_WRITEBACK=y
> > CONFIG_CGROUP_SCHED=y
> > CONFIG_CGROUP_PIDS=y
> > CONFIG_CGROUP_RDMA=y
> > CONFIG_CGROUP_FREEZER=y
> > CONFIG_CGROUP_HUGETLB=y
> > CONFIG_CGROUP_DEVICE=y
> > CONFIG_CGROUP_CPUACCT=y
> > CONFIG_CGROUP_PERF=y
> > CONFIG_CGROUP_BPF=y
> > CONFIG_CGROUP_MISC=y
> > # CONFIG_CGROUP_DEBUG is not set
> > CONFIG_CGROUP_NET_PRIO=y
> > CONFIG_CGROUP_NET_CLASSID=y
> >
> > $ cd /sys/fs/cgroup/test/ && cat cgroup.controllers
> > cpu io memory pids
> >
> > $ cat io.weight
> > default 100
> >
> > $ cat io.prio.class
> > no-change
> > ```
> >
> > With wiops, the result is as follows:
> >
> > ```
> > $ echo "8:16 wbps=10485760 wiops=100000" > io.max
> >
> > $ dd if=/dev/zero of=/dev/sdb bs=50M count=1 oflag=direct
> > 1+0 records in
> > 1+0 records out
> > 52428800 bytes (52 MB, 50 MiB) copied, 5.05893 s, 10.4 MB/s
> >
> > $ dmesg -T
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 2984 ffff0000fb3a8f00
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 6176 ffff0000fb3a97c0
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 7224 ffff0000fb3a9180
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a8640
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a9400
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a8c80
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a9040
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a92c0
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 4096 ffff0000fb3a8000
> > ```
> >
> > And without wiops, the result is quite different:
> >
> > ```
> > $ echo "8:16 wbps=10485760 wiops=max" > io.max
> >
> > $ dd if=/dev/zero of=/dev/sdb bs=50M count=1 oflag=direct
> > 1+0 records in
> > 1+0 records out
> > 52428800 bytes (52 MB, 50 MiB) copied, 5.08187 s, 10.3 MB/s
> >
> > $ dmesg -T
> > [Fri Aug 23 10:59:10 2024] __blk_throtl_bio: bio start 2880 ffff0000c74659c0
> > [Fri Aug 23 10:59:10 2024] __blk_throtl_bio: bio start 6992 ffff00014f621b80
> > [Fri Aug 23 10:59:10 2024] __blk_throtl_bio: bio start 92528 ffff00014f620dc0
>
> I don't know why IO size from fs layer is different in this case.

Me neither...

> > ```
> >
> > Then, I retested for ext4 as you did.
> >
> > ```
> > $ lsblk
> > NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
> > sda      8:0    0   90G  0 disk
> > ├─sda1   8:1    0    1G  0 part /boot/efi
> > └─sda2   8:2    0 88.9G  0 part /
> > sdb      8:16   0   10G  0 disk
> >
> > $ df -T /data
> > Filesystem     Type 1K-blocks     Used Available Use% Mounted on
> > /dev/sda2      ext4  91222760 54648704  31894224  64% /
> > ```
> >
> > With wiops, the result is as follows:
> >
> > ```
> > $ echo "8:0 wbps=10485760 wiops=100000" > io.max
> >
> > $ rm -rf /data/file1 && dd if=/dev/zero of=/data/file1 bs=50M count=1 oflag=direct
> > 1+0 records in
> > 1+0 records out
> > 52428800 bytes (52 MB, 50 MiB) copied, 5.06227 s, 10.4 MB/s
> >
> > $ dmesg -T
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 2984 ffff0000fb3a8f00
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 6176 ffff0000fb3a97c0
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 7224 ffff0000fb3a9180
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a8640
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a9400
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a8c80
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a9040
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 16384 ffff0000fb3a92c0
> > [Fri Aug 23 11:04:08 2024] __blk_throtl_bio: bio start 4096 ffff0000fb3a8000
> > ```
> >
> > And without wiops, the result is also quite different:
> >
> > ```
> > $ echo "8:0 wbps=10485760 wiops=max" > io.max
> >
> > $ rm -rf /data/file1 && dd if=/dev/zero of=/data/file1 bs=50M count=1 oflag=direct
> > 1+0 records in
> > 1+0 records out
> > 52428800 bytes (52 MB, 50 MiB) copied, 5.03759 s, 10.4 MB/s
> >
> > $ dmesg -T
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 2904 ffff0000c4e9f2c0
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 5984 ffff0000c4e9e000
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 7496 ffff0000c4e9e3c0
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9eb40
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9f540
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9e780
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9ea00
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 16384 ffff0000c4e9f900
> > [Fri Aug 23 11:05:07 2024] __blk_throtl_bio: bio start 4096 ffff0000c4e9e8c0
> > ```
>
> While ext4 is the same. And I won't say the result is different here.

Perhaps there is other subtle stuff at play since ext4 is the same?

> > Hmm... I still have two questions here:
> > 1. Is wbps an average value?
>
> Yes.
>
> > 2. What's the difference between setting 'max' and setting a very high
> > value for 'wiops'?
>
> The only difference is that:
>
> - If there is no iops limit, split IO will be dispatched directly;
> - If there is an iops limit, split IO will be throttled again. Even if
>   iops is high, blk-throtl is FIFO, so the split IO will have to wait
>   for the formal request to be throttled by bps first before the iops
>   limit is checked for the split IO.

Thanks a lot again for the lesson!
Lance

> Thanks,
> Kuai
>
> > Thanks a lot again for your time!
> > Lance
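P.S. For anyone reading along: "wbps is an average" matches the numbers in the
runs above. Every dd wrote 50 MiB under wbps=10485760 (10 MiB/s) and finished
in roughly 5 s, which is just size / rate. A trivial arithmetic check (plain
Python, nothing kernel-specific):

```python
# Back-of-the-envelope check that wbps acts as an average rate limit:
# dd wrote 50 MiB with wbps=10485760 and reported ~5.05 s elapsed.
total_bytes = 52428800        # 50 MiB, from the dd output above
wbps = 10485760               # bytes/sec limit set via io.max
expected_seconds = total_bytes / wbps
print(expected_seconds)       # 5.0, close to the measured 5.05 s
```

The small gap between 5.0 s and the measured ~5.05 s is setup overhead, not
extra throttling.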
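P.P.S. Kuai's point about FIFO ordering can be restated with a toy model.
This is purely illustrative, not blk-throtl's actual code: the bio names,
sizes, single-fragment split, and the two-bio workload are all made up, and
the only thing it demonstrates is that with any iops limit a split fragment
re-enters the FIFO tail and waits behind later bios' bps budget, while with
wiops=max it is dispatched as soon as its parent clears bps.

```python
from collections import deque

WBPS = 10 * 1024 * 1024   # 10 MiB/s, matching the wbps=10485760 setting


def simulate(bios, iops_limited):
    """Toy FIFO throttler: returns (bio_name, dispatch_time) pairs.

    Each 'formal' bio (name, size) pays a bps wait of size / WBPS, then
    produces one split fragment. With an iops limit the fragment is
    appended to the FIFO tail; without one it is dispatched directly.
    """
    queue = deque(bios)
    now = 0.0
    out = []
    while queue:
        name, size = queue.popleft()
        if name.startswith("split"):
            # The iops budget itself is huge, so no extra charge here --
            # but the fragment only reaches the head of the FIFO after
            # everything queued ahead of it has been bps-throttled.
            out.append((name, now))
            continue
        now += size / WBPS                          # wait for bps budget
        if iops_limited:
            queue.append((f"split-of-{name}", 0))   # FIFO re-entry
        else:
            out.append((f"split-of-{name}", now))   # direct dispatch
        out.append((name, now))
    return out


# Two 16 MiB bios, A then B; A gets split after its bps wait.
bios = [("A", 16 << 20), ("B", 16 << 20)]
print(dict(simulate(bios, iops_limited=False))["split-of-A"])  # 1.6
print(dict(simulate(bios, iops_limited=True))["split-of-A"])   # 3.2
```

With wiops=max the fragment goes out at 1.6 s, right behind its parent; with
an iops limit set (however high) it is stuck behind B's bps wait until 3.2 s,
which is the behavior difference Kuai described.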