On 5/20/19 3:19 AM, Paolo Valente wrote:
>
>
>> Il giorno 18 mag 2019, alle ore 22:50, Srivatsa S. Bhat <srivatsa@xxxxxxxxxxxxx> ha scritto:
>>
>> On 5/18/19 11:39 AM, Paolo Valente wrote:
>>> I've addressed these issues in my last batch of improvements for BFQ,
>>> which landed in the upcoming 5.2. If you give it a try, and still see
>>> the problem, then I'll be glad to reproduce it, and hopefully fix it
>>> for you.
>>>
>>
>> Hi Paolo,
>>
>> Thank you for looking into this!
>>
>> I just tried current mainline at commit 72cf0b07, but unfortunately
>> didn't see any improvement:
>>
>> dd if=/dev/zero of=/root/test.img bs=512 count=10000 oflag=dsync
>>
>> With mq-deadline, I get:
>>
>> 5120000 bytes (5.1 MB, 4.9 MiB) copied, 3.90981 s, 1.3 MB/s
>>
>> With bfq, I get:
>>
>> 5120000 bytes (5.1 MB, 4.9 MiB) copied, 84.8216 s, 60.4 kB/s
>>
>
> Hi Srivatsa,
> thanks for reproducing this on mainline. I seem to have reproduced a
> bonsai-tree version of this issue. Before digging into the block
> trace, I'd like to ask you for some feedback.
>
> First, in my test, the total throughput of the disk happens to be
> about 20 times as high as that enjoyed by dd, regardless of the I/O
> scheduler. I guess this massive overhead is normal with dsync, but
> I'd like to know whether it is about the same on your side. This will
> help me understand whether I'll actually be analyzing about the same
> problem as yours.
>

Do you mean the throughput obtained by dd'ing directly to the block
device (bypassing the filesystem)? That does give me a 20x speedup
with bs=512, but much more with a bigger block size (reaching a max
throughput of about 110 MB/s).

dd if=/dev/zero of=/dev/sdc bs=512 count=10000 conv=fsync
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB, 4.9 MiB) copied, 0.15257 s, 33.6 MB/s

dd if=/dev/zero of=/dev/sdc bs=4k count=10000 conv=fsync
10000+0 records in
10000+0 records out
40960000 bytes (41 MB, 39 MiB) copied, 0.395081 s, 104 MB/s

I'm testing this on a Toshiba MG03ACA1 (1TB) hard disk.

> Second, the commands I used follow. Do they implement your test case
> correctly?
>
> [root@localhost tmp]# mkdir /sys/fs/cgroup/blkio/testgrp
> [root@localhost tmp]# echo $BASHPID > /sys/fs/cgroup/blkio/testgrp/cgroup.procs
> [root@localhost tmp]# cat /sys/block/sda/queue/scheduler
> [mq-deadline] bfq none
> [root@localhost tmp]# dd if=/dev/zero of=/root/test.img bs=512 count=10000 oflag=dsync
> 10000+0 records in
> 10000+0 records out
> 5120000 bytes (5.1 MB, 4.9 MiB) copied, 14.6892 s, 349 kB/s
> [root@localhost tmp]# echo bfq > /sys/block/sda/queue/scheduler
> [root@localhost tmp]# dd if=/dev/zero of=/root/test.img bs=512 count=10000 oflag=dsync
> 10000+0 records in
> 10000+0 records out
> 5120000 bytes (5.1 MB, 4.9 MiB) copied, 20.1953 s, 254 kB/s
>

Yes, this is indeed the testcase, although I see a much bigger drop in
performance with bfq, compared to the results from your setup.

Regards,
Srivatsa
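
P.S. In case it helps with further runs, here is a rough script that
automates the comparison above. It is only a sketch: DEV, OUTFILE and
the cgroup path are assumptions about the test machine (my disk shows
up as /dev/sdc, and I'm using the legacy blkio cgroup hierarchy mounted
at /sys/fs/cgroup/blkio); adjust them for your setup, and it assumes
both mq-deadline and bfq are available for the device.

#!/bin/bash
# Sketch: compare dd throughput under mq-deadline and bfq from inside
# a blkio cgroup. DEV, OUTFILE and CG are assumptions about the test
# machine, not part of the measurements reported above.
set -e

DEV=sdc                        # block device under test (adjust)
OUTFILE=/root/test.img         # file on a filesystem backed by $DEV
CG=/sys/fs/cgroup/blkio/testgrp

mkdir -p "$CG"
echo $$ > "$CG/cgroup.procs"   # move this shell into the test cgroup

for sched in mq-deadline bfq; do
    echo "$sched" > /sys/block/$DEV/queue/scheduler
    echo "=== scheduler: $(cat /sys/block/$DEV/queue/scheduler) ==="
    # keep only dd's final throughput line (dd prints stats on stderr)
    dd if=/dev/zero of="$OUTFILE" bs=512 count=10000 oflag=dsync 2>&1 | tail -n 1
done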