Hi,
On 2024/08/12 23:43, Michal Koutný wrote:
+Cc Kuai
On Mon, Aug 12, 2024 at 11:00:30PM GMT, Lance Yang <ioworker0@xxxxxxxxx> wrote:
Hi all,
I've run into a problem with Cgroup v2 where it doesn't seem to correctly limit
I/O operations when I set both wbps and wiops for a device. However, if I only
set wbps, then everything works as expected.
To reproduce the problem, follow these steps:
1. **System Information:**
- Kernel Version and OS Release:
```
$ uname -r
6.10.0-rc5+
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 24.04 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
```
2. **Device Information and Settings:**
- List Block Devices and Scheduler:
```
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 4.4T 0 disk
└─sda1 8:1 0 4.4T 0 part /data
...
$ cat /sys/block/sda/queue/scheduler
none [mq-deadline] kyber bfq
$ cat /sys/block/sda/queue/rotational
1
```
3. **Reproducing the problem:**
- Navigate to the cgroup v2 filesystem and configure I/O settings:
```
$ cd /sys/fs/cgroup/
$ stat -fc %T /sys/fs/cgroup
cgroup2fs
$ mkdir test
$ echo "8:0 wbps=10485760 wiops=100000" > io.max
```
In this setup:
- wbps=10485760 sets the write bytes per second limit to 10 MB/s.
- wiops=100000 sets the write I/O operations per second limit to 100,000.
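As a quick sanity check, the applied limits can be read back from the same cgroup directory the echo above targeted, e.g.:
```
# Read back the limits actually applied for device 8:0 in this cgroup.
$ cat io.max
```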
- Add process to the cgroup and verify:
```
$ echo $$ > cgroup.procs
$ cat cgroup.procs
3826771
3828513
$ ps -ef|grep 3826771
root 3826771 3826768 0 22:04 pts/1 00:00:00 -bash
root 3828761 3826771 0 22:06 pts/1 00:00:00 ps -ef
root 3828762 3826771 0 22:06 pts/1 00:00:00 grep --color=auto 3826771
```
- Observe I/O performance using `dd` commands and `iostat`:
```
$ dd if=/dev/zero of=/data/file1 bs=512M count=1 &
$ dd if=/dev/zero of=/data/file1 bs=512M count=1 &
```
You're testing buffered I/O here, and I don't see that the writeback cgroup is enabled. Is this test intentional? Why not test direct I/O?
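For reference, a direct I/O variant for a follow-up run (a sketch only; oflag=direct makes dd bypass the page cache, and the smaller block size is just an illustrative choice):
```
# Direct I/O bypasses the page cache, so io.max throttling should show up in
# iostat immediately instead of only when dirty pages are flushed at the end.
$ dd if=/dev/zero of=/data/file1 bs=1M count=512 oflag=direct &
```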
```
$ iostat -d 1 -h -y -p sda
tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd Device
7.00 0.0k 1.3M 0.0k 0.0k 1.3M 0.0k sda
7.00 0.0k 1.3M 0.0k 0.0k 1.3M 0.0k sda1
tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd Device
5.00 0.0k 1.2M 0.0k 0.0k 1.2M 0.0k sda
5.00 0.0k 1.2M 0.0k 0.0k 1.2M 0.0k sda1
tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd Device
21.00 0.0k 1.4M 0.0k 0.0k 1.4M 0.0k sda
21.00 0.0k 1.4M 0.0k 0.0k 1.4M 0.0k sda1
tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd Device
5.00 0.0k 1.2M 0.0k 0.0k 1.2M 0.0k sda
5.00 0.0k 1.2M 0.0k 0.0k 1.2M 0.0k sda1
tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd Device
5.00 0.0k 1.2M 0.0k 0.0k 1.2M 0.0k sda
5.00 0.0k 1.2M 0.0k 0.0k 1.2M 0.0k sda1
tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd Device
1848.00 0.0k 448.1M 0.0k 0.0k 448.1M 0.0k sda
1848.00 0.0k 448.1M 0.0k 0.0k 448.1M 0.0k sda1
```
It looks like all the dirty pages got flushed to disk in the last second as the file is closed; this is expected.
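As a side note, one illustrative variation is to make dd flush and wait for the data before exiting, so the writeback happens inside the measured run rather than at close:
```
# conv=fsync forces dd to write out the file data before finishing.
$ dd if=/dev/zero of=/data/file1 bs=512M count=1 conv=fsync &
```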
Initially, the write speed is slow (<2 MB/s), then it suddenly bursts to several hundred MB/s.
What would it be on average?
IOW, how long would the whole operation in the throttled cgroup take?
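(For a rough baseline: a single 512 MiB dd at the configured 10 MiB/s wbps limit works out to about 51 seconds, if the writes are actually throttled at that rate.)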
- Testing with wiops set to max:
```
echo "8:0 wbps=10485760 wiops=max" > io.max
$ dd if=/dev/zero of=/data/file1 bs=512M count=1 &
$ dd if=/dev/zero of=/data/file1 bs=512M count=1 &
```
```
$ iostat -d 1 -h -y -p sda
tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd Device
48.00 0.0k 10.0M 0.0k 0.0k 10.0M 0.0k sda
48.00 0.0k 10.0M 0.0k 0.0k 10.0M 0.0k sda1
tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd Device
40.00 0.0k 10.0M 0.0k 0.0k 10.0M 0.0k sda
40.00 0.0k 10.0M 0.0k 0.0k 10.0M 0.0k sda1
tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd Device
41.00 0.0k 10.0M 0.0k 0.0k 10.0M 0.0k sda
41.00 0.0k 10.0M 0.0k 0.0k 10.0M 0.0k sda1
tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd Device
46.00 0.0k 10.0M 0.0k 0.0k 10.0M 0.0k sda
46.00 0.0k 10.0M 0.0k 0.0k 10.0M 0.0k sda1
tps kB_read/s kB_wrtn/s kB_dscd/s kB_read kB_wrtn kB_dscd Device
55.00 0.0k 10.2M 0.0k 0.0k 10.2M 0.0k sda
55.00 0.0k 10.2M 0.0k 0.0k 10.2M 0.0k sda1
```
And I don't think wiops=max is the reason; what needs to be explained is why the dirty pages got flushed to disk synchronously before dd finished and closed the file.
The iostat output shows the write throughput stabilizing at around 10 MB/s, which matches the configured limit of 10 MB/s. After setting wiops to max, the I/O limits appear to work as expected.
Can you give direct I/O a test? And also enable the writeback cgroup for buffered I/O.
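For reference, a minimal sketch of enabling cgroup writeback on v2; dirty pages are attributed to a cgroup only when both the memory and io controllers are enabled for it (paths assume the layout from the steps above):
```
# Check which controllers are delegated to child cgroups, then enable both
# "io" and "memory" so buffered writeback is attributed to the test cgroup.
$ cat /sys/fs/cgroup/cgroup.subtree_control
$ echo "+io +memory" > /sys/fs/cgroup/cgroup.subtree_control
$ cat /sys/fs/cgroup/test/cgroup.controllers
```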
Thanks,
Kuai
Thanks,
Lance
Thanks for the report, Lance. Is this something you started seeing after a kernel update or a switch to cgroup v2? (Or did you simply notice it with this setup only?)
Michal