Re: [BUG] cgroupv2/blk: inconsistent I/O behavior in Cgroup v2 with set device wbps and wiops

Hi,

On 2024/08/12 23:43, Michal Koutný wrote:
+Cc Kuai

On Mon, Aug 12, 2024 at 11:00:30PM GMT, Lance Yang <ioworker0@xxxxxxxxx> wrote:
Hi all,

I've run into a problem with Cgroup v2 where it doesn't seem to correctly limit
I/O operations when I set both wbps and wiops for a device. However, if I only
set wbps, then everything works as expected.

To reproduce the problem, follow these steps:

1. **System Information:**
    - Kernel Version and OS Release:
      ```
      $ uname -r
      6.10.0-rc5+

      $ cat /etc/os-release
      PRETTY_NAME="Ubuntu 24.04 LTS"
      NAME="Ubuntu"
      VERSION_ID="24.04"
      VERSION="24.04 LTS (Noble Numbat)"
      VERSION_CODENAME=noble
      ID=ubuntu
      ID_LIKE=debian
      HOME_URL="https://www.ubuntu.com/"
      SUPPORT_URL="https://help.ubuntu.com/"
      BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
      PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
      UBUNTU_CODENAME=noble
      LOGO=ubuntu-logo
      ```

2. **Device Information and Settings:**
    - List Block Devices and Scheduler:
      ```
      $ lsblk
      NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
      sda     8:0    0   4.4T  0 disk
      └─sda1  8:1    0   4.4T  0 part /data
      ...

      $ cat /sys/block/sda/queue/scheduler
      none [mq-deadline] kyber bfq

      $ cat /sys/block/sda/queue/rotational
      1
      ```

3. **Reproducing the problem:**
    - Navigate to the cgroup v2 filesystem and configure I/O settings:
      ```
      $ cd /sys/fs/cgroup/
      $ stat -fc %T /sys/fs/cgroup
      cgroup2fs
      $ mkdir test
      $ cd test
      $ echo "8:0 wbps=10485760 wiops=100000" > io.max
      ```
      In this setup:
      - wbps=10485760 sets the write bytes-per-second limit to 10 MB/s.
      - wiops=100000 sets the write I/O-operations-per-second limit to 100,000.
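      The configured limits can be read back from the same io.max file;
      unset limits report as "max". A sketch of the expected readback
      (illustrative, not captured from the original run):
      ```
      $ cat io.max
      8:0 rbps=max wbps=10485760 riops=max wiops=100000
      ```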

    - Add process to the cgroup and verify:
      ```
      $ echo $$ > cgroup.procs
      $ cat cgroup.procs
      3826771
      3828513
      $ ps -ef|grep 3826771
      root     3826771 3826768  0 22:04 pts/1    00:00:00 -bash
      root     3828761 3826771  0 22:06 pts/1    00:00:00 ps -ef
      root     3828762 3826771  0 22:06 pts/1    00:00:00 grep --color=auto 3826771
      ```

    - Observe I/O performance using `dd` commands and `iostat`:
      ```
      $ dd if=/dev/zero of=/data/file1 bs=512M count=1 &
      $ dd if=/dev/zero of=/data/file1 bs=512M count=1 &
      ```

You're testing buffered IO here, and I don't see that the writeback
cgroup is enabled. Is this test intentional? Why not test direct IO?
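For reference, a direct-IO variant of the same write test is sketched
below (hypothetical commands, not from the original report). oflag=direct
bypasses the page cache, so every write is submitted to the throttled
block layer immediately; bs is reduced to 1M because very large direct-IO
blocks may be rejected or split by the device:

      ```
      $ dd if=/dev/zero of=/data/file1 bs=1M count=512 oflag=direct &
      $ dd if=/dev/zero of=/data/file1 bs=1M count=512 oflag=direct &
      ```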
      ```
      $ iostat -d 1 -h -y -p sda
       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
      7.00         0.0k         1.3M         0.0k       0.0k       1.3M       0.0k sda
      7.00         0.0k         1.3M         0.0k       0.0k       1.3M       0.0k sda1


       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
      5.00         0.0k         1.2M         0.0k       0.0k       1.2M       0.0k sda
      5.00         0.0k         1.2M         0.0k       0.0k       1.2M       0.0k sda1


       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     21.00         0.0k         1.4M         0.0k       0.0k       1.4M       0.0k sda
     21.00         0.0k         1.4M         0.0k       0.0k       1.4M       0.0k sda1


       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
      5.00         0.0k         1.2M         0.0k       0.0k       1.2M       0.0k sda
      5.00         0.0k         1.2M         0.0k       0.0k       1.2M       0.0k sda1


       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
      5.00         0.0k         1.2M         0.0k       0.0k       1.2M       0.0k sda
      5.00         0.0k         1.2M         0.0k       0.0k       1.2M       0.0k sda1


       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
   1848.00         0.0k       448.1M         0.0k       0.0k     448.1M       0.0k sda
   1848.00         0.0k       448.1M         0.0k       0.0k     448.1M       0.0k sda1

      ```

Looks like all the dirty buffers got flushed to disk in the last second,
as the file is closed; this is expected.
Initially, the write speed is slow (<2 MB/s), then it suddenly bursts to
several hundred MB/s.

What would it be on average? IOW, how long would the whole operation in
the throttled cgroup take?
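A quick way to answer that would be timing the whole run; a minimal
sketch (not from the original report), where conv=fsync makes dd flush
dirty pages before exiting so the measurement includes writeback. At a
10 MB/s wbps cap, 512 MiB should take roughly 50-55 seconds end to end
if throttling is effective:

```
$ time dd if=/dev/zero of=/data/file1 bs=512M count=1 conv=fsync
```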


    - Testing with wiops set to max:
      ```
      echo "8:0 wbps=10485760 wiops=max" > io.max
      $ dd if=/dev/zero of=/data/file1 bs=512M count=1 &
      $ dd if=/dev/zero of=/data/file1 bs=512M count=1 &
      ```
      ```
      $ iostat -d 1 -h -y -p sda

       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     48.00         0.0k        10.0M         0.0k       0.0k      10.0M       0.0k sda
     48.00         0.0k        10.0M         0.0k       0.0k      10.0M       0.0k sda1


       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     40.00         0.0k        10.0M         0.0k       0.0k      10.0M       0.0k sda
     40.00         0.0k        10.0M         0.0k       0.0k      10.0M       0.0k sda1


       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     41.00         0.0k        10.0M         0.0k       0.0k      10.0M       0.0k sda
     41.00         0.0k        10.0M         0.0k       0.0k      10.0M       0.0k sda1


       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     46.00         0.0k        10.0M         0.0k       0.0k      10.0M       0.0k sda
     46.00         0.0k        10.0M         0.0k       0.0k      10.0M       0.0k sda1


       tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd Device
     55.00         0.0k        10.2M         0.0k       0.0k      10.2M       0.0k sda
     55.00         0.0k        10.2M         0.0k       0.0k      10.2M       0.0k sda1

      ```

And I don't think wiops=max is the reason; what needs to be explained is
why the dirty buffers got flushed to disk synchronously before dd
finishes and closes the file.
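One way to narrow this down (a hypothetical check, not part of the
original report) is to watch the dirty and writeback counters while the
test runs, to see when the flush actually starts:

```
$ watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'
```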
The iostat output shows write throughput stabilizing at around 10 MB/s,
which matches the configured limit. With wiops set to max, the I/O limit
works as expected.

Can you give direct IO a test? And also enable the writeback cgroup for
buffered IO.
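A minimal sketch of that combination (assumptions: the memory controller
is available, and the +memory +io delegation happens on the parent before
the child cgroup is populated; cgroup v2 writeback accounting requires
both the memory and io controllers to be enabled):

```
$ echo "+memory +io" > /sys/fs/cgroup/cgroup.subtree_control
$ dd if=/dev/zero of=/data/file1 bs=512M count=1 conv=fsync
```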

Thanks,
Kuai



Thanks,
Lance

Thanks for the report, Lance. Is this something you started seeing after
a kernel update or a switch to cgroup v2? (Or did you simply notice it
with this setup only?)


Michal





