Hello all,

Sorry to bother you, but we ran into a problem with the memcg dirty throttle after migrating from cgroup v1 to v2, and would like to ask for comments or suggestions.

1. Problem

We have the "containerd" service running under system.slice, with its memory.max set to 5GB. It is constantly throttled in balance_dirty_pages(), because the memcg's dirty memory exceeds the memcg dirty thresh.

We didn't have this problem on cgroup v1, because cgroup v1 has neither per-memcg writeback nor a per-memcg dirty thresh; only the global dirty thresh is checked in balance_dirty_pages().

2. Thinking

So we wonder whether a per-memcg dirty thresh interface could be supported. Right now the memcg dirty thresh is simply calculated as memcg max * ratio, where the ratio comes from /proc/sys/vm/dirty_ratio. As a workaround we currently have to raise it from the default 20 to 60, but we worry about the potential side effects of doing that globally.

If a per-memcg dirty thresh interface were available, we could give selected containers a much higher dirty_ratio, especially hungry-dirtier workloads like "containerd".

3. Solution?

But we couldn't think of a good way to support this. The current memcg dirty thresh is calculated by a fairly complex rule:

  memcg dirty thresh = memcg avail * dirty_ratio

where memcg avail is derived from a combination of memcg max/high and the memcg's file pages, and is capped by the system-wide clean memory excluding the amount already used by the memcg.

Even if we find a way to calculate a per-memcg dirty thresh, we can't use it directly, since we still need to distribute it into per-wb dirty thresh shares:

  R - A - B
       \-- C

For example, even if we know the dirty thresh of A, when the wb is in C we have no way to distribute A's dirty thresh into a share for that wb. But we do need the dirty thresh of the wb in C, since balance_dirty_pages() uses it to control the throttling of that wb.

I may have missed something above, but the problem itself seems clear IMHO. Looking forward to any comments or suggestions.

Thanks!
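
P.S. To make the two calculation steps above concrete, here is a rough standalone sketch (plain C, not the actual kernel code; the function names, the completion-fraction distribution, and the example numbers are only assumptions for illustration, using the 5GB memory.max from our setup):

/*
 * Hypothetical sketch of the two steps described above:
 *   1) memcg dirty thresh = memcg avail * dirty_ratio
 *   2) that thresh still has to be split into per-wb shares, here
 *      assumed to be proportional to each wb's fraction of recent
 *      writeback completions within the memcg domain.
 */
#include <stdio.h>

static unsigned long long memcg_dirty_thresh(unsigned long long memcg_avail,
					      unsigned int dirty_ratio)
{
	/* step 1: today dirty_ratio is the global vm.dirty_ratio */
	return memcg_avail * dirty_ratio / 100;
}

static unsigned long long wb_thresh_share(unsigned long long memcg_thresh,
					   unsigned long long wb_completions,
					   unsigned long long memcg_completions)
{
	/*
	 * step 2: distribute the memcg thresh to one wb.  This is the
	 * step with no obvious answer when the thresh is defined on A
	 * but the wb lives in the descendant C.
	 */
	if (!memcg_completions)
		return memcg_thresh;
	return memcg_thresh * wb_completions / memcg_completions;
}

int main(void)
{
	unsigned long long avail = 5ULL << 30;	/* ~5GB memory.max */

	printf("thresh at ratio 20: %llu MB\n",
	       memcg_dirty_thresh(avail, 20) >> 20);
	printf("thresh at ratio 60: %llu MB\n",
	       memcg_dirty_thresh(avail, 60) >> 20);
	printf("wb share (1/4 of completions at ratio 20): %llu MB\n",
	       wb_thresh_share(memcg_dirty_thresh(avail, 20), 1, 4) >> 20);
	return 0;
}

With these assumptions, raising the ratio from 20 to 60 moves the memcg thresh from ~1024MB to ~3072MB, but the per-wb share still depends on how the thresh is distributed down the hierarchy, which is exactly the part we don't know how to do for a wb in C.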