On Fri, Jan 03, 2025 at 12:08:39PM -1000, Tejun Heo wrote:
> Hello,
>
> On Thu, Jan 02, 2025 at 05:50:11PM -0800, JP Kobryn wrote:
> ...
> > I reached a point where this started to feel stable in my local testing, so I
> > wanted to share and get feedback on this approach.
>
> The rationale for using one tree to track all subsystems was that if one
> subsys has been active (e.g. memory), it's likely that other subsyses have
> been active too (e.g. cpu) and thus we might as well flush the whole thing
> together. The approach can be useful for reducing the amount of work done
> when e.g. there are a lot of cgroups which are only active periodically but
> has drawbacks when one subsystem's stats are read a lot more actively than
> others as you pointed out.

I wanted to add two more points to the above: (1) one subsystem (memory) has
an in-kernel stats consumer with strict latency/performance requirements, and
(2) the flush cost of memory stats has increased drastically because it now
has more than 100 stats to maintain.

> Intuitions go only so far and it's difficult to judge whether splitting the
> trees would be a good idea without data. Can you please provide some
> numbers along with rationales for the test setups?

Here I think the supportive data we can show is: (1) non-memory stats readers
no longer spending time on memory stats flushing, and (2) whether the
per-subsystem update trees increase the cost of update-tree insertion in
general. Anything else you think will be needed? (Toy sketches of the
split-tree idea and of the reader-side measurement are appended below my
sign-off.)

Thanks Tejun for taking a look.
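To make point (1) concrete, here is a toy userspace model of per-subsystem
update tracking, with the trees flattened to per-subsystem flags for brevity.
The names (css_updated(), css_flush()) and the counts are made up for
illustration; this is not the kernel API or the actual rstat data structure.
The point it shows: a reader flushing one subsystem only walks cgroups with
pending updates for that subsystem, instead of every cgroup with pending
updates for any subsystem.

#include <stdio.h>
#include <stdbool.h>

#define NR_SUBSYS 2           /* 0 = memory, 1 = cpu */
#define NR_CGROUPS 4

struct cgroup {
	int id;
	bool updated[NR_SUBSYS]; /* pending stats, tracked per subsystem */
};

static struct cgroup cgroups[NR_CGROUPS];

/* mark a cgroup as having pending stats for one subsystem only */
static void css_updated(struct cgroup *cgrp, int ssid)
{
	cgrp->updated[ssid] = true;
}

/* flush walks only cgroups with pending updates for this subsystem */
static int css_flush(int ssid)
{
	int flushed = 0;

	for (int i = 0; i < NR_CGROUPS; i++) {
		if (cgroups[i].updated[ssid]) {
			cgroups[i].updated[ssid] = false;
			flushed++;
		}
	}
	return flushed;
}

int main(void)
{
	for (int i = 0; i < NR_CGROUPS; i++)
		cgroups[i].id = i;

	/* heavy memory-stat activity, light cpu activity */
	for (int i = 0; i < NR_CGROUPS; i++)
		css_updated(&cgroups[i], 0);	/* memory */
	css_updated(&cgroups[0], 1);		/* cpu */

	/* a cpu-stat reader only pays for cpu-side pending updates */
	printf("cpu flush walked %d cgroups\n", css_flush(1));
	printf("memory flush walked %d cgroups\n", css_flush(0));
	return 0;
}

With a single shared tree, the cpu-stat reader above would walk (and flush)
all four cgroups because of the pending memory updates.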
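And a minimal sketch of how the reader-side data could be gathered: time
repeated reads of a non-memory stat file (cpu.stat here) while a separate
workload churns memory stats, then compare ns/read before and after the
split. The default cgroup path is just an example placeholder; pass the real
one as argv[1].

#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define ITERS 10000

int main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1] : "/sys/fs/cgroup/test/cpu.stat";
	char buf[4096];
	struct timespec t0, t1;
	long long ns;

	int fd = open(path, O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < ITERS; i++) {
		/* each read from offset 0 regenerates the stat file,
		 * which triggers an rstat flush on the kernel side */
		if (pread(fd, buf, sizeof(buf), 0) < 0) {
			perror("pread");
			return 1;
		}
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	ns = (t1.tv_sec - t0.tv_sec) * 1000000000LL +
	     (t1.tv_nsec - t0.tv_nsec);
	printf("%s: %lld ns/read over %d reads\n", path, ns / ITERS, ITERS);
	close(fd);
	return 0;
}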