On Tue, May 10, 2022 at 12:59 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
>
> Hello,
>
> On Tue, May 10, 2022 at 12:34:42PM -0700, Yosry Ahmed wrote:
> > The rationale behind associating this work with cgroup_subsys is that
> > usually the stats are associated with a resource (e.g. memory, cpu,
> > etc). For example, if the memory controller is only enabled for a
> > subtree in a big hierarchy, it would be more efficient to only run BPF
> > rstat programs for those cgroups, not the entire hierarchy. It
> > provides a way to control what part of the hierarchy you want to
> > collect stats for. This is also semantically similar to the
> > css_rstat_flush() callback.
>
> Hmm... one major point of rstat is not having to worry about these things
> because we iterate what's been active rather than what exists. Now, this
> isn't entirely true because we share the same updated list for all sources.
> This is a trade-off which makes sense because 1. the number of cgroups to
> iterate each cycle is generally really low anyway 2. different controllers
> often get enabled together. If the balance tilts towards "we're walking too
> many due to the sharing of updated list across different sources", the
> solution would be splitting the updated list so that we make the walk finer
> grained.
>
> Note that the above doesn't really affect the conceptual model. It's purely
> an optimization decision. Tying these things to a cgroup_subsys does affect
> the conceptual model and, in this case, the userland API for a performance
> consideration which can be solved otherwise.
>
> So, let's please keep this simple and in the (unlikely) case that the
> overhead becomes an issue, solve it from rstat operation side.
>
> Thanks.

I assume if we do this optimization, and have separate updated lists for
controllers, we will still have a "core" updated list that is not tied
to any controller. Is this correct?

If yes, then we can make the interface controller-agnostic (a global
list of BPF flushers).
If we do the optimization later, we tie BPF stats to the "core" updated
list. We can even extend the userland interface then to allow for
controller-specific BPF stats if found useful.

If not, and there will only be controller-specific updated lists, then
we might need to maintain a "core" updated list just for the sake of BPF
programs, which I don't think would be favorable.

What do you think?

Either way, I will try to document our discussion outcome in the commit
message (and maybe the code), so that if and when this optimization is
made, we can come back to it.

>
> --
> tejun
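
For illustration only, the controller-agnostic option discussed above (one
shared updated list plus a global list of flushers) can be modelled in a few
lines of Python. This is a toy userspace sketch, not kernel code: the names
Cgroup, Rstat, register_flusher, mark_updated and count_flusher are all
hypothetical and do not correspond to the real rstat or BPF interfaces. It
only shows why a flush walks what has been active rather than what exists.

```python
# Toy userspace model of a shared "updated" list with global flushers.
# Not kernel code; every name below is hypothetical.

class Cgroup:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.pending = 0  # delta recorded since the last flush
        self.total = 0    # flushed hierarchical total


def depth(cgrp):
    """Number of ancestors between cgrp and the root."""
    d = 0
    while cgrp.parent is not None:
        cgrp = cgrp.parent
        d += 1
    return d


class Rstat:
    def __init__(self):
        self.updated = set()  # cgroups with unflushed updates (shared list)
        self.flushers = []    # global, controller-agnostic callbacks

    def register_flusher(self, fn):
        self.flushers.append(fn)

    def mark_updated(self, cgrp, delta):
        cgrp.pending += delta
        # Mark the cgroup and its ancestors; stop at the first ancestor
        # that is already marked, since its own ancestors are marked too.
        while cgrp is not None and cgrp not in self.updated:
            self.updated.add(cgrp)
            cgrp = cgrp.parent

    def flush(self):
        # Visit children before parents so deltas propagate upward.
        order = sorted(self.updated, key=depth, reverse=True)
        for cgrp in order:
            for fn in self.flushers:
                fn(cgrp)
        self.updated.clear()
        return [c.name for c in order]


def count_flusher(cgrp):
    """Example flusher: fold the pending delta into this cgroup and its parent."""
    cgrp.total += cgrp.pending
    if cgrp.parent is not None:
        cgrp.parent.pending += cgrp.pending
    cgrp.pending = 0


root = Cgroup("root")
a = Cgroup("a", root)
b = Cgroup("b", root)  # exists, but never updated
c = Cgroup("c", a)

rstat = Rstat()
rstat.register_flusher(count_flusher)
rstat.mark_updated(c, 5)
visited = rstat.flush()  # walks c, a and root; b is skipped
```

In this model, splitting the updated list per controller would only change
which set mark_updated() and flush() operate on; the flusher registration
interface would stay the same.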