On Tue, May 18, 2021 at 11:08 AM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
>
> On Mon, May 17, 2021 at 7:02 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> >
> > PSI accounts stalls for each cgroup separately and aggregates them at each
> > level of the hierarchy. This causes additional overhead, with psi_avgs_work
> > being called for each cgroup in the hierarchy. psi_avgs_work has been
> > highly optimized; however, on systems with a large number of cgroups the
> > overhead becomes noticeable.
> > Systems which use PSI only at the system level could avoid this overhead
> > if PSI can be configured to skip per-cgroup stall accounting.
> > Add a "cgroup_disable=pressure" kernel command-line option to allow
> > requesting system-wide-only pressure stall accounting. When set, it
> > keeps system-wide accounting under /proc/pressure/ but skips accounting
> > for individual cgroups and does not expose PSI nodes in the cgroup hierarchy.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
>
> I am assuming that this is for Android, and at the moment Android is
> only interested in system-level pressure. I am wondering if there is
> any plan for Android to have cgroup hierarchies with explicit limits
> in the future?

Correct, and yes, we would like to use memcgs to limit memory in the
future; however, we do not plan on using per-cgroup psi for now.

> If yes, then I think we should follow up (this patch is fine
> independently) with making this feature more general by explicitly
> enabling psi for each cgroup level, similar to how we enable
> controllers through cgroup.subtree_control.
>
> Something like:
>
> $ echo "+psi" > cgroup.subtree_control
>
> This definitely would be helpful for server use cases where jobs do
> sub-containers but might not be interested in psi, while the admin is
> interested in the top-level job's psi.

I hadn't thought about it before, but that makes sense to me.
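
For reference, a rough sketch of how the patch looks from userspace when
the kernel is booted with cgroup_disable=pressure (the "foo" cgroup name
and the zeroed PSI values below are just placeholders):

# system-wide accounting under /proc/pressure/ keeps working
$ cat /proc/pressure/memory
some avg10=0.00 avg60=0.00 avg300=0.00 total=0
full avg10=0.00 avg60=0.00 avg300=0.00 total=0

# but the per-cgroup PSI files are not exposed in the cgroup hierarchy
$ cat /sys/fs/cgroup/foo/memory.pressure
cat: /sys/fs/cgroup/foo/memory.pressure: No such file or directory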