On Mon, Apr 08, 2024 at 07:55:38AM -1000, Tejun Heo wrote: > Hello, > > On Fri, Apr 05, 2024 at 07:05:41PM +0200, Michal Koutný wrote: > > Currently, when pids.max limit is breached in the hierarchy, the event > > is counted and reported in the cgroup where the forking task resides. > > > > This decouples the limit and the notification caused by the limit making > > it hard to detect when the actual limit was effected. > > > > Let's introduce new events: > > max > > The number of times the limit of the cgroup was hit. > > > > max.imposed > > The number of times fork failed in the cgroup because of self > > or ancestor limit. > > The whole series make sense to me. I'm not sure about max.imposed field > name. Maybe a name which clearly signfies rejection of forks would be > clearer? Johannes, what do you think? The max event at the level where the limit is set (and up, for hierarchical accounting) makes sense to me. max.imposed is conceptually not entirely unprecedented, but something we've tried to avoid. Usually the idea is that events correspond to specific cgroup limitations at that level. Failures due to constraints higher up could be from anything, including system-level shortages. IOW, events are supposed to be more about "how many times did this limit here trigger", and less about "how many times did something happen to the tasks local to this group". It's a bit arbitrary and not perfectly followed everywhere, but I think there is value in trying to maintain that distinction, so that somebody looking at those files doesn't have to rack their brains or look up every counter in the docs to figure out what it's tracking. It's at least true for the misc controller, and for most of memcg - with the weird exception of the swap.max events which we've tried to fix before... For "things that are happening to the tasks in this group", would it make more sense to have an e.g. pids.stat::forkfail instead? (Or just not have that event at all? I'm not sure if it's actually needed or whether you kept it only to maintain some form of the information that is currently provided by the pr_info()).