Hello, Michal.

Sorry about the delay.

On Tue, May 09, 2023 at 12:34:53PM +0200, Michal Koutný wrote:
> On Fri, May 05, 2023 at 08:17:10AM -1000, Tejun Heo <tj@xxxxxxxxxx> wrote:
> > On Fri, May 05, 2023 at 07:32:40PM +0200, Michal Koutný wrote:
> > > On Fri, May 05, 2023 at 05:45:58AM -1000, Tejun Heo <tj@xxxxxxxxxx> wrote:
> > > > > There are three relevant nodes for each cgroupfs entry:
> > > > >
> > > > >         R ... cgroup hierarchy root
> > > > >         M ... mount root
> > > > >         C ... reader's cgroup NS root
> > > > >
> > > > > mountinfo is supposed to show path from C to M.
> > > >
> > > > At least for cgroup2, the path from C to M isn't gonna change once NS is
> > > > established, right?
> > >
> > > Right. Although, the argument about M (when C above M or when C and M in
> > > different subtrees) implicitly relies on the namespace_sem.
> >
> > I don't follow. Can you please elaborate a bit more?
>
> I wanted to say that even with restriction to cgroup2, the css_set_lock
> removal would also rely on namespace_sem.
>
> For a given mountinfo entry the path C--M won't change (no renames).
> The question is whether cgroup M will stay around (with the relaxed
> locking):
>
> - C >= M (C is below M)
>   -> C (transitively) pins M

Yeah, this was what I was thinking.

> - C < M (C is above M) or C and M are in two disjoint subtrees (path
>   goes through a common ancestor)
>   -> M could be released without relation to C (even on cgroup2, with
>      the css_set_lock removed) but such a destructive operation on M
>      is excluded as long as namespace_sem is held during entry
>      rendering.
>
> Does that clarify the trade-off of removing css_set_lock at this spot?

Right, you can have cgroup outside NS root still mounted and that mount root
can be viewed from multiple cgroup NS's, so the path isn't fixed either.
Having enough lockdep annotations should do but if reasonable the preference
is being a bit more self-contained.

Thanks for the explanation.

-- 
tejun