On Thu, Jul 25, 2024 at 1:43 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
> On Wed, Jul 24, 2024 at 07:02:11PM +0000, Kinsey Ho wrote:
> > To obtain the pointer to the saved memcg position, mem_cgroup_iter()
> > currently holds css->refcnt during memcg traversal only to put
> > css->refcnt at the end of the routine. This isn't necessary as an
> > rcu_read_lock is already held throughout the function.
> >
> > Remove css->refcnt usage during traversal by leveraging RCU.
>
> Eh, I don't know about this.
>
> RCU ensures that the css memory isn't freed.
>
> The tryget ensures that the css is still alive and valid.
>
> In this case, it just so happens that the sibling linkage is also rcu
> protected. But accessing random css members when the refcount is 0 is
> kind of sketchy. On the other hand, the refcount is guaranteed to be
> valid, and rcu + tryget is a common pattern.

To be fair, the documentation of css_next_descendant_pre() mentions
that the requirements are:
- Either cgroup_mutex or RCU lock is held.
- Both @pos and @root are accessible.
- @pos is a descendant of @root.

This reads to me like it is intentional that RCU protection is enough
for @pos and @root, and that the sibling linkage is RCU protected by
design. Perhaps we could clarify this further (whether at
css_next_descendant_pre(), or above the definition of the linkage
members).

>
> What does this buy us? The tryget is cheap.

mem_cgroup_iter() is not an easy function to follow, so I personally
appreciate the simplicity gains tbh.
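
For anyone following along, here is a rough userspace sketch of the
"tryget" half of the rcu + tryget pattern discussed above. It is only a
toy model under stated assumptions: toy_css, toy_css_tryget() and
toy_css_put() are made-up names, the counter is a plain C11 atomic, and
none of this is the kernel's css_tryget() (which goes through
percpu_ref). The point it tries to show is just that a tryget refuses to
hand out a reference once the count has reached 0, even though the
memory itself may still be readable.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a css; not the kernel's struct. */
struct toy_css {
	atomic_int refcnt;	/* 0 models "no longer alive" */
};

/* Take a reference only if the count is still nonzero. */
static bool toy_css_tryget(struct toy_css *css)
{
	int old = atomic_load(&css->refcnt);

	while (old > 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&css->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference previously obtained with toy_css_tryget(). */
static void toy_css_put(struct toy_css *css)
{
	atomic_fetch_sub(&css->refcnt, 1);
}
```

In this model, a reader that skips the tryget and relies only on the
memory staying around (the RCU half) can still look at the object when
refcnt is 0, which is the case the thread is debating.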