On Mon, Oct 02, 2023 at 03:03:48PM -0700, Roman Gushchin wrote:
> On Mon, Oct 02, 2023 at 04:12:54PM -0400, Johannes Weiner wrote:
> > On Wed, Sep 27, 2023 at 08:08:29AM -0700, Roman Gushchin wrote:
> > > @@ -3001,6 +3001,47 @@ static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
> > >  	return objcg;
> > >  }
> > >  
> > > +static DEFINE_SPINLOCK(current_objcg_lock);
> > > +
> > > +static struct obj_cgroup *current_objcg_update(struct obj_cgroup *old)
> > > +{
> > > +	struct mem_cgroup *memcg;
> > > +	struct obj_cgroup *objcg;
> > > +	unsigned long flags;
> > > +
> > > +	old = current_objcg_clear_update_flag(old);
> > > +	if (old)
> > > +		obj_cgroup_put(old);
> > > +
> > > +	spin_lock_irqsave(&current_objcg_lock, flags);
> > > +	rcu_read_lock();
> > > +	memcg = mem_cgroup_from_task(current);
> > > +	for (; memcg != root_mem_cgroup; memcg = parent_mem_cgroup(memcg)) {
> > > +		objcg = rcu_dereference(memcg->objcg);
> > > +		if (objcg && obj_cgroup_tryget(objcg))
> > > +			break;
> > > +		objcg = NULL;
> > > +	}
> > > +	rcu_read_unlock();
> >
> > Can this tryget() actually fail when this is called on the current
> > task during fork() and attach()? A cgroup cannot be offlined while
> > there is a task in it.
>
> Highly theoretically it can if it races against a migration of the current
> task to another memcg and the previous memcg is getting offlined.

Ah right, if this runs between css_set_move_task() and ->attach(). The
cache would be briefly updated to a parent in the old hierarchy, but
then quickly reset from the ->attach().

Can you please add a comment along these lines?

> I actually might make sense to apply the same approach for memcgs as well
> (saving a lazily-updating memcg pointer on task_struct). Then it will be
> possible to ditch this "for" loop. But I need some time to master the code
> and run benchmarks. Idk if it will make enough difference to justify the change.

Yeah, the memcg pointer is slightly less attractive from an optimization
POV because it already is a pretty direct pointer from task through
the cset array. If you still want to look into it from a
simplification POV that sounds reasonable, but IMO it would be fine
with a comment.

> > > @@ -6345,6 +6393,22 @@ static void mem_cgroup_move_task(void)
> > >  		mem_cgroup_clear_mc();
> > >  	}
> > >  }
> > > +
> > > +#ifdef CONFIG_MEMCG_KMEM
> > > +static void mem_cgroup_fork(struct task_struct *task)
> > > +{
> > > +	task->objcg = (struct obj_cgroup *)0x1;
> >
> > dup_task_struct() will copy this pointer from the old task. Would it
> > be possible to bump the refcount here instead? That would save quite a
> > bit of work during fork().
>
> Yeah, it should be possible. It won't save a lot, but I agree it makes
> sense. I'll take a look and will prepare a separate patch for this.

I guess the hairiest part would be synchronizing against a migration
because all these cgroup core callbacks are unlocked. Would it make
sense to add ->fork_locked() and ->attach_locked() callbacks that are
dispatched under the css_set_lock? Then this could be a simple

	if (p && !(p & 0x1))
		obj_cgroup_get()

which would certainly be nice for workloads where fork() is hot, with
little downside otherwise.
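
To make the idea concrete, here is a rough userspace mock of that
check. It is untested against the tree and is only a sketch: struct
obj_cgroup is reduced to a bare counter, objcg_fork_inherit() is a
made-up name for what a ->fork_locked() callback body could look like,
and ->fork_locked() itself does not exist today. Falling back to the
bare 0x1 value for a flagged pointer is my assumption, so that the
child never ends up owning a reference it did not take.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct obj_cgroup: just a refcount. */
struct obj_cgroup {
	int refcnt;
};

/* Low pointer bit meaning "cached objcg is stale, refresh lazily" (0x1 in the patch). */
#define CURRENT_OBJCG_UPDATE_FLAG 0x1UL

/* Mock of obj_cgroup_get(): the caller must already hold a reference. */
static void obj_cgroup_get(struct obj_cgroup *objcg)
{
	assert(objcg->refcnt > 0);
	objcg->refcnt++;
}

/*
 * Hypothetical ->fork_locked() body. 'inherited' is the value that
 * dup_task_struct() copied from the parent; the return value is what
 * the child's task->objcg would be set to.
 */
static struct obj_cgroup *objcg_fork_inherit(struct obj_cgroup *inherited)
{
	uintptr_t p = (uintptr_t)inherited;

	if (p && !(p & CURRENT_OBJCG_UPDATE_FLAG)) {
		/* Valid, up-to-date pointer: share it and bump the refcount. */
		obj_cgroup_get(inherited);
		return inherited;
	}

	if (p & CURRENT_OBJCG_UPDATE_FLAG) {
		/* Assumption: parent's cache is stale, so let the child look it up lazily. */
		return (struct obj_cgroup *)CURRENT_OBJCG_UPDATE_FLAG;
	}

	/* Assumption: NULL is inherited as-is. */
	return NULL;
}
```

The point is that the common case becomes a single refcount increment
instead of the lock plus hierarchy walk in current_objcg_update(), but
only if the callback cannot race with a migration.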