On Fri, Nov 22, 2024 at 10:46:53PM -0800, Yosry Ahmed wrote: > On Fri, Nov 22, 2024 at 10:10 PM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote: > > > > We are starting to deploy mmap_lock tracepoint monitoring across our > > fleet and the early results showed that these tracepoints are consuming > > significant amount of CPUs in kernfs_path_from_node when enabled. > > > > It seems like the kernel is trying to resolved the cgroup path in the > > s/resolved/resolve > > > fast path of the locking code path when the tracepoints are enabled. In > > addition for some application their metrics are regressing when > > monitoring is enabled. > > > > The cgroup path resolution can be slow and should not be done in the > > fast path. Most userspace tools, like bpftrace, provides functionality > > to get the cgroup path from cgroup id, so let's just trace the cgroup > > id and the users can use better tools to get the path in the slow path. > > > > Signed-off-by: Shakeel Butt <shakeel.butt@xxxxxxxxx> > > --- > > include/linux/memcontrol.h | 18 ++++++++++++ > > include/trace/events/mmap_lock.h | 32 ++++++++++---------- > > mm/mmap_lock.c | 50 ++------------------------------ > > 3 files changed, 36 insertions(+), 64 deletions(-) > > > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h > > index 5502aa8e138e..d82f08cd70cd 100644 > > --- a/include/linux/memcontrol.h > > +++ b/include/linux/memcontrol.h > > @@ -1046,6 +1046,19 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm, > > > > void split_page_memcg(struct page *head, int old_order, int new_order); > > > > +static inline u64 memcg_id_from_mm(struct mm_struct *mm) > > The usage of memcg_id here and throughout the patch is a bit confusing > because we have a member called 'id' in struct mem_cgroup, but this > isn't it. This is the cgroup_id of the memcg. I admit it's hard to > distinguish them during naming, but when I first saw the function I > thought it was returning memcg->id. > > Maybe just cgroup_id_from_mm()? In cgroup v2, the cgroup id is the > same regardless of the controller anyway, in cgroup v1, it's kinda > natural that we return the cgroup id of the memcg. > > I don't feel strongly, but I prefer that we use clearer naming, and > either way a comment may help clarify things. > Ack, I will change to cgroup_id_from_mm() but I will keep memcg_id in the tracepoints. > > +{ > > + struct mem_cgroup *memcg; > > + u64 id = 0; > > + > > + rcu_read_lock(); > > + memcg = mem_cgroup_from_task(rcu_dereference(mm->owner)); > > + if (likely(memcg)) > > + id = cgroup_id(memcg->css.cgroup); > > We return 0 if the memcg is NULL here, shouldn't we return the cgroup > id of the root memcg instead? This is more consistent with > get_mem_cgroup_from_mm(), and makes sure we always return the id of a > valid cgroup. Good point and I need to add a mem_cgroup_disabled() check as well. Will do in v2.