On Sat, Nov 23, 2024 at 09:38:09AM +0100, Vlastimil Babka wrote:
> On 11/23/24 7:09 AM, Shakeel Butt wrote:
> > We are starting to deploy mmap_lock tracepoint monitoring across our
> > fleet and the early results showed that these tracepoints are consuming
> > a significant amount of CPU in kernfs_path_from_node when enabled.
> >
> > It seems like the kernel is trying to resolve the cgroup path in the
> > fast path of the locking code path when the tracepoints are enabled. In
> > addition, for some applications their metrics are regressing when
> > monitoring is enabled.
> >
> > The cgroup path resolution can be slow and should not be done in the
> > fast path. Most userspace tools, like bpftrace, provide functionality
> > to get the cgroup path from cgroup id, so let's just trace the cgroup
> > id and the users can use better tools to get the path in the slow path.
> >
> > Signed-off-by: Shakeel Butt <shakeel.butt@xxxxxxxxx>
>
> AFAIU this would also remove the lockdep issue that patch [1] is solving
> with RCU conversion. It probably has other benefits on its own too, so
> just FYI. It's definitely better to avoid complex operations to gather
> tracepoint data, if avoidable.
>
> [1] https://lore.kernel.org/all/20241121175250.EJbI7VMb@xxxxxxxxxxxxx/
>

Thanks for the pointer, I might add a reference to this in the commit
message in the next version.