Re: [patch 014/173] mm, tracing: record slab name for kmem_cache_free()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 24 Feb 2021 12:00:55 -0800
Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> From: Jacob Wen <jian.w.wen@xxxxxxxxxx>
> Subject: mm, tracing: record slab name for kmem_cache_free()
> 
> Currently, a trace record generated by the RCU core is as below.
> 
> ... kmem_cache_free: call_site=rcu_core+0x1fd/0x610 ptr=00000000f3b49a66
> 
> It doesn't tell us what the RCU core has freed.
> 
> This patch adds the slab name to trace_kmem_cache_free().
> The new format is as follows.
> 
> ... kmem_cache_free: call_site=rcu_core+0x1fd/0x610 ptr=0000000037f79c8d name=dentry
> ... kmem_cache_free: call_site=rcu_core+0x1fd/0x610 ptr=00000000f78cb7b5 name=sock_inode_cache
> ... kmem_cache_free: call_site=rcu_core+0x1fd/0x610 ptr=0000000018768985 name=pool_workqueue
> ... kmem_cache_free: call_site=rcu_core+0x1fd/0x610 ptr=000000006a6cb484 name=radix_tree_node
> 
> We can use it to understand what the RCU core is going to free. For
> example, some users maybe interested in when the RCU core starts
> freeing reclaimable slabs like dentry to reduce memory pressure.
> 
> Link: https://lkml.kernel.org/r/20201216072804.8838-1-jian.w.wen@xxxxxxxxxx
> Signed-off-by: Jacob Wen <jian.w.wen@xxxxxxxxxx>
> Cc: Christoph Lameter <cl@xxxxxxxxx>
> Cc: Pekka Enberg <penberg@xxxxxxxxxx>
> Cc: David Rientjes <rientjes@xxxxxxxxxx>
> Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
> 
>  include/trace/events/kmem.h |   24 ++++++++++++++++--------
>  mm/slab.c                   |    2 +-
>  mm/slob.c                   |    2 +-
>  mm/slub.c                   |    2 +-
>  4 files changed, 19 insertions(+), 11 deletions(-)
> 
> --- a/include/trace/events/kmem.h~mm-tracing-record-slab-name-for-kmem_cache_free
> +++ a/include/trace/events/kmem.h
> @@ -115,7 +115,7 @@ DEFINE_EVENT(kmem_alloc_node, kmem_cache
>  	TP_ARGS(call_site, ptr, bytes_req, bytes_alloc, gfp_flags, node)
>  );
>  
> -DECLARE_EVENT_CLASS(kmem_free,
> +TRACE_EVENT(kfree,
>  
>  	TP_PROTO(unsigned long call_site, const void *ptr),
>  
> @@ -135,18 +135,26 @@ DECLARE_EVENT_CLASS(kmem_free,
>  		  (void *)__entry->call_site, __entry->ptr)
>  );
>  
> -DEFINE_EVENT(kmem_free, kfree,
> +TRACE_EVENT(kmem_cache_free,
>  
> -	TP_PROTO(unsigned long call_site, const void *ptr),
> +	TP_PROTO(unsigned long call_site, const void *ptr, const char *name),
>  
> -	TP_ARGS(call_site, ptr)
> -);
> +	TP_ARGS(call_site, ptr, name),
>  
> -DEFINE_EVENT(kmem_free, kmem_cache_free,
> +	TP_STRUCT__entry(
> +		__field(	unsigned long,	call_site	)
> +		__field(	const void *,	ptr		)
> +		__field(	const char *,	name		)

You must use __string() here!

		__string(name, name);

> +	),
>  
> -	TP_PROTO(unsigned long call_site, const void *ptr),
> +	TP_fast_assign(
> +		__entry->call_site	= call_site;
> +		__entry->ptr		= ptr;
> +		__entry->name		= name;

This must be:

		__assign_str(name, name);


> +	),
>  
> -	TP_ARGS(call_site, ptr)
> +	TP_printk("call_site=%pS ptr=%p name=%s",
> +		  (void *)__entry->call_site, __entry->ptr, __entry->name)

	You must use __get_str(name) here!

(see other users of this logic in include/trace/events/*.h)

What is happening is that TP_fast_assign() is called by the tracepoint
logic (at the time of the event), then much later (seconds, minutes,
hours, days!), when the user does a "cat trace" of the file, the
__entry->name is read and the printf logic is called against it. Well,
the s->name that __entry->name points to, can be long gone by then!

Instead, using __string() tells the TRACE_EVENT() macro that this is a
dynamic string. The __assign_str() records the string into the ring
buffer. The __get_str() retrieves the string from the ring buffer as
part of the event, so it stays around as long as the event being read
by the trace file is around.

Please do not apply this patch as is, it is very buggy!

-- Steve


>  );
>  
>  TRACE_EVENT(mm_page_free,
> --- a/mm/slab.c~mm-tracing-record-slab-name-for-kmem_cache_free
> +++ a/mm/slab.c
> @@ -3717,7 +3717,7 @@ void kmem_cache_free(struct kmem_cache *
>  	__cache_free(cachep, objp, _RET_IP_);
>  	local_irq_restore(flags);
>  
> -	trace_kmem_cache_free(_RET_IP_, objp);
> +	trace_kmem_cache_free(_RET_IP_, objp, cachep->name);
>  }
>  EXPORT_SYMBOL(kmem_cache_free);
>  
> --- a/mm/slob.c~mm-tracing-record-slab-name-for-kmem_cache_free
> +++ a/mm/slob.c
> @@ -673,7 +673,7 @@ void kmem_cache_free(struct kmem_cache *
>  		__kmem_cache_free(b, c->size);
>  	}
>  
> -	trace_kmem_cache_free(_RET_IP_, b);
> +	trace_kmem_cache_free(_RET_IP_, b, c->name);
>  }
>  EXPORT_SYMBOL(kmem_cache_free);
>  
> --- a/mm/slub.c~mm-tracing-record-slab-name-for-kmem_cache_free
> +++ a/mm/slub.c
> @@ -3157,7 +3157,7 @@ void kmem_cache_free(struct kmem_cache *
>  	if (!s)
>  		return;
>  	slab_free(s, virt_to_head_page(x), x, NULL, 1, _RET_IP_);
> -	trace_kmem_cache_free(_RET_IP_, x);
> +	trace_kmem_cache_free(_RET_IP_, x, s->name);
>  }
>  EXPORT_SYMBOL(kmem_cache_free);
>  
> _





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux